unix scientific-notation

How can I use plink/Unix to convert data from scientific notation to decimal?


I work with genetic data. I just found a supercomputer to help with genetic analysis, but I need to convert the data to exactly the format the supercomputer wants: two columns, one with chromosome information and one with the p-value. The p-value column must not contain any letters, but some of my data is in scientific notation, like so:

rs191895619 1.052e-05
rs140779862 0.4406
rs11127542 0.9771
rs112183333 0.02569
rs191067167 0.427
rs111321342 1.042e-05

which puts several E's in the column that must not have letters in it.

I tried to use grep to move them into their own file, using grep "*e*" filename.txt > outputfilename.txt as well as grep "*e-05" filename.txt > outputfilename.txt, but it gave me a blank output file both times. Even if all 5000 lines of scientific-notation data had moved into their own file, I don't know how to change the data to decimal notation except by editing each line individually, which would take several days for each file.

Is there a command I can give to plink so that the data it gives me is not in scientific notation in the first place? Or a command I can use in plink or Unix to convert the scientific notation I have into decimal notation?


Solution

  • You can use awk to convert scientific notation to decimal:

    awk '{printf "%s %f\n", $1, $2}' file
    

    Outputs:

    rs191895619 0.000011
    rs140779862 0.440600
    rs11127542 0.977100
    rs112183333 0.025690
    rs191067167 0.427000
    rs111321342 0.000010
    

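    You can adjust the precision by changing the %f part of the printf format. By default %f prints six decimal places, which is why 1.052e-05 is rounded to 0.000011; a format such as %.8f prints 0.00001052 instead.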
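Since a fixed %f pads every value with zeros and can round small p-values, one possible variation is to rewrite only the rows that actually contain an exponent. The sketch below is a minimal example under some assumptions: the function name sci_to_decimal is made up for illustration, the file has exactly two whitespace-separated columns, and 10 decimal places is enough for the smallest p-value (raise it if not).

```shell
# Hypothetical helper: rewrite column 2 as plain decimal only when it
# contains an exponent marker; all other rows pass through unchanged.
sci_to_decimal() {
    awk '{
        p = $2
        if (p ~ /[eE]/) {
            # Assumed precision of 10 decimal places; increase for smaller values.
            p = sprintf("%.10f", $2 + 0)
            sub(/0+$/, "", p)     # drop trailing zeros
            sub(/\.$/, ".0", p)   # keep one digit after the point if all were zeros
        }
        print $1, p
    }' "$@"
}
```

Called as sci_to_decimal filename.txt > outputfilename.txt, this should turn 1.052e-05 into 0.00001052 and leave a value such as 0.4406 exactly as it was.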


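As for the grep attempts in the question: grep patterns are regular expressions, not shell wildcards, so a leading * is treated as a literal asterisk and "*e*" matches nothing in this data. A minimal sketch of a pattern that does select the scientific-notation rows follows; the function name sci_rows is hypothetical, and it assumes the p-value is the last thing on each line with no trailing whitespace.

```shell
# Hypothetical helper: print only the rows whose last field ends in an
# exponent such as e-05 or E+03 (a digit, then e/E, optional sign, digits).
sci_rows() {
    grep -E '[0-9][eE][+-]?[0-9]+$' "$@"
}
```

Running sci_rows filename.txt on the converted file is a quick way to check that no scientific notation remains: it should print nothing.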