I have a data which looks like
5,960464,6,65430376927058E-10,-91,7689743041992,-89,5235061645508,6,71407200919707E-05,0,00869479635485696
11,92093,5,23110578457515E-10,-92,8140640258789,-90,5685958862305,8,97218165598724E-05,0,011619072933792
17,88139,4,52214777469635E-10,-93,4465560913086,-91,2010803222656,0,000104674258706842,0,0135554304720727
23,84186,4,29985026518504E-10,-93,6654663085938,-91,4199981689453,0,00011655840052398,0,0150944397768688
29,80232,5,20411183436712E-10,-92,8365325927734,-90,591064453125,0,000128133766655091,0,0165934622941289
This data has actually 6 columns. Comma used to separate the data as well as used for floating numbers.
I need to convert above data to the following for further processing.
x, y1, y2, y3, y4, y5
5.960464,6.65430376927058E-10,-91.7689743041992,-89.5235061645508,6.71407200919707E-05,0.00869479635485696
11.92093,5.23110578457515E-10,-92.8140640258789,-90.5685958862305,8.97218165598724E-05,0.011619072933792
17.88139,4.52214777469635E-10,-93.4465560913086,-91.2010803222656,0.000104674258706842,0.0135554304720727
23.84186,4.29985026518504E-10,-93.6654663085938,-91.4199981689453,0.00011655840052398,0.0150944397768688
29.80232,5.20411183436712E-10,-92.8365325927734,-90.591064453125,0.000128133766655091,0.0165934622941289
I've tried to read the file in python and read data from each column which separated by comma. This case is not working, at some rows has more columns.
Can anyone helps me to find a solution ?
The code works, Thank you. But there is an error occurred on lines 432 and other lines of output.
16433.7,08114511022965E-12.-111,498962402344.-109,253494262695.0,000461181910941812.016438.96,5.80785796046257E-12,-112.359840393066,-110.114372253418,0.000461223557779535,0.0597289529135575
16886.6,00392270522813E-12.-112,21565246582.-109,970184326172.0,000463834922107529.016891.96,4.66341618448496E-12,-113.312957763672,-111.067489624023,0.00046386919345512,0.0600715656141191
17488.3,70004369566838E-12.-114,317932128906.-112,072463989258.0,00046732773455058.017493.96,5.22050812530021E-12,-112.822875976563,-110.57740020752,0.000467356179916467,0.0605231341575011
The data before and after the error lines are perfectly ok.
17476.08,6.52454250181715E-12,-111.854499816895,-109.609031677246,0.00046727035202924,0.0605120193526557
17482.04,3.88578861020505E-12,-114.105209350586,-111.859741210938,0.000467303551829094,0.0605163187628847
17499.92,5.28003710011641E-12,-112.773628234863,-110.528160095215,0.000467389661260297,0.0605274700279817
17505.88,4.37629603159924E-12,-113.588935852051,-111.343467712402,0.000467420448688495,0.0605314570334547
Please help me to correct it! Thanks in advance
Since it looks like you only have real numbers, you can replace every other comma:
with (open('data.txt', 'r') as fr,
open('out.csv', 'w') as fw):
fw.write('x,y1,y2,y3,y4,y5\n')
for row in fr:
l = row.split(',')
l = ['.'.join(i) for i in zip(l[::2], l[1::2])]
fw.write(','.join(l))
out.csv
:
x,y1,y2,y3,y4,y5
5.960464,6.65430376927058E-10,-91.7689743041992,-89.5235061645508,6.71407200919707E-05,0.00869479635485696
11.92093,5.23110578457515E-10,-92.8140640258789,-90.5685958862305,8.97218165598724E-05,0.011619072933792
17.88139,4.52214777469635E-10,-93.4465560913086,-91.2010803222656,0.000104674258706842,0.0135554304720727
23.84186,4.29985026518504E-10,-93.6654663085938,-91.4199981689453,0.00011655840052398,0.0150944397768688
29.80232,5.20411183436712E-10,-92.8365325927734,-90.591064453125,0.000128133766655091,0.0165934622941289
Read it with Pandas and check:
import pandas as pd
df = pd.read_csv('out.csv')
>>> df
x y1 y2 y3 y4 y5
0 5.960464 6.654304e-10 -91.768974 -89.523506 0.000067 0.008695
1 11.920930 5.231106e-10 -92.814064 -90.568596 0.000090 0.011619
2 17.881390 4.522148e-10 -93.446556 -91.201080 0.000105 0.013555
3 23.841860 4.299850e-10 -93.665466 -91.419998 0.000117 0.015094
4 29.802320 5.204112e-10 -92.836533 -90.591064 0.000128 0.016593
>>> df.dtypes
x float64
y1 float64
y2 float64
y3 float64
y4 float64
y5 float64
dtype: object