I have a text file with 8 columns:
# 0.y[m] 1.uy[m_e*c] 2.x[m] 3.ux[m_ec] 4.z[m] 5.uz[m_ec] 6.w 7.ID
In this file, the first line is commented out with #
and the data columns have no labels. I import the file into a data frame using
data = pd.read_csv("file.txt", sep='\t', comment='#', header=None)
and displaying it I see:
Now I need to sort in ascending order by the last column, so I use column index 7 and this line of code:
dataS = data.sort_values(7)
and the sorting works, because now I see
but the sorting does not seem to persist, because
data[7][0] = 24641553
and
dataS[7][0] = 24641553
I need to process the sorted data frame one row after another, exactly in sorted order, so my code will rely on a for loop that uses dataS[7][i]
where i = 0, 1, 2, ...
Code is below.
import pandas as pd
data = pd.read_csv("file.txt", sep='\t', comment='#', header=None)
dataS = data.sort_values(7)
Sample text file looks like this:
#0.y[m] 1.uy[m_e*c] 2.x[m] 3.ux[m_ec] 4.z[m] 5.uz[m_ec] 6.w 7.ID
4.800773e-06 5.825619e+00 9.693396e-06 1.732705e+00 1.068944e-05 -3.532225e+00 1.255580e+04 24641553
4.359847e-06 1.275340e+01 9.564333e-06 -3.591681e-01 9.690643e-06 7.398885e+00 1.255580e+04 18676620
My problem is that I don't know how to iterate over the sorted data frame, dataS, in its sorted order. Can anyone help, please? Thanks!
The problem is not that the sorting isn't persistent; it's that the index labels are reordered along with the rows, and dataS[7][0] looks up the row labeled 0, not the first row:
dataS = data.sort_values(7)
0 1 2 ... 5 6 7
1 0.000004 12.753400 0.00001 ... 7.398885 12555.8 18676620
0 0.000005 5.825619 0.00001 ... -3.532225 12555.8 24641553
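To make the label-versus-position distinction concrete, here is a minimal sketch. It assumes a tiny stand-in frame holding only the ID column (column 7) with the two IDs from the question, rather than the full file:

```python
import pandas as pd

# Stand-in frame (assumption): only the ID column, labeled 7 as in the question.
data = pd.DataFrame({7: [24641553, 18676620]})

# sort_values reorders the rows, and each row keeps its original index label.
dataS = data.sort_values(7)
labels_after_sort = dataS.index.tolist()

# dataS[7][0] is a label lookup: it returns the row whose index label is 0.
by_label = dataS[7][0]

# .iloc is positional: it returns the first row in the sorted order.
by_position = dataS[7].iloc[0]

print(labels_after_sort, by_label, by_position)
```

So the sorted order is there; it is only the label-based lookup that still points at the original row 0.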
If you want to sort the values and get a fresh 0, 1, 2, ... index instead of the reordered one, use ignore_index=True (available since pandas 1.0):
dataS = data.sort_values(7, ignore_index=True)
0 1 2 ... 5 6 7
0 0.000004 12.753400 0.00001 ... 7.398885 12555.8 18676620
1 0.000005 5.825619 0.00001 ... -3.532225 12555.8 24641553
dataS[7][0]
will output:
18676620
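For the loop described in the question, something along these lines should work. This is a sketch, not a drop-in solution: it assumes the file is tab-separated as in the read_csv call, and uses an in-memory string (the hypothetical `sample` variable) in place of file.txt so that it runs on its own:

```python
import io
import pandas as pd

# Stand-in for file.txt (assumption): the two sample rows, tab-separated.
sample = (
    "#0.y[m]\t1.uy[m_e*c]\t2.x[m]\t3.ux[m_ec]\t4.z[m]\t5.uz[m_ec]\t6.w\t7.ID\n"
    "4.800773e-06\t5.825619e+00\t9.693396e-06\t1.732705e+00\t"
    "1.068944e-05\t-3.532225e+00\t1.255580e+04\t24641553\n"
    "4.359847e-06\t1.275340e+01\t9.564333e-06\t-3.591681e-01\t"
    "9.690643e-06\t7.398885e+00\t1.255580e+04\t18676620\n"
)
data = pd.read_csv(io.StringIO(sample), sep="\t", comment="#", header=None)

# ignore_index=True relabels the sorted rows 0, 1, 2, ...
dataS = data.sort_values(7, ignore_index=True)

# Index-based loop as in the question: label i now matches sorted position i.
ids_by_label = [dataS[7][i] for i in range(len(dataS))]

# Alternative: itertuples walks the rows in their current order,
# so it works even when the index labels were not reset.
ids_by_tuple = [row[7] for row in data.sort_values(7).itertuples(index=False)]

print(ids_by_label, ids_by_tuple)
```

Both lists come out in ascending ID order. The itertuples variant is usually the more convenient choice when the loop needs several columns of each row at once.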