I have a pandas dataframe and a list as follows:
mylist = ['nnn', 'mmm', 'yyy']
mydata =
   xxx  yyy  zzz  nnn  ddd  mmm
0    0   10    5    5    5    5
1    1    9    2    3    4    4
2    2    8    8    7    9    0
Now I want to keep only the columns listed in mylist and save the result as a CSV file, i.e.
   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
My current code is as follows.
mydata = pd.read_csv(input_file, header=0)
for item in mylist:
    mydata_new = mydata[item]
print(mydata_new)
mydata_new.to_csv(file_name)
It seems that my new dataframe contains the wrong result. Where am I going wrong? Please help me!
Just pass a list of column names to index df:
df[['nnn', 'mmm', 'yyy']]
   nnn  mmm  yyy
0    5    5   10
1    3    4    9
2    7    0    8
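The loop in the question fails because every iteration overwrites mydata_new with a single column, so only the last one ('yyy') survives. Here is a minimal sketch of the list-indexing approach applied to the question's data. The dataframe is rebuilt by hand from the sample shown in the question instead of being read from input_file, and csv_text is a name introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question (built by hand instead of read_csv).
mydata = pd.DataFrame({
    "xxx": [0, 1, 2],
    "yyy": [10, 9, 8],
    "zzz": [5, 2, 8],
    "nnn": [5, 3, 7],
    "ddd": [5, 4, 9],
    "mmm": [5, 4, 0],
})
mylist = ["nnn", "mmm", "yyy"]

# Indexing with a list returns a DataFrame holding those columns,
# in the order given by the list.
mydata_new = mydata[mylist]

# With no path, to_csv returns the CSV text. Pass a file name as the
# first argument to write a file instead. index=False drops the row labels.
csv_text = mydata_new.to_csv(index=False)
print(csv_text)
```

Dropping index=False keeps the row labels as the first CSV column, which is what the original mydata_new.to_csv(file_name) call would do.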
If you need to handle non-existent column names in your list, try filtering with df.columns.isin:
df.loc[:, df.columns.isin(['nnn', 'mmm', 'yyy', 'zzzzzz'])]
   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
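Note that the isin mask returns the columns in the dataframe's own order, not the order of the list, which is why yyy comes first above. If the list order matters, one alternative is to filter the list itself before indexing. The sketch below compares the two approaches on the question's data. The names wanted, present, by_mask and by_list are introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "xxx": [0, 1, 2],
    "yyy": [10, 9, 8],
    "zzz": [5, 2, 8],
    "nnn": [5, 3, 7],
    "ddd": [5, 4, 9],
    "mmm": [5, 4, 0],
})
wanted = ["nnn", "mmm", "yyy", "zzzzzz"]  # "zzzzzz" is not a column of df

# Boolean mask over df.columns: unknown names are ignored and the
# result follows the dataframe's column order.
by_mask = df.loc[:, df.columns.isin(wanted)]

# Filtering the list first: unknown names are dropped and the
# result follows the order of `wanted`.
present = [col for col in wanted if col in df.columns]
by_list = df[present]

print(list(by_mask.columns))
print(list(by_list.columns))
```

Both variants silently skip missing names, so if a missing column should count as an error, a check such as comparing len(present) with len(wanted) may be worth adding.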