
For each set of 5 columns, drop the 3rd, 4th and 5th columns


I am cleaning a pandas DataFrame imported from a .csv file. The first and second columns hold useful data, and columns 3-5 are junk. This pattern repeats across the frame: in every block of five columns, the first two are useful and the last three are junk. I can remove the junk columns using the code below:

df1 = df.drop(columns=df.columns[4::5])
df1 = df1.drop(columns=df1.columns[3::4])
df1 = df1.drop(columns=df1.columns[2::3])

Is there a solution to do this all in one line?


Solution

  • I think three lines is fine. The code won't get any clearer or faster from putting it all on one line.

    Of course, you can always do:

    columns = df.columns[:]
    df1 = df.drop(columns=columns[4::5]).drop(columns=columns[3::5]).drop(columns=columns[2::5])
    

    which I think also makes it clearer that you intend to drop the third, fourth and fifth column in every group of five.
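
If a single expression is still wanted, one possible alternative (a sketch, not the approach given above) is to select the columns to keep by position instead of dropping the rest by label. The 10-column frame and the names `c0`..`c9` below are made up for illustration; the only assumption carried over from the question is that the useful columns sit at positions 0 and 1 of every block of five.

```python
import pandas as pd

# Hypothetical 10-column frame standing in for the imported CSV.
df = pd.DataFrame([list(range(10))], columns=[f"c{i}" for i in range(10)])

# Keep positions 0 and 1 within every block of five columns.
# Selection is positional (iloc), so it does not depend on column labels.
keep = [i for i in range(df.shape[1]) if i % 5 < 2]
df1 = df.iloc[:, keep]

print(list(df1.columns))  # ['c0', 'c1', 'c5', 'c6']
```

Because this selects by position, it should also behave as intended when column labels are duplicated, whereas `drop(columns=...)` removes every column that shares a dropped label.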