python pandas replace drop

Python - How to replace values in a Pandas row based on condition?


Following up on a previous question of mine from the same project, I have a Pandas dataframe sheet that looks something like this:

0  0       0       0       NaN  2022    2022    2022    NaN
1  text1   text2   text3   NaN  text4   text5   text6   NaN
2  value1  value2  value3  NaN  value4  value5  value6  NaN

Following the second NaN column, the pattern repeats with subsequent years until 2030.

First, I would like to delete the first NaN column while keeping the rest.

Secondly, I would like to then replace all the NaN at index 1 with text7.

Regarding the first problem, I tried the following:

sheet.drop(columns = sheet.columns[3], axis = 1, inplace=True)

However, that dropped EVERY column with the same label as the one I wanted to drop, instead of just that one. I couldn't figure that out, so I moved on to my second goal with the following:

values_to_replace = {'NaN':'Next Deadline'}
sheet.iloc[1].replace(values_to_replace,inplace=True)

However, that just spits out:

#SettingWithCopyWarning: 
#A value is trying to be set on a copy of a slice from a DataFrame

And nothing in my dataframe changes. I even tried turning that warning off, to no avail.

Help on either of these would be much appreciated as I've spent far too long on them and would like to move on, thank you!


Solution

  • Your first problem arises because drop works by label: sheet.columns[3] evaluates to NaN, so drop removes every column labelled NaN. To work around this, select the columns to keep by position instead (using the technique in this answer):

    import numpy as np
    
    sheet = sheet.iloc[:, np.r_[:3, 4:len(sheet.columns)]]
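
A boolean keep-mask passed to iloc is another way to drop one column by position, and it may read more clearly than np.r_ when only a single position is excluded. The sketch below is illustrative only: the frame is a hypothetical stand-in for sheet, assuming the column labels are 0, 0, 0, NaN, 2022, 2022, 2022, NaN, and the name trimmed is made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `sheet`: duplicate labels, two of them NaN (assumed layout).
labels = [0, 0, 0, np.nan, 2022, 2022, 2022, np.nan]
sheet = pd.DataFrame(
    [
        ["text1", "text2", "text3", np.nan, "text4", "text5", "text6", np.nan],
        ["value1", "value2", "value3", np.nan, "value4", "value5", "value6", np.nan],
    ],
    columns=labels,
)

# Positional keep-mask: True for every column except the one at position 3.
keep = np.ones(sheet.shape[1], dtype=bool)
keep[3] = False

# iloc selects by position, so the second NaN-labelled column is untouched.
trimmed = sheet.iloc[:, keep]
```

Because the selection never looks at the labels, it does not matter how many columns share the label NaN.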
    

    For your second issue, the warning tells you why your dataframe doesn't change: sheet.iloc[1] returns a copy of the row, and replace(..., inplace=True) modifies only that copy. Remove inplace=True and assign the result back to the same location, i.e.

    sheet.iloc[1] = sheet.iloc[1].replace(values_to_replace)
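
One caveat: a dictionary key of 'NaN' matches only the literal string 'NaN'. If the cells hold real missing values (for example after reading a spreadsheet), fillna is probably the simpler tool. Here is a minimal sketch of the assign-back pattern under that assumption. The three-row frame is a hypothetical reconstruction of sheet, and dtype=object is an assumption made so that every column can hold strings.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of `sheet` with real missing values (np.nan).
sheet = pd.DataFrame(
    [
        [0, 0, 0, np.nan, 2022, 2022, 2022, np.nan],
        ["text1", "text2", "text3", np.nan, "text4", "text5", "text6", np.nan],
        ["value1", "value2", "value3", np.nan, "value4", "value5", "value6", np.nan],
    ],
    dtype=object,  # assumption: keep all columns able to store strings
)

# The string key 'NaN' does not match real missing values, so nothing is replaced here.
unchanged = sheet.iloc[1].replace({"NaN": "text7"})

# fillna returns a new Series; assigning it back through iloc writes into the frame.
sheet.iloc[1] = sheet.iloc[1].fillna("text7")
```

Only the row at position 1 is modified; the missing values in the other rows stay as they were.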