pythonpandasdataframenanffill

How to forward-fill NaN values at a decaying rate in a pandas DataFrame


I try to replace NaN values in a pandas DataFrame with a forward fill method combined with a discount rate or decreasing rate of 0.9.

I have the following data set:

    Column1 Column2 Column3 Column4
0   1.0     5       -9.0    13.0
1   NaN     6       -10.0   15.0
2   3.0     7       NaN     NaN
3   NaN     8       NaN     NaN

For reproducibility:

df1 = pd.DataFrame({
                    'Column1':[1, 'NaN', 3, 'NaN'], 
                    'Column2':[5, 6, 7, 8], 
                    'Column3':[-9, -10, 'NaN', 'NaN'], 
                    'Column4':[13, 15, 'NaN', 'NaN']
                    })
df1 = df1.replace('NaN',np.nan)

I was able to replace the NaN values with the fillna command and the forward fill ffill method.

df2 = df1.fillna(method='ffill')

    Column1 Column2 Column3 Column4
0   1.0 5   -9.0    13.0
1   1.0 6   -10.0   15.0
2   3.0 7   -10.0   15.0
3   3.0 8   -10.0   15.0

Additionally, I am trying to apply the ratio 0.9 to all forward filled NaN values, which would yield the following data set:

NaN value row 2, column 3: -10 * 0.9 = -9

NaN value row 3, column 3: -9 * 0.9 = -8.1

    Column1 Column2 Column3 Column4
0   1.0     5       -9.0    13.00
1   0.9     6       -10.0   15.00
2   3.0     7       -9.0    13.50
3   2.7     8       -8.1    12.15

Is there an easy way to deal with that?

Thanks a lot!


Solution

  • Create an exponent mask by counting consecutive NaN sequences using this groupby/cumsum idea:

    groups = df1.notna().cumsum()
    exp = df1.apply(lambda col: col.isna().groupby(groups[col.name]).cumsum())
    
    #    Column1  Column2  Column3  Column4
    # 0        0        0        0        0
    # 1        1        0        0        0
    # 2        0        0        1        1
    # 3        1        0        2        2
    

    Then ffill and multiply by 0.9 ** exp:

    df2 = df1.ffill().mul(0.9 ** exp)
    
    #    Column1  Column2  Column3  Column4
    # 0      1.0      5.0     -9.0    13.00
    # 1      0.9      6.0    -10.0    15.00
    # 2      3.0      7.0     -9.0    13.50
    # 3      2.7      8.0     -8.1    12.15