python pandas fillna

Replace NaN with 0, starting after the first non-NaN value per row


I have a pandas DataFrame in which I need to replace the NaN values with 0, starting from the first non-NaN numeric value in each row.

It is close to this post, but with the operation done per row rather than per column: How to populate NaN by 0, starting after first non-nan value

Example df

   name day1 day2 day3 day4
0 Alice  NaN  5.0  NaN  6.0
1   Bob  NaN  NaN  3.0  2.0
2 Carol  3.0  NaN  NaN  4.0

Desired output

   name day1 day2 day3 day4
0 Alice  NaN  5.0  0.0  6.0
1   Bob  NaN  NaN  3.0  2.0
2 Carol  3.0  0.0  0.0  4.0

I tried to modify the method from the linked question, as well as a workaround from this post, but unfortunately didn't manage to get it working.

Thanks!


Solution

  • You can craft a boolean mask with notna+cummax:

    cols = list(df.filter(like='day'))
    # or
    # cols = ['day1', 'day2', 'day3', 'day4']
    
    df[cols] = df[cols].fillna(0).where(df[cols].notna().cummax(axis=1))
    

    Output:

        name  day1  day2  day3  day4
    0  Alice   NaN   5.0   0.0   6.0
    1    Bob   NaN   NaN   3.0   2.0
    2  Carol   3.0   0.0   0.0   4.0
    

    Intermediate:

    # df[cols].notna().cummax(axis=1)
    
        day1   day2  day3  day4
    0  False   True  True  True
    1  False  False  True  True
    2   True   True  True  True
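
The steps above can be reproduced end to end with a minimal, self-contained sketch. The frame is rebuilt from the question's example, and the names `mask` and `out` are only introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol'],
    'day1': [np.nan, np.nan, 3.0],
    'day2': [5.0, np.nan, np.nan],
    'day3': [np.nan, 3.0, np.nan],
    'day4': [6.0, 2.0, 4.0],
})

cols = list(df.filter(like='day'))

# True from the first non-NaN value onward in each row
mask = df[cols].notna().cummax(axis=1)

# Fill every NaN with 0, then put NaN back wherever the mask is False
out = df.copy()
out[cols] = df[cols].fillna(0).where(mask)
print(out)
```

Working on a copy (`out`) keeps the original `df` untouched, which makes it easier to compare before and after.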
    

    Or adapt the linked approach to work along rows while ignoring the non-day column (I don't like this as much as the above):

    df[cols] = df[cols].fillna(df[cols].mask(df[cols].ffill(axis=1).notna(), 0), axis=1)
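
As a quick sanity check, the two variants can be compared on the example data. This is only a sketch: the frame is rebuilt from the question, and `sub`, `via_cummax` and `via_ffill` are illustrative names, not part of the original code:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol'],
    'day1': [np.nan, np.nan, 3.0],
    'day2': [5.0, np.nan, np.nan],
    'day3': [np.nan, 3.0, np.nan],
    'day4': [6.0, 2.0, 4.0],
})

cols = ['day1', 'day2', 'day3', 'day4']
sub = df[cols]

# Variant 1: fill everything, then restore the leading NaNs via the cummax mask
via_cummax = sub.fillna(0).where(sub.notna().cummax(axis=1))

# Variant 2: build a frame that is 0 from the first valid value onward
# (NaN before it) and use it as the fill values
via_ffill = sub.fillna(sub.mask(sub.ffill(axis=1).notna(), 0), axis=1)

# DataFrame.equals treats NaNs in the same position as equal
print(via_cummax.equals(via_ffill))
```

On this example both variants should give the same frame; the check does not prove they agree on every possible input.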