
Preserve group columns/index after applying fillna/ffill/bfill in pandas


I have the data below. The new pandas version doesn't preserve the grouping columns after a fillna/ffill/bfill operation on a groupby. Is there a way to keep the grouping columns in the result?

import io
import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")
print(df)
   one  two  three
0    1    1   10.0
1    1    1    NaN
2    1    1    NaN
3    1    2    NaN
4    1    2   20.0
5    1    2    NaN
6    1    3    NaN
7    1    3    NaN

print(df.groupby(['one','two']).ffill())
   three
0   10.0
1   10.0
2   10.0
3    NaN
4   20.0
5   20.0
6    NaN
7    NaN

Solution

  • With the most recent pandas, if we want to keep the groupby columns, we need to use apply:

    out = df.groupby(['one','two']).apply(lambda x: x.ffill())
    print(out)
       one  two  three
    0    1    1   10.0
    1    1    1   10.0
    2    1    1   10.0
    3    1    2    NaN
    4    1    2   20.0
    5    1    2   20.0
    6    1    3    NaN
    7    1    3    NaN
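
As an alternative that avoids apply, here is a minimal sketch. It relies on the behavior shown in the question: groupby(...).ffill() keeps the original row index and only drops the key columns, so the filled columns can be joined back onto the key columns by index. The variable names `keys` and `filled` are illustrative only, and the key columns are assumed to be `one` and `two` as in the question. This may also be more stable across versions: in pandas 2.x the apply call puts the group keys into the index unless group_keys=False is passed, and pandas 2.2 warns that apply will stop operating on the grouping columns in a future release.

```python
import io

import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")

keys = ["one", "two"]  # grouping columns from the question

# Forward-fill within each group. The result has the same row index
# as df but contains only the non-key columns.
filled = df.groupby(keys).ffill()

# Join the filled columns back onto the key columns by row index.
out = df[keys].join(filled)
print(out)
```

The same pattern should work for bfill by swapping the method name. If only one column needs filling, selecting it first, as in df.groupby(keys)['three'].ffill(), and assigning the result back to that column is an equivalent option.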