python pandas dataframe hierarchical fillna

Inplace Forward Fill on a multi-level column dataframe


I have the following dataframe:

import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
# Blank out row "B" in every second-level "two" column
df.loc["B", (slice(None), 'two')] = np.nan

Now I want to forward-fill, in place, the data under the top-level columns "baz" and "foo" only (so not under "bar" and "qux"). I tried:

 df[["baz", "foo"]].ffill(inplace=True) 

but df still contained the NaN values afterwards; nothing was forward-filled. How can I get a dataframe in which only those two column groups are forward-filled?


Solution

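  • The problem comes from combining column selection with inplace=True: df[["baz", "foo"]] returns a new object (a copy), so ffill(inplace=True) fills that temporary copy and df itself is never modified. Select the columns with df.loc and assign the forward-filled slice back instead: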

    df.loc[:, ["baz", "foo"]] = df[["baz", "foo"]].ffill() 
    

    Output (only the baz and foo columns shown):

    first        baz                 foo          
    second       one       two       one       two
    A       0.465254  0.629161 -0.176656 -1.263927
    B       2.051213  0.629161  1.539584 -1.263927
    C      -0.463592 -0.240445 -0.014090  0.170188
    

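    Alternatively, you could use df.fillna(method='ffill'), which is equivalent on older pandas versions (the method argument has been deprecated since pandas 2.1 in favour of ffill()):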

    df.loc[:, ["baz", "foo"]] = df[["baz", "foo"]].fillna(method='ffill')
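
To see both behaviours side by side, here is a minimal sketch that rebuilds the frame from the question, tries the chained in-place call, and then applies the assign-back fix. The fixed random seed and the warning filter are my own additions (some pandas versions warn about the chained in-place call), not part of the original question or answer.

```python
import warnings

import numpy as np
import pandas as pd

# Same layout as in the question; the seed is an assumption added for reproducibility.
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
columns = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['first', 'second'])
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((3, 8)), index=['A', 'B', 'C'], columns=columns)
df.loc['B', (slice(None), 'two')] = np.nan

# Attempt from the question: the fill happens on a temporary copy of the
# selected columns, so df keeps its NaN values.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # some pandas versions warn about this pattern
    df[['baz', 'foo']].ffill(inplace=True)
still_missing = bool(np.isnan(df.loc['B', ('baz', 'two')]))
print(still_missing)  # True

# Fix from the answer: compute the filled slice and assign it back via .loc.
df.loc[:, ['baz', 'foo']] = df[['baz', 'foo']].ffill()

# "baz" and "foo" are now filled from row A; "bar" and "qux" keep their NaN.
print(df.isna().loc['B'])
```

Only the two selected column groups change, because the right-hand side is computed from that subset alone and .loc writes it back to the matching columns.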