python pandas multi-index

How to stack a dataframe with MultiIndex columns into the outer level of the row index


I have a dataframe with 40000 rows and 64 columns. The columns have a 2-level MultiIndex (16 labels in level 0 and 4 in level 1), while the rows have a simple index. I want the level-0 column labels to become level-0 row labels.

I have tried the .stack method, but it turns the level-0 column labels into level-1 row labels, which is not what I want.

Below is a minimal reproducible example.

import pandas as pd
import numpy as np

# This is the initial dataframe.
df_init = pd.DataFrame(
    np.linspace(0, 89, 90, dtype='int').reshape(10, 9),
    columns=pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']]),
)

# This is an intermediate transformation.
df1 = df_init.stack(level=0)

# "df_final" is what I want.
df_final = pd.concat([df1.xs(key='A', level=1), df1.xs(key='B', level=1), df1.xs(key='C', level=1)], keys=['A', 'B', 'C'])

Is there a pandas idiom that gives the desired result without using .xs, which is inconvenient in this case?


Solution

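  • You can call df.unstack twice, the second time on level=1 only. The first call returns a Series indexed by (column level 0, column level 1, row label); the second moves column level 1 back to the columns: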

    out = df_init.unstack().unstack(level=1)
    

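    Alternatively, use df.stack + df.swaplevel. You will also need df.sort_index here, because swaplevel only reorders the index levels and leaves the rows in their original order (note that the future_stack argument requires pandas 2.1 or later):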

    out2 = df_init.stack(level=0, future_stack=True).swaplevel().sort_index(level=0)
    

    Equality check:

    all(x.equals(df_final) for x in [out, out2])
    # True
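
To see what each step does, here is a small sketch on a 2x6 frame with the same layout as the question. The names df, s, swapped and the toy data are only for illustration, and future_stack is left out so the snippet also runs on pandas versions before 2.1 (newer 2.x versions may print a FutureWarning for stack).

```python
import numpy as np
import pandas as pd

# Toy frame: 2-level columns (A/B/C x a/b), simple row index 0..1.
df = pd.DataFrame(
    np.arange(12).reshape(2, 6),
    columns=pd.MultiIndex.from_product([["A", "B", "C"], ["a", "b"]]),
)

# Route 1, step 1: with a single-level row index, unstack() returns a Series
# indexed by (column level 0, column level 1, row label).
s = df.unstack()

# Route 1, step 2: move the former column level 1 back to the columns,
# leaving (column level 0, row label) as the row index.
out = s.unstack(level=1)

# Route 2: stack puts column level 0 innermost: (row label, column level 0).
stacked = df.stack(level=0)

# swaplevel only swaps the level order; the rows stay interleaved (A, B, C, A, ...).
swapped = stacked.swaplevel()

# sort_index groups the rows by the new outer level.
out2 = swapped.sort_index(level=0)

print(out)
print(out2.equals(out))
```

The intermediate swapped frame is the reason sort_index is needed in the second route: its row index starts (A, 0), (B, 0), (C, 0) rather than (A, 0), (A, 1).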