[SOLVED] Why the behaviour of set_levels is unpredictable?

Why the behaviour of set_levels is unpredictable?

I have this dataframe that is read from excel file :

import pandas as pd
import numpy as np

columns = pd.MultiIndex.from_product([['A', 'B'], ['x', 'y', 'z']])
df = pd.DataFrame(np.arange(1, len(columns)*5+1).reshape(-1, len(columns)), index=range(5), columns=columns)

print(df)

    A           B        
    x   y   z   x   y   z
0   1   2   3   4   5   6
1   7   8   9  10  11  12
2  13  14  15  16  17  18
3  19  20  21  22  23  24
4  25  26  27  28  29  30

I need to make transformation to have unique columns names in the bottom level of columns.

I made the new columns and use set_levels but the result is weird. It takes only the left part of new_columns. And sometimes in other datasets set_levels has stricly no effect. https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.set_levels.html

new_columns = ['x1', 'y1', 'z1'] + ['x2', 'y2', 'z2']

df.columns = df.columns.set_levels(new_columns, level=1)

print(df)

    A           B        
   x1  y1  z1  x1  y1  z1
0   1   2   3   4   5   6
1   7   8   9  10  11  12
2  13  14  15  16  17  18
3  19  20  21  22  23  24
4  25  26  27  28  29  30

Am I missing something guys ?

Solution

set_levels is not doing what you think it is.

Indices are categorical values, they are encoded as integers. You can't pass a list of new values.

In fact, if you check the codes of your indices:

df.columns.codes

FrozenList([[0, 0, 0, 1, 1, 1],
            [0, 1, 2, 0, 1, 2]])

You will see that 0, 1, 2 in the level 1 are duplicated, thus passing ['x1', 'y1', 'z1', 'x2', 'y2', 'z2'], only the first 3 values are used.

You must craft a new MultiIndex:

data = list(zip(*df.columns.to_list()))
# or, shortcut
# data = list(zip(*df))
data[1] = new_columns

df.columns = pd.MultiIndex.from_arrays(data)

Output:

    A           B        
   x1  y1  z1  x2  y2  z2
0   1   2   3   4   5   6
1   7   8   9  10  11  12
2  13  14  15  16  17  18
3  19  20  21  22  23  24
4  25  26  27  28  29  30