python-3.xpandasdataframe

Flatten the dataframe in pandas


I want to flatten a dataframe in pandas. This is basically by duplicating the column_names with prefix/suffix of occurence/order of the column and the number of extra columns created should be based on the number of rows.

For example:

`

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'A': [10, 20, 30, 40],
    'B': [50, 60, 70, 80],
    'C': [90, 100, 110, 120]
})

print(df)

#    id   A   B    C
# 0   1  10  50   90
# 1   2  20  60  100
# 2   3  30  70  110
# 3   4  40  80  120



#I want something like the following.

print(result_df)

#    id1  A1   B1   C1  id2  A2   B2   C2  id3  A3   B3   C3  id4  A4   B4   C4
# 0    1  10   50   90    2  20   60  100    3  30   70  110    4  40   80  120



`


Solution

  • Try this:

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({
        'id': [1, 2, 3, 4],
        'A': [10, 20, 30, 40],
        'B': [50, 60, 70, 80],
        'C': [90, 100, 110, 120]
    })
    
    df_out = df.unstack().to_frame().T.sort_index(level=0)
    df_out.columns = [f'{i}{j+1}' for i, j in df_out.columns]
    print(df_out)
    

    Output:

       id1  id2  id3  id4  A1  A2  A3  A4  B1  B2  B3  B4  C1   C2   C3   C4
    0    1    2    3    4  10  20  30  40  50  60  70  80  90  100  110  120