pythonpandasdataframepandas-melt

Converting a pandas dataframe in wide format to long format


I have a Pandas dataframe in wide format that looks like this:

import pandas as pd

df = pd.DataFrame({'Class_ID': {0: 432, 1: 493, 2: 32},
                   'f_proba_1': {0: 3, 1: 8, 2: 6},
                   'f_proba_2': {0: 4, 1: 9, 2: 9},
                   'f_proba_3': {0: 2, 1: 4, 2: 1},
                   'p_proba_1': {0: 3, 1: 82, 2: 36},
                   'p_proba_2': {0: 2, 1: 92, 2: 96},
                   'p_proba_3': {0: 8, 1: 41, 2: 18},
                   'Meeting_ID': {0: 27, 1: 23, 2: 21}})

df

    Class_ID    f_proba_1   f_proba_2   f_proba_3   p_proba_1   p_proba_2   p_proba_3   Meeting_ID
0   432         3           4           2           3           2           8           27
1   493         8           9           4           82          92          41          23
2   32          6           9           1           36          96          18          21

and I would like to convert to long format:

    Class_ID    Student_ID  f_proba p_proba Meeting_ID
0   432         1           3       3       27
1   432         2           4       2       27
2   432         3           2       8       27
3   493         1           8       82      23
4   493         2           9       92      23
5   493         3           4       41      23
6   32          1           6       36      21
7   32          2           9       96      21
8   32          3           1       18      21

So I have tried .melt in Pandas and here is my code

out = pd.melt(df,
              id_vars = ['Class_ID', 'Meeting_ID'],
              value_vars = ['f_proba_1','f_proba_2','f_proba_3','p_proba_1','p_proba_2','p_proba_3'],
              var_name = 'Student_ID',
              value_name = ['f_proba', 'p_proba'])
out

but it didn't work.


Solution

  • You can use pd.wide_to_long for this:

    out = (pd.wide_to_long(df, 
                           stubnames=['f_proba', 'p_proba'], 
                           i=['Class_ID', 'Meeting_ID'], 
                           j='Student_ID', 
                           sep='_')
           .reset_index()
           )
    

    Output:

       Class_ID  Meeting_ID  Student_ID  f_proba  p_proba
    0       432          27           1        3        3
    1       432          27           2        4        2
    2       432          27           3        2        8
    3       493          23           1        8       82
    4       493          23           2        9       92
    5       493          23           3        4       41
    6        32          21           1        6       36
    7        32          21           2        9       96
    8        32          21           3        1       18