python-3.xpandasdataframemeltpandas-melt

How do I melt a pandas with custom nam


I have a table like this

device_type version pool testMean testP50 testP90 testP99 testStd WidgetMean WidgetP50 WidgetP90 WidgetP99 WidgetStd
PNB0Q7 8108162 123 124 136 140.8 141.88 21.35 2.2 0 6.4 9.64 3.92

I want it to be transformed like this:

device_type version pool Name Mean P50 P90 P99 Std
PNB0Q7 8108162 123 test 123 136 140.8 142.88 21.35
PNB0Q7 8108162 123 Widget 2.2 0 6.4 9.64 3.92

I tried using melt but getting:

df.melt(id_vars=["device_type", "version", "pool"], var_name="Name", value_name="Value")
device_type version pool Name Value
PNB0Q7 8108162 test testMean 124.00
PNB0Q7 8108162 test testP50 136.00
PNB0Q7 8108162 test testP90 140.80
PNB0Q7 8108162 test testP99 141.88
PNB0Q7 8108162 test testStd 21.35

Any idea on how to reach to expected solution


Solution

  • You can do this with pd.wide_to_long and a little column naming cleanup first, then reshape:

    df = df.rename(columns={'Std':'testStd',
                            'TestP90':'testP90',
                            'TestP99':'testP99', 
                            'TestP50':'testP50'})
    df_out = pd.wide_to_long(df, 
                             ['test','Widget'], 
                             ['device_type', 'version', 'pool'], 
                             'Measure', '', '.+' )
    df_out = df_out.unstack(-1).stack(0).reset_index()
    df_out
    

    Output:

    Measure device_type  version  pool level_3   Mean    P50    P90     P99    Std
    0            PNB0Q7  8108162   123  Widget    2.2    0.0    6.4    9.64   3.92
    1            PNB0Q7  8108162   123    test  124.0  136.0  140.8  141.88  21.35
    

    Update renaming 'level_3' above:

    df = df.rename(columns={'Std':'testStd',
                            'TestP90':'testP90',
                            'TestP99':'testP99', 
                            'TestP50':'testP50'})
    df_out = pd.wide_to_long(df, 
                             ['test','Widget'], 
                             ['device_type', 'version', 'pool'], 
                             'Measure', '', '.+' )\
                .rename_axis('Instrument', axis=1) #add this line to rename column header axis
    df_out = df_out.unstack(-1).stack(0).reset_index()
    df_out
    

    Output:

    Measure device_type  version  pool Instrument   Mean    P50    P90     P99    Std
    0            PNB0Q7  8108162   123     Widget    2.2    0.0    6.4    9.64   3.92
    1            PNB0Q7  8108162   123       test  124.0  136.0  140.8  141.88  21.35