pandasdataframeindexinggroup-byinfluxdb

Changing a pandas dataframe format into another format?


The given dataframe looks like this:

dataframe.head()

                     sensorA  sensorB  deviceA  deviceB  inputA  inputB  machineA  machineB  flagA  flagB  mainA
Time                                                                                                            
2021-11-26 20:20:00    379.0      0.0      0.0    489.0    0.77    35.0       0.0      51.0  -13.0  230.0    1.6
2021-11-26 20:30:00    344.0      0.0      0.0    143.0    0.76    31.0       0.0      50.0  -11.0  230.0    1.8

I want to map this to a the following format separating the individual columns into a combination of field and attribute.

Time Type attribute Value
2021-11-26 20:20:00 sensor a 999

I have tried multiple directions to approach this using multi indexing, groupby etc but cant seem to get around on how to exactly implement this ?

Any help would be appreciated!!


Solution

  • Edit

    If your column names contain '_' as separator, you can use:

    df.columns = df.columns.str.split('_', expand=True).rename(['Type', 'Tag'])
    out = df.unstack().rename('Value').reset_index(level=['Type', 'Tag']).sort_index()
    

    Extract type/tag from column names with a regular expression:

    types = ['sensor', 'device', 'input', 'machine', 'flag', 'main']
    pat = fr"({'|'.join(types)})(.*)"
    
    df.columns = pd.MultiIndex.from_frame(df.columns.str.extract(pat), 
                                          names=['Type', 'Tag'])
    out = df.unstack().rename('Value').reset_index(level=['Type', 'Tag']).sort_index()
    

    Output:

    >>> out
                            Type Tag   Value
    Time                                    
    2021-11-26 20:20:00   sensor   A  379.00
    2021-11-26 20:20:00     flag   B  230.00
    2021-11-26 20:20:00     flag   A  -13.00
    2021-11-26 20:20:00  machine   B   51.00
    2021-11-26 20:20:00  machine   A    0.00
    2021-11-26 20:20:00     main   A    1.60
    2021-11-26 20:20:00    input   A    0.77
    2021-11-26 20:20:00    input   B   35.00
    2021-11-26 20:20:00   device   B  489.00
    2021-11-26 20:20:00   device   A    0.00
    2021-11-26 20:20:00   sensor   B    0.00
    2021-11-26 20:30:00    input   A    0.76
    2021-11-26 20:30:00   device   A    0.00
    2021-11-26 20:30:00    input   B   31.00
    2021-11-26 20:30:00  machine   A    0.00
    2021-11-26 20:30:00   sensor   B    0.00
    2021-11-26 20:30:00  machine   B   50.00
    2021-11-26 20:30:00     flag   A  -11.00
    2021-11-26 20:30:00   sensor   A  344.00
    2021-11-26 20:30:00     flag   B  230.00
    2021-11-26 20:30:00   device   B  143.00
    2021-11-26 20:30:00     main   A    1.80