python-3.xpandasdataframestacked

Dealing with Merged Tables that are Stacked


I imported a csv via Pandas read_table which is essentially a stacked column where each student is named and then values found in the following rows.

Student-John
01/01/2021 334
01/02/2021 456
Student-Sally
01/01/2021 76
01/04/2021 789

I want to pivot these so that each student has their own column and dates are on the left hand side.

Date Student-Jon Student-Sally
01/01/2021 334 76
01/02/2021 456
01/04/2021 789

My approach is to bring in CSV via pandas dataframe.

import pandas as pd
df = pd.read_table('C:/Users/****data.csv', skiprows=1, header=None)
df[2]=""
df.columns = "Date", "Val"

x="Start"

#Started with this although the Student line doesn't work

for ind, row in df.iterrows():
    if df['Date'][ind] == "Student*":
        x = df['Date'][ind]
        df.drop(ind, inplace=True)
    else:
        df['Val'][ind] = x 

Solution

  • Use boolean mask to filter out your dataframe then pivot to reshape it:

    # Rename columns
    df.columns = ['Date', 'Value']
    
    # Find Student rows
    m = df[0].str.startswith('Student')
    
    # Create the future column
    df['Student'] = df[0].mask(~m).ffill()
    
    # Remove Student rows
    df = df[~m]
    
    # Reshape your dataframe
    df = df.pivot('Date', 'Student', 'Value').rename_axis(columns=None).reset_index()
    

    Output:

    >>> df
             Date Student-John Student-Sally
    0  01/01/2021          334            76
    1  01/02/2021          456           NaN
    2  01/04/2021          NaN           789
    

    Setup:

    import pandas as pd
    import numpy as np
    
    data = {0: ['Student-John', '01/01/2021', '01/02/2021',
                'Student-Sally', '01/01/2021', '01/04/2021'],
            1: [np.nan, '334', '456', np.nan, '76', '789']}
    df = pd.DataFrame(data)
    print(df)
    
    # Output
                   0    1
    0   Student-John  NaN
    1     01/01/2021  334
    2     01/02/2021  456
    3  Student-Sally  NaN
    4     01/01/2021   76
    5     01/04/2021  789