
How to create a stacked bar chart with a time series and aggregated values


I am having difficulty creating a stacked bar chart of a time series from my pandas DataFrame (image below). I would like 'Date' on the x-axis, 'Hours' on the y-axis, and each bar split to show the time spent in each group of 'Category'.

[Image: sample dataframe data]

Do I need to use the pandas groupby function? The dataframe shown is only a sample; I have hundreds of rows of data from 2018 to 2020.


Solution

  • Imports and Data Transformation

    import pandas as pd
    import matplotlib.pyplot as plt
    import random  # for test data
    import numpy as np  # for test data
    
    # setup dataframe with test data
    np.random.seed(365)
    random.seed(365)
    rows = 1100
    data = {'hours': np.random.randint(10, size=(rows)),
            'group': [random.choice(['A', 'B', 'C']) for _ in range(rows)],
            # hourly timestamps; date_range is used because an hourly freq also covers weekends
            'date': pd.date_range('2020-11-24', freq='h', periods=rows).tolist()}
    df = pd.DataFrame(data)
    
    # display(df.head())
    #    hours group                date
    # 0      2     C 2020-11-24 00:00:00
    # 1      4     B 2020-11-24 01:00:00
    # 2      1     C 2020-11-24 02:00:00
    # 3      5     A 2020-11-24 03:00:00
    # 4      2     B 2020-11-24 04:00:00
    
    # use groupby on df
    dfg = df.groupby([df.date.dt.date, 'group'])['hours'].sum().reset_index()
    
    # pivot the dataframe into the correct format
    dfp = dfg.pivot(index='date', columns='group', values='hours')
    
    # display(dfp.head())
    # group        A   B   C
    # date
    # 2020-11-24  49  25  29
    # 2020-11-25  62  18  57
    # 2020-11-26  42  77   4
    # 2020-11-27  34  43  17
    # 2020-11-28  28  53  23
    
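    # alternative: pivot_table aggregates and reshapes in one step,
    # replacing the groupby and pivot above
    dfp = df.pivot_table(index=df.date.dt.date, columns='group', values='hours', aggfunc='sum')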
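
As a quick check that the two routes above agree, here is a minimal sketch on a hand-made frame. The `toy` data and the names `by_group` and `by_table` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# hypothetical toy data: two days, two groups, several rows per day
toy = pd.DataFrame({
    'date': pd.to_datetime(['2020-11-24 09:00', '2020-11-24 13:00',
                            '2020-11-24 15:00', '2020-11-25 10:00',
                            '2020-11-25 11:00']),
    'group': ['A', 'A', 'B', 'A', 'B'],
    'hours': [2, 3, 4, 1, 5],
})

# route 1: sum per (day, group), then pivot the groups into columns
by_group = (toy.groupby([toy.date.dt.date, 'group'])['hours']
               .sum()
               .reset_index()
               .pivot(index='date', columns='group', values='hours'))

# route 2: pivot_table does the aggregation and the reshape together
by_table = toy.pivot_table(index=toy.date.dt.date, columns='group',
                           values='hours', aggfunc='sum')

# both give A=5, B=4 for 2020-11-24 and A=1, B=5 for 2020-11-25
print(by_group)
print(by_table)
```

Either route should work for plotting. `pivot_table` is shorter, while the `groupby` version keeps the intermediate long-format sums around if you need them elsewhere.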
    

  • Plot

    # plot the pivoted dataframe
    dfp.plot.bar(stacked=True, figsize=(10, 6), ylabel='Hours', xlabel='Date', title='Sum of Daily Category Hours')
    plt.legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.show()
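
To see why the wide (pivoted) layout matters, the sketch below inspects what a stacked bar plot draws: one set of bar segments per column, with each segment starting where the previous column's segment ended. The frame `wide` is made up for illustration, and the `Agg` backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical wide frame: one row per date, one column per category
wide = pd.DataFrame({'A': [5, 1], 'B': [4, 5]},
                    index=pd.Index(['2020-11-24', '2020-11-25'], name='date'))

ax = wide.plot.bar(stacked=True)

# ax.containers holds one group of rectangles per column, in column order
n_segments = [len(c) for c in ax.containers]

# the 'B' segments sit on top of the 'A' segments:
# their lower edge equals the 'A' value for the same date
bottoms_B = [rect.get_y() for rect in ax.containers[1]]
heights_B = [rect.get_height() for rect in ax.containers[1]]

print(n_segments, bottoms_B, heights_B)
plt.close('all')
```

So each column of the pivoted frame becomes one legend entry, and each row becomes one bar.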
    


    # horizontal stacked bars, which leave more room for the date labels
    dfp.plot.barh(stacked=True, figsize=(6, 10), title='Sum of Daily Category Hours')
    plt.legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.xlabel('Hours')
    plt.ylabel('Date')
    plt.show()
    


    # line plot of the same pivoted data, one line per category
    dfp.plot(figsize=(10, 6))
    plt.show()
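
The question mentions data running from 2018 to 2020. One bar per day over that range would likely be too crowded to read, so one option is to aggregate by month instead. The following is a sketch under that assumption; `long_df` and `monthly` are illustrative names, and the data is randomly generated rather than taken from the question.

```python
import numpy as np
import pandas as pd

# hypothetical data: one record per day from 2018 through 2020
rng = np.random.default_rng(365)
dates = pd.date_range('2018-01-01', '2020-12-31', freq='D')
long_df = pd.DataFrame({
    'date': dates,
    'group': rng.choice(['A', 'B', 'C'], size=len(dates)),
    'hours': rng.integers(0, 10, size=len(dates)),
})

# group by calendar month rather than by day;
# fill_value=0 keeps a month without a given category from showing up as NaN
monthly = long_df.pivot_table(index=long_df.date.dt.to_period('M'),
                              columns='group', values='hours',
                              aggfunc='sum', fill_value=0)

print(monthly.head())

# then plot as before, for example:
# monthly.plot.bar(stacked=True, figsize=(12, 6), ylabel='Hours', xlabel='Month')
```

Swapping `'M'` for `'W'` or `'Q'` in `to_period` should give weekly or quarterly bars in the same way.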
    
