python pandas matplotlib stacked-bar-chart plot-annotations

Display totals and percentage in stacked bar chart using DataFrame.plot


My data frame looks like this:

Airport  ATA Cost  Destination Handling  Custom  Total Cost
PRG        599222                 11095   20174      630491
LXU        364715                 11598   11595      387908
AMS        401382                 23562   16680      441623
PRG        599222                 11095   20174      630491

The code below produces a stacked bar chart:

import pandas as pd

# sample dataframe
data = {'Airport': ['PRG', 'LXU', 'AMS', 'PRG'],
        'ATA Cost': [599222, 364715, 401382, 599222],
        'Destination Handling': [11095, 11598, 23562, 11095],
        'Custom': [20174, 11595, 16680, 20174],
        'Total Cost': [630491, 387908, 441623, 630491]}
df = pd.DataFrame(data)

# plot columns without Total Cost
df.iloc[:, :-1].plot(x='Airport', kind='barh', stacked=True, title='Breakdown of Costs', mark_right=True)    

How can I add the totals (with a thousands separator, e.g. 1,000) at the end of each stacked bar? And how can I add the percentage for each segment of the stacked bars?


Solution

  • You can use plt.text to place each label at a position computed from your data.

    However, bars with very small segments may need some tweaking before the labels look right.

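    import matplotlib.pyplot as plt
    import numpy as np

    df_total = df['Total Cost']
    df = df.iloc[:, 0:4]
    df.plot(x='Airport', kind='barh', stacked=True, title='Breakdown of Costs', mark_right=True)

    # share of each cost component in the row total, in percent
    df_rel = df[df.columns[1:]].div(df_total, 0) * 100

    for n in df_rel:
        for i, (cs, ab, pc, tot) in enumerate(zip(df.iloc[:, 1:].cumsum(1)[n], df[n], df_rel[n], df_total)):
            # total at the end of the bar, with a thousands separator
            plt.text(tot, i, f'{tot:,}', va='center')
            # percentage centred in the segment: right edge minus half the width
            plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')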
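The label positions in the loop above come from simple arithmetic: `cumsum(1)` gives the x value where each stacked segment ends, and subtracting half the segment width (`cs - ab/2`) gives its midpoint. A minimal sketch of the same calculation in vectorized form, using the question's sample data. The names `parts`, `right_edge`, `midpoint` and `share` are illustrative only, and the row total is recomputed from the components here rather than read from the `Total Cost` column (an assumption):

```python
import pandas as pd

# Cost components from the question's sample data (Total Cost left out).
parts = pd.DataFrame({
    'ATA Cost': [599222, 364715, 401382, 599222],
    'Destination Handling': [11095, 11598, 23562, 11095],
    'Custom': [20174, 11595, 16680, 20174],
}, index=['PRG', 'LXU', 'AMS', 'PRG'])

right_edge = parts.cumsum(axis=1)   # x where each stacked segment ends
midpoint = right_edge - parts / 2   # x of each segment's centre
# Percentage of the row total (assumption: total = sum of the components).
share = parts.div(parts.sum(axis=1), axis=0) * 100

print(midpoint)
print(share.round(1))
```

Each entry of `midpoint` is the x coordinate passed to `plt.text` for the percentage label, and the matching entry of `share` is the number that gets printed there.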
    

    EDIT: Some arbitrary ideas for better readability:

    shift the total values to the right, use 45° rotated text:

        plt.text(tot+10000, i, f'{tot:,}', va='center')
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center', rotation=45)
    

    switch between top- and bottom-aligned text:

    va = ['top', 'bottom']
    va_idx = 0
    for n in df_rel:
        va_idx = 1 - va_idx
        for i, (cs, ab, pc, tot) in enumerate(zip(df.iloc[:, 1:].cumsum(1)[n], df[n], df_rel[n], df_total)):
            plt.text(tot+10000, i, f'{tot:,}', va='center')
            plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va=va[va_idx], ha='center')
    

    label only bars with 10% or more:

    if pc >= 10:
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')
    

    ...or still print them, but vertical:

    if pc >= 10:
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')
    else:
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center', rotation=90)
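The variations above can be folded into one function. The following is only a sketch under stated assumptions: `annotate_stacked_barh`, `min_pct` and `pad` are hypothetical names, not part of pandas or matplotlib; totals are recomputed from the plotted columns; and the `Agg` backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this line for interactive use
import pandas as pd


def annotate_stacked_barh(ax, parts, min_pct=10, pad=10000):
    """Hypothetical helper: write totals and segment percentages on a stacked barh plot.

    Segments below `min_pct` percent get vertical text; totals are shifted
    right by `pad` data units.
    """
    totals = parts.sum(axis=1)
    midpoints = parts.cumsum(axis=1) - parts / 2   # centre of each segment
    shares = parts.div(totals, axis=0) * 100       # percent of the row total
    for i in range(len(parts)):
        total = int(totals.iloc[i])
        # One total per bar, formatted with a thousands separator.
        ax.text(total + pad, i, f'{total:,}', va='center')
        for col in parts.columns:
            pct = shares[col].iloc[i]
            ax.text(midpoints[col].iloc[i], i, f'{pct:.1f}%',
                    va='center', ha='center',
                    rotation=0 if pct >= min_pct else 90)


df = pd.DataFrame({
    'Airport': ['PRG', 'LXU', 'AMS', 'PRG'],
    'ATA Cost': [599222, 364715, 401382, 599222],
    'Destination Handling': [11095, 11598, 23562, 11095],
    'Custom': [20174, 11595, 16680, 20174],
})
parts = df.set_index('Airport')
ax = parts.plot(kind='barh', stacked=True, title='Breakdown of Costs')
annotate_stacked_barh(ax, parts)
```

Unlike the loops above, this writes each total only once per bar instead of once per column. Positional access with `iloc` is used throughout because the sample data contains the airport `PRG` twice.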
    

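As an alternative to computing positions by hand, matplotlib 3.4 and later provide `Axes.bar_label`, which places labels relative to the bars themselves. This is a different technique from the `plt.text` approach above, sketched here on the assumption that the plot is drawn directly from the cost columns and that the totals are recomputed from those columns. The `Agg` backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this line for interactive use
import pandas as pd

df = pd.DataFrame({
    'Airport': ['PRG', 'LXU', 'AMS', 'PRG'],
    'ATA Cost': [599222, 364715, 401382, 599222],
    'Destination Handling': [11095, 11598, 23562, 11095],
    'Custom': [20174, 11595, 16680, 20174],
})
parts = df.set_index('Airport')
totals = parts.sum(axis=1)

ax = parts.plot(kind='barh', stacked=True, title='Breakdown of Costs')

# The plot creates one BarContainer per column, in column order.
for container, col in zip(ax.containers, parts.columns):
    pct = parts[col] / totals * 100
    # Label only segments of 10% or more; an empty string draws nothing visible.
    labels = [f'{p:.1f}%' if p >= 10 else '' for p in pct]
    ax.bar_label(container, labels=labels, label_type='center')

# Edge labels on the last container sit at the end of the full stack.
total_labels = ax.bar_label(ax.containers[-1],
                            labels=[f'{int(t):,}' for t in totals],
                            label_type='edge', padding=3)
```

Here `padding` is given in points rather than data units, so the gap between bar and total stays the same when the x-axis scale changes.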