python pandas matplotlib plot

pandas bar chart with paired columns


I have a DataFrame with paired columns. I want to plot it such that each pair of columns has a unique color, and one column of each pair has an empty fill.

I tried this:

import pandas as pd

df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12]
})

df.plot.bar(color=['C0', 'none', 'C1', 'none'], edgecolor=['C0', 'C0', 'C1', 'C1'])

This almost works! But the edgecolor list is applied row-wise (cycled across the bars within each column) instead of column-wise.

Bar chart with paired columns (wrong)

I asked ChatGPT to save my butt. It gave me a solution that works (see lightly modified version below), but it's very wordy. My question is, is there a simpler way to do this, ideally using DataFrame.plot?

import pandas as pd
import matplotlib.pyplot as plt

# Sample DataFrame
df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12]
})

# Define colors for each pair
colors = ['C0', 'C1']

# Create a bar chart
fig, ax = plt.subplots()

# Number of columns
num_cols = len(df.columns)

# Bar width
bar_width = 0.2

# Plot each pair of columns
for i in range(0, num_cols, 2):
    color_i = colors[i//2]
    ax.bar(df.index + i*bar_width, df.iloc[:, i], bar_width, 
           label=str(df.columns[i]), color=color_i, edgecolor=color_i)
    ax.bar(df.index + (i+1)*bar_width, df.iloc[:, i+1], bar_width, 
           label=str(df.columns[i+1]), color='none', edgecolor=color_i)

# Add labels, title, and legend
ax.set_xlabel('Index')
ax.set_ylabel('Values')
ax.set_title('Bar chart with paired columns')
ax.set_xticks(df.index + bar_width * (num_cols / 2 - 0.5))
ax.set_xticklabels(df.index)
ax.legend()

Bar chart with paired columns


Solution

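  • Here's one approach: let DataFrame.plot draw the bars with the fill colors, then set the edge color of each patch afterwards.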

    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({
        ('A', '1'): [1, 2, 3],
        ('A', '2'): [4, 5, 6],
        ('B', '1'): [7, 8, 9],
        ('B', '2'): [10, 11, 12]
    })
    
    fig, ax = plt.subplots()
    
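    # Let pandas draw the bars; 'none' leaves the second column of each pair unfilled
    df.plot.bar(ax=ax, color=['C0', 'none', 'C1', 'none'])
    
    # ax.patches is ordered column by column, so repeat each column's
    # edge color once per row
    edgecolors = np.repeat(['C0', 'C0', 'C1', 'C1'], len(df))
    
    for edgecolor, patch in zip(edgecolors, ax.patches):
        patch.set_edgecolor(edgecolor)
        
    # Re-create the legend so its handles reflect the new edge colors
    plt.legend()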
    plt.tight_layout()
    plt.show()
    

    Plot:

    plot with correct edgecolors
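
If you would rather not hard-code the color lists, the same idea can be sketched with `ax.containers`, which holds one `BarContainer` per plotted column. The helper name `plot_paired_bars` is hypothetical (it is not part of pandas or matplotlib), and the sketch assumes the columns come in adjacent pairs and that one default-cycle color (`C0`, `C1`, ...) per pair is acceptable.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12],
})


def plot_paired_bars(df, ax=None):
    """Hypothetical helper: filled bars for the first column of each
    pair, hollow bars with a matching outline for the second."""
    # Assumption: an even number of columns, paired as (0, 1), (2, 3), ...
    n_pairs = len(df.columns) // 2
    pair_colors = [f'C{i}' for i in range(n_pairs)]

    # Fill colors alternate between the pair color and 'none'
    fill = [c for color in pair_colors for c in (color, 'none')]
    ax = df.plot.bar(ax=ax, color=fill)

    # One BarContainer per column, in column order
    for i, container in enumerate(ax.containers):
        for patch in container:
            patch.set_edgecolor(pair_colors[i // 2])

    # Re-create the legend so its handles reflect the new edge colors
    ax.legend()
    return ax


ax = plot_paired_bars(df)
# Call plt.show() (or ax.figure.savefig(...)) to view the result
```

Looping over containers rather than `ax.patches` avoids relying on the number of rows, at the cost of a nested loop.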