I have a DataFrame with paired columns. I want to plot it so that each pair of columns has a unique color, and one column of each pair has an empty fill.
I tried this:
import pandas as pd
df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12]
})
df.plot.bar(color=['C0', 'none', 'C1', 'none'], edgecolor=['C0', 'C0', 'C1', 'C1'])
This almost works! But it applies the edgecolors row-wise instead of column-wise.
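A quick way to see the row-wise behavior is to read the edge colors back from the plotted bars. The sketch below assumes that pandas makes one `ax.bar` call per column and that each call ends up as one entry in `ax.containers`; the `Agg` backend line is only there so it runs without a display. If that assumption holds, every column should print the same sequence of edge colors, one per row.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex
import pandas as pd

df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12],
})

fig, ax = plt.subplots()
df.plot.bar(ax=ax, color=['C0', 'none', 'C1', 'none'],
            edgecolor=['C0', 'C0', 'C1', 'C1'])

# Assumption: one container per column, holding that column's bars in row order.
for column, container in zip(df.columns, ax.containers):
    print(column, [to_hex(patch.get_edgecolor()) for patch in container])
```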
I asked ChatGPT to save my butt. It gave me a solution that works (see lightly modified version below), but it's very wordy. My question is: is there a simpler way to do this, ideally using DataFrame.plot?
import pandas as pd
import matplotlib.pyplot as plt
# Sample DataFrame
df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12]
})
# Define colors for each pair
colors = ['C0', 'C1']
# Create a bar chart
fig, ax = plt.subplots()
# Number of columns
num_cols = len(df.columns)
# Bar width
bar_width = 0.2
# Plot each pair of columns
for i in range(0, num_cols, 2):
    color_i = colors[i//2]
    ax.bar(df.index + i*bar_width, df.iloc[:, i], bar_width,
           label=str(df.columns[i]), color=color_i, edgecolor=color_i)
    ax.bar(df.index + (i+1)*bar_width, df.iloc[:, i+1], bar_width,
           label=str(df.columns[i+1]), color='none', edgecolor=color_i)
# Add labels, title, and legend
ax.set_xlabel('Index')
ax.set_ylabel('Values')
ax.set_title('Bar chart with paired columns')
ax.set_xticks(df.index + bar_width * (num_cols / 2 - 0.5))
ax.set_xticklabels(df.index)
ax.legend()
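Here's one approach:
- Call df.plot.bar with only the fill colors, then use np.repeat to build the edge colors by repeating each color n times, where n = len(df).
- Loop over ax.patches and apply matplotlib.patches.Patch.set_edgecolor to each patch.
- Call plt.legend afterwards so the legend picks up the updated edge colors.
import matplotlib.pyplot as plt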
import pandas as pd
import numpy as np
df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12]
})
fig, ax = plt.subplots()
df.plot.bar(ax=ax, color=['C0', 'none', 'C1', 'none'])
edgecolors = np.repeat(['C0', 'C0', 'C1', 'C1'], len(df))
for edgecolor, patch in zip(edgecolors, ax.patches):
    patch.set_edgecolor(edgecolor)
plt.legend()
plt.tight_layout()
plt.show()
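If you need this for an arbitrary number of pairs, the same idea can be wrapped in a small function. This is only a sketch: `plot_paired_bars` is a hypothetical helper name, it assumes the columns are already ordered as (filled, empty) pairs, and it walks `ax.containers` (assumed to hold one entry per column on a fresh axes) rather than relying on the order of `ax.patches`. The `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import pandas as pd

df = pd.DataFrame({
    ('A', '1'): [1, 2, 3],
    ('A', '2'): [4, 5, 6],
    ('B', '1'): [7, 8, 9],
    ('B', '2'): [10, 11, 12],
})

def plot_paired_bars(df, ax=None):
    """Hypothetical helper: one color per column pair, second column unfilled."""
    if ax is None:
        _, ax = plt.subplots()
    n_pairs = df.shape[1] // 2
    pair_colors = [f'C{i}' for i in range(n_pairs)]
    # Fill colors alternate: pair color, then 'none' for the empty column.
    fill = [c for color in pair_colors for c in (color, 'none')]
    df.plot.bar(ax=ax, color=fill)
    # Assumption: container k holds the bars of column k.
    for k, container in enumerate(ax.containers):
        for patch in container:
            patch.set_edgecolor(pair_colors[k // 2])
    ax.legend()  # rebuild the legend after the edge colors changed
    return ax

ax = plot_paired_bars(df)
```

The legend is created inside the helper for the same reason as in the snippet above: it has to be rebuilt after the edge colors are set.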