I would like to generate a stacked bar chart (ideally in seaborn), but I'm happy to use the native pandas plotting functionality. Let me introduce some test data to make things clear.
In [353]: import pandas as pd
In [354]: import seaborn as sns
In [355]: df = pd.DataFrame({"Cat":["A", "B","A","B","A","B","A","B","C"], "Time":[0,0,1,1,0,0,1,1,1], "ID":[0,0,0,0,1,1,1,1,1]})
In [356]: df
Out[356]:
  Cat  Time  ID
0   A     0   0
1   B     0   0
2   A     1   0
3   B     1   0
4   A     0   1
5   B     0   1
6   A     1   1
7   B     1   1
8   C     1   1
In [357]: df.groupby(["ID","Cat"]).count()
Out[357]:
        Time
ID Cat
0  A       2
   B       2
1  A       2
   B       2
   C       1
In [358]:
What I would like to see is the IDs on the x-axis and the count (column Time) on the y-axis, stacked by the variable Cat. For example, for ID 1 I want a stacked, coloured bar whose segments have sizes 2, 2 and 1. I've tried the following without success:
df.groupby(["ID","Cat"]).count().plot(kind="bar", stacked=True, x="ID")
but it seems it can't handle the multi-level index (MultiIndex). Any help is much appreciated!
EDIT
This edit adds the problem I'm having with the legend box being placed outside the plotting window. The real code I'm using:
import datetime as dt
import matplotlib.pyplot as plt
import seaborn.objects as so

p = so.Plot(df.astype({"Time": "category"}), x='Time', color='Category').add(so.Bar(), so.Count(), so.Stack())
fig, ax = plt.subplots(figsize=(2560/120, 1335/120))
today = dt.datetime.today().strftime("%Y%m%d")
plt.grid()
p.on(ax).save(f"{today}_barchart_sources_{c}.png")
plt.close()
You would need to unstack the Cat level (and aggregate only Time):
(df.groupby(['ID', 'Cat'])['Time'].count().unstack('Cat')
.plot(kind='bar', stacked=True)
)
Output:
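To see why unstacking helps, it can be useful to look at the intermediate table before plotting. The sketch below rebuilds the question's test data and prints the wide table that the bar plot consumes. Using fill_value=0 and the pd.crosstab comparison are additions for illustration, not part of the answer above; the variable names counts, wide and same are made up here.

```python
import pandas as pd

# Test data from the question.
df = pd.DataFrame({
    "Cat": ["A", "B", "A", "B", "A", "B", "A", "B", "C"],
    "Time": [0, 0, 1, 1, 0, 0, 1, 1, 1],
    "ID": [0, 0, 0, 0, 1, 1, 1, 1, 1],
})

# Long format: one count per (ID, Cat) pair, held in a MultiIndex.
counts = df.groupby(["ID", "Cat"])["Time"].count()

# Wide format: ID stays as the index and each Cat becomes its own column.
# ID 0 has no "C" rows, so that cell would be NaN without fill_value.
wide = counts.unstack("Cat", fill_value=0)
print(wide)

# pd.crosstab builds the same table of counts directly from the two columns.
same = pd.crosstab(df["ID"], df["Cat"])
print(same)
```

With the data in this shape, wide.plot(kind='bar', stacked=True) draws one bar per row (ID) and one coloured segment per column (Cat).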
Alternatively, use seaborn's object interface, which doesn't require pre-aggregating the data. See so.Stack for more examples:
import seaborn.objects as so
(so.Plot(df.astype({'ID': 'category'}), x='ID', color='Cat')
.add(so.Bar(), so.Count(), so.Stack())
)
NB. ID is converted to category to avoid having a numeric x-axis.
Output:
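On the category conversion in the note above: a small pandas-only sketch, using the question's test data, of what astype does when given a dict. Only the named column changes dtype and the values themselves are untouched. The name as_cat is made up for this illustration.

```python
import pandas as pd

# Test data from the question.
df = pd.DataFrame({
    "Cat": ["A", "B", "A", "B", "A", "B", "A", "B", "C"],
    "Time": [0, 0, 1, 1, 0, 0, 1, 1, 1],
    "ID": [0, 0, 0, 0, 1, 1, 1, 1, 1],
})

# Passing a dict converts only the listed columns; df itself is not modified.
as_cat = df.astype({"ID": "category"})

print(df["ID"].dtype)                      # integer dtype in the original frame
print(as_cat["ID"].dtype)                  # categorical dtype in the converted copy
print(list(as_cat["ID"].cat.categories))   # the distinct ID values
```

Because the column is now categorical rather than numeric, the plot treats each ID as a discrete label on the x-axis.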