I'm having trouble getting the right result from a countplot. Consider the following dummy data:
In [111]: import pandas as pd
In [112]: import seaborn as sns
In [113]: import numpy as np
In [114]: data = pd.DataFrame({"A": [np.nan, np.nan, 2], "Cat": [0,1,0], "x":["l", "n", "k"]})
In [115]: data
Out[115]:
     A  Cat  x
0  NaN    0  l
1  NaN    1  n
2  2.0    0  k
In [116]: sns.countplot(data=data, x="x", hue="Cat")
I would expect the bars for l and n to be zero, and the bar for k to show a one. However, my countplot shows a one everywhere. What am I doing wrong? I would like to have the counts over column A.
A countplot counts the number of occurrences (rows) per x, whatever the values in A. It looks like you rather want a barplot after pre-aggregating the data:
sns.barplot(data=data.assign(A=data['A'].notna())
                     .groupby(['x', 'Cat'], as_index=False, sort=False)
                     .sum(),
            x='x', y='A', hue='Cat')
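As a quick check of the numbers behind that bar plot, here is a minimal sketch that uses pandas only (no plotting). The names rows_per_group and non_null_per_group are just illustrative, and selecting ["A"] before .sum() is a small variation on the snippet above, not a requirement.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": [np.nan, np.nan, 2], "Cat": [0, 1, 0], "x": ["l", "n", "k"]})

# What a countplot effectively counts: rows per (x, Cat) pair, regardless of A.
rows_per_group = data.groupby(["x", "Cat"], sort=False).size()

# What the pre-aggregation counts: non-NaN values of A per (x, Cat) pair.
non_null_per_group = (
    data.assign(A=data["A"].notna())
        .groupby(["x", "Cat"], as_index=False, sort=False)["A"]
        .sum()
)

print(rows_per_group)
print(non_null_per_group)
```

Every (x, Cat) pair has exactly one row, which is why the countplot shows a one everywhere, while only the pair (k, 0) has a non-missing A.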
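If you want to use a countplot, you could also convert x and Cat to category and dropna the rows where A is missing. The categorical dtype keeps l and n as known categories even after their rows are dropped: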
sns.countplot(data=data.astype({'x': 'category', 'Cat': 'category'})
                       .dropna(subset=['A']),
              x='x', hue='Cat')
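To see why the categorical conversion matters here, a small pandas-only sketch (the names kept and counts are illustrative) shows that the dropped categories are still remembered after dropna. It assumes a reasonably recent pandas, where groupby with observed=False reports unobserved category combinations with a count of zero.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": [np.nan, np.nan, 2], "Cat": [0, 1, 0], "x": ["l", "n", "k"]})

kept = (
    data.astype({"x": "category", "Cat": "category"})
        .dropna(subset=["A"])
)

# Only the row for "k" is left, but the categorical dtype still lists
# "l" and "n" as valid categories.
print(kept["x"].cat.categories.tolist())

# observed=False asks pandas to include category combinations that have no rows.
counts = kept.groupby(["x", "Cat"], observed=False).size()
print(counts)
```

Without the astype step, dropna would leave a plain object column containing only k, and nothing would record that l and n ever existed.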