I have the following dataframe:
Account ID | Subscription type | Cancellation flag |
---|---|---|
123 | Basic | 1 |
222 | Basic | 0 |
234 | Hybrid | 1 |
345 | Hybrid | 1 |
Now I would like to calculate the percentage of cancellations, but grouped by the subscription type. I would like to get it in a format so that I can easily create a bar chart out of the percentages grouped by the subscription type.
Use a groupby.mean
:
out = df.groupby('Subscription type')['Cancellation flag'].mean().mul(100)
Output:
Subscription type
Basic 50.0
Hybrid 100.0
Name: Cancellation flag, dtype: float64
Then plot.bar
:
out.plot.bar()
Or directly with seaborn.barplot
:
import seaborn as sns
sns.barplot(df, x='Subscription type', y='Cancellation flag',
estimator='mean', errorbar=None)
Output: