first of all I apologize for my english and thanks for your time.
I´ve a problem with the labels from df or series to draw a catplot with seaborn.
I have a df like this (from a data that which was modified with pd.melt)
cardio variable value
0 0 cholesterol 0
1 1 cholesterol 1
2 1 cholesterol 1
3 1 cholesterol 0
4 0 cholesterol 0
... ... ... ...
419995 0 overweight 1
419996 1 overweight 1
419997 1 overweight 1
419998 1 overweight 1
419999 0 overweight 0
And i need to draw a sns.catplot with that data grouped by 'cardio' and 'variable', and then counted by value. So, I wrote this code:
df_cat = df_cat.groupby(['cardio','variable']).value_counts()
df_cat2=df_cat.to_frame()
The problem is that its returns a df with 2 levels of labels (the top label its the '0') like this:
0
cardio variable value
0 active 1 28643
0 6378
alco 0 33080
1 1941
cholesterol 0 29330
1 5691
As sns.catplot needs to use dataframe and correctly recognize column names, this '0' column is causing problems to create the catplot. I need to rename the columns names and remove this '0' label from the last df or name the counts column when i use groupby.value_counts() in the first df because I think that the '0' its created automatically since the 'counts' column has no name.
I expect something like this:
cardio variable value count
0 active 1 28643
0 6378
alco 0 33080
1 1941
cholesterol 0 29330
1 5691
Value_counts returns a series with a multi-index. Just reset the index and rename the column--fake data added below in example.
import pandas as pd
import numpy as np
n = 50
cats = ['cholestoral', 'active', 'alco']
data = {'cardio': np.random.randint(2, size=n),
'variable': np.random.choice(cats, size=n),
'value':np.random.randint(2, size=n)}
df = pd.DataFrame.from_dict(data)
df_plot = (df
.value_counts(subset=['cardio','variable', 'value'])
.reset_index()
.rename(columns={0:'counts'})
)