I have a pandas dataframe:
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
df['Label'] = np.random.randint(0,2,size=100)
I would like to create a figure in Python where the x-axis shows the class labels ('Class 0' and 'Class 1') and, for each class and a pre-defined variable such as 'B', a violin plot (with a box plot inside) is drawn.
Reshape the dataframe to long form with pandas.DataFrame.melt, and then plot with seaborn.violinplot or seaborn.catplot.
Tested in python 3.12.3, pandas 2.2.2, seaborn 0.13.2, matplotlib 3.8.4
import pandas as pd
import seaborn as sns
import numpy as np # for random data
# sample data
np.random.seed(2024)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
df['Class'] = np.random.randint(0, 2, size=100)
# melt the dataframe to a long form
dfm = df.melt(id_vars='Class', var_name='Group')
# display(dfm.head())
   Class Group  value
0      0     A      8
1      0     A     36
2      0     A     65
3      1     A     66
4      0     A     74
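To make the reshaping step easier to follow, here is a minimal sketch of melt on a two-row frame. The frame `wide` and its values are hypothetical and only serve as an illustration; the column names mirror the ones used above.

```python
import pandas as pd

# hypothetical two-row frame, only to make the wide-to-long reshaping visible
wide = pd.DataFrame({"A": [1, 2], "B": [3, 4], "Class": [0, 1]})

# 'Class' is kept as an identifier; every other column is stacked into
# a 'Group' column (the old column name) and a 'value' column (its data)
long = wide.melt(id_vars="Class", var_name="Group")
print(long)
#    Class Group  value
# 0      0     A      1
# 1      1     A      2
# 2      0     B      3
# 3      1     B      4
```

Each original row contributes one long-form row per melted column, which is the layout seaborn expects for x, y, and hue.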
seaborn.violinplot
ax = sns.violinplot(data=dfm, x='Group', y='value', hue='Class')
sns.move_legend(ax, bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
[Figure: violin plot with x='Group', hue='Class']
[Figure: violin plot with x='Class', hue='Group']
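For the single-variable case described in the question, melting is not strictly needed. The following is a sketch under a few assumptions: `variable`, `plot_df`, and `ClassName` are illustrative names, not part of the original code, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required

import numpy as np
import pandas as pd
import seaborn as sns

# same sample data as above
np.random.seed(2024)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
df["Class"] = np.random.randint(0, 2, size=100)

variable = "B"  # hypothetical name for the pre-defined column to plot

# map the integer class codes to readable x-axis labels
plot_df = df[[variable]].assign(ClassName=df["Class"].map({0: "Class 0", 1: "Class 1"}))

# inner='box' draws a small box plot inside each violin; it is the seaborn
# default, but it is written out here because the question asks for it
ax = sns.violinplot(data=plot_df, x="ClassName", y=variable,
                    order=["Class 0", "Class 1"], inner="box")
ax.set(xlabel="", title=f"Distribution of {variable} by class")
```

Changing `variable` to another column name plots that column instead; `order` fixes the left-to-right position of the two classes.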
seaborn.catplot
seaborn.catplot with kind='violin' draws each 'Group' in its own facet.
hue='Group' or hue='Class' can be passed for color; however, each 'Class' and 'Group' is already uniquely identified, so adding color is redundant. Color should only be added if it conveys additional information.
g = sns.catplot(kind='violin', data=dfm, x='Class', y='value', col='Group', col_wrap=2)
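Since catplot returns a FacetGrid, the facets can be labeled after plotting. This is a rough sketch rather than a required step: the title template, the axis labels, and `height=3` are illustrative choices, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required

import numpy as np
import pandas as pd
import seaborn as sns

# same sample data and long-form reshaping as above
np.random.seed(2024)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
df["Class"] = np.random.randint(0, 2, size=100)
dfm = df.melt(id_vars="Class", var_name="Group")

# one facet per 'Group'; inner='box' is forwarded to the underlying violin plot
g = sns.catplot(kind="violin", data=dfm, x="Class", y="value",
                col="Group", col_wrap=2, inner="box", height=3)

# FacetGrid helpers for per-facet titles and shared axis labels
g.set_titles(col_template="Group {col_name}")
g.set_axis_labels("Class", "value")
```

With col_wrap set, `g.axes` is a flat array of the four facet axes, so individual facets can be adjusted by indexing into it.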