I'm trying to make a scatter-plot pairplot with histograms on the diagonal, but when I add a hue the histograms become invalid.
My code before hue:
import seaborn as sn
sn.pairplot(dropped_data)
My code after adding hue:
sn.pairplot(dropped_data, hue='damage rating')
What I have tried:
sn.pairplot(dropped_data, hue='damage rating', diag_kind='hist', kind='scatter')
When using a hue, the diagonal histograms go all weird and become incorrect. How can I fix this?
It looks like the hue column is continuous and contains only unique values. As the diagonal is built up of kdeplots, those won't work when each kde is built from only one value.
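A quick way to check this diagnosis is to count the distinct values in the hue column. The sketch below is only an illustration: the frame `demo` and its random contents are made up, so substitute your own DataFrame and hue column name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: one sensor column and a continuous hue column.
rng = np.random.default_rng(0)
demo = pd.DataFrame({'Sensor 1': rng.normal(size=100),
                     'damage': rng.random(100)})

n_rows = len(demo)
n_levels = demo['damage'].nunique()                   # number of distinct hue values
largest_group = demo['damage'].value_counts().max()   # rows in the most common hue value

print(f'{n_levels} hue levels for {n_rows} rows, largest group has {largest_group} row(s)')
```

If the number of hue levels is close to the number of rows and the largest group holds a single row, every per-hue kde on the diagonal would have to be estimated from one point.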
One way to tackle this is to use stacked histplots. This might be slow when a lot of data is involved.
Another approach is to make the hue column discrete, e.g. by rounding its values.
First, let's try to recreate the problem with easily reproducible data:
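import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# reproducible test data: three sensor columns plus a continuous 'damage' column
np.random.seed(20220226)
df = pd.DataFrame({f'Sensor {i}': np.random.randn(100) for i in range(1, 4)})
df['damage'] = np.random.rand(100)
# the problem: every hue value is unique, so each kde on the diagonal is built from one value
sns.pairplot(df, hue="damage")
# approach 1: stacked histograms on the diagonal
sns.pairplot(df, hue="damage", diag_kind='hist', diag_kws={'multiple': 'stack'})
# approach 2: make the hue discrete by rounding to multiples of 0.2
df['damage'] = (df['damage'] * 5).round() / 5
sns.pairplot(df, hue="damage")
plt.show()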