python seaborn kdeplot stripplot

How to align KDE Plot with Strip Plot in Seaborn?


I am encountering difficulties adjusting the Seaborn KDE plot to properly align it with a strip plot in my visualization. Despite attempting various methods, such as modifying bw_adjust and manually scaling the KDE plot, I have not been able to achieve the desired outcome.

I aim to plot the KDE and strip plot for a column in my dataframe to visualize the data distribution as well as visualize the data points with respect to the KDE. The strip plot is also plotted with a hue to categorize the data.

First, if I plot the strip plot alone as follows:

sns.stripplot(data=df, x="values", hue="category")

I get the following plot: stripplot alone

And, if I plot the KDE plot alone as follows:

sns.kdeplot(data=df, x="values", ax=ax)

I get the following plot: kde plot alone

However, if I try to combine both as follows:

fig, ax = plt.subplots(figsize=(16, 8))
sns.stripplot(data=df, x="values", hue="category")
sns.kdeplot(data=df, x="values", ax=ax)

I get the following plot: kde and strip plots together

For some reason, the KDE plot becomes inverted when plotted with the strip plot, and I am not sure why this is happening. So I have two problems I am trying to solve:

  1. Invert the KDE plot so that it looks the way it does when plotted alone.
  2. Scale the KDE plot (or strip plot) properly along the y-axis so that the data points can be well visualized with respect to the KDE plot.

First, I tried to change some parameters in the seaborn stripplot and kdeplot functions as follows:

sns.stripplot(data=df, x="values", hue="category", jitter=0.1, dodge=True, ax=ax)
sns.kdeplot(data=df, x="values", ax=ax, bw_adjust=-1)

And, it appears that setting bw_adjust to a negative number solves the first issue as you can see in the result below (even though I am not sure why this works) : first try

However, the data points of the strip plot are still not well represented and I am not sure how to scale the KDE plot properly so that it fits well with the strip plot data.

So, to try to solve this issue, I tried plotting the KDE plot manually (without seaborn):

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from scipy.stats import gaussian_kde


df = pd.read_csv("inputs/desired_data.csv")
fig, ax = plt.subplots(figsize=(16, 8))
# Strip plot
sns.stripplot(data=df, x="values", hue="category", jitter=0.1, dodge=True, ax=ax)

# Computing KDE manually
x = df["values"].values
kde = gaussian_kde(x, bw_method=1)
x_grid = np.linspace(x.min(), x.max(), 1000)
y = kde.evaluate(x_grid)

# Normalize and scale the KDE plot
y_scaled = y / y.max()  # Example normalization, adjust as needed

# Plot the scaled KDE
ax.plot(x_grid, y_scaled, color='blue')

And, with this code, I get the following plot: Plot with manual KDE 1

So I got better results in terms of visualizing the data points with stripplot. However, I got an inverted KDE plot again. And if I change bw_method=1 to bw_method=-1 in kde = gaussian_kde(x, bw_method=1), I get the following plot: Plot with manual KDE 2

So, the results get worse again. Ideally, I would like to obtain a plot like the one in the attached image.


Solution

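  • You are trying to plot a somewhat unusual combination. With no y variable given, stripplot treats the y-axis as categorical: it inverts the axis and sets its limits to -0.5 and 0.5. kdeplot, in contrast, sets the minimum of the y-axis to 0 (so the curve "sits" on the x-axis) and scales the height so that the area under the curve is 1. When both share one axis, the KDE is drawn on the inverted axis, which is why it appears upside down.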

    The easiest approach would be to use a twin axis. By drawing the kdeplot on the original axis, the left y-axis will show the height of the kdeplot. The stripplot can then go on the twin axis, which has its own independent y-axis.

    Here is some example code:

    import seaborn as sns
    import matplotlib.pyplot as plt

    iris = sns.load_dataset('iris')
    fig, ax = plt.subplots()
    # KDE on the original axis: the left y-axis shows the density
    sns.kdeplot(data=iris, x='sepal_width', ax=ax)
    # Strip plot on a twin axis that shares x but has its own y
    ax2 = ax.twinx()
    sns.stripplot(data=iris, x='sepal_width', hue='species', ax=ax2, palette='turbo', dodge=True)
    plt.tight_layout()
    plt.show()
    

    combining kdeplot and stripplot
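If a single axis is preferred over a twin axis, the explanation above suggests a possible alternative: read the y-limits that stripplot leaves on the axis and rescale a manually computed KDE into that range, so the curve's baseline lands on the bottom edge and its peak on the top edge whether or not the axis is inverted. The following is only a minimal sketch under assumptions: the dataframe is synthetic stand-in data (the column names "values" and "category" follow the question), the bandwidth is scipy's default, and the output file name is hypothetical. Note that the rescaled curve no longer has a meaningful density scale on the y-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.stats import gaussian_kde

# Synthetic stand-in for the asker's dataframe (assumption, not the real data)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "values": np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1.5, 100)]),
    "category": ["a"] * 100 + ["b"] * 100,
})

fig, ax = plt.subplots(figsize=(8, 4))
sns.stripplot(data=df, x="values", hue="category", dodge=True, ax=ax)

# Limits left behind by the strip plot; on an inverted axis bottom > top
bottom, top = ax.get_ylim()

# KDE computed manually with scipy's default bandwidth
x = df["values"].to_numpy()
kde = gaussian_kde(x)
x_grid = np.linspace(x.min(), x.max(), 500)
density = kde(x_grid)

# Map density 0 to the bottom edge and the density peak to the top edge
y_plot = bottom + (top - bottom) * density / density.max()
ax.plot(x_grid, y_plot, color="black")

# Restore the strip plot's limits in case adding the line rescaled the axis
ax.set_ylim(bottom, top)
fig.savefig("kde_on_stripplot.png")  # hypothetical output file name
```

Because the mapping uses whatever limits the axis reports, it does not depend on the exact values stripplot chooses. Multiplying the normalized density by a factor below 1 before mapping would leave headroom above the peak.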