python matplotlib seaborn kdeplot

Unable to read data from kdeplot


I have a pandas DataFrame named df with two columns, A and B.

I am trying to plot a KDE for each value of B like so:

import seaborn as sbn, numpy as np, pandas as pd, matplotlib.pyplot as plt

fig = plt.figure(figsize=(15, 7.5))
sbn.kdeplot(data=df, x="A", hue="B", fill=True)
fig.savefig("test.png")

I read the following suggestions, but only the ones where I compute the KDE from scratch using statsmodels or some other module get me anywhere: "Seaborn/Matplotlib: how to access line values in FacetGrid?" and "Get data points from Seaborn distplot".

For curiosity's sake, I would like to know why I am unable to get something from the following code:

kde = sbn.kdeplot(data=df, x="A", hue="B", fill=True)
line = kde.lines[0]
x, y = line.get_data()
print(x, y)

The error I get is IndexError: list index out of range. kde.lines has a length of 0.

Accessing the lines through fig.axes[0].lines[0] also raises an IndexError.

All in all, I think I tried everything proposed in the previous threads. I also tried switching from kdeplot to displot (displot, not distplot, because distplot is deprecated), but it is the same story, except that I have to access the axes differently. Every time I get to .get_lines(), ax.lines, etc., what is returned is an empty list, so I can't get any values out of it.

EDIT: Reproducible example

import pandas as pd, numpy as np, matplotlib.pyplot as plt, seaborn as sbn

# 1. Generate random data
# (DataFrame.append was removed in pandas 2.0, so collect the rows in a list first)
rows = []

for i in [1, 2, 3, 5, 7, 8, 10, 12, 15, 17, 20, 40, 50]:
    for _ in range(10):
        rows.append({"A": np.random.random() * i, "B": i})

df = pd.DataFrame(rows, columns=["A", "B"])

# 2. Plot data
fig = plt.figure(figsize=(15, 7.5))
sbn.kdeplot(data=df, x="A", hue="B", fill=True)

# 3. Read data (error)
ax = fig.axes[0]
x, y = ax.lines[0].get_data()
print(x, y)

Solution

  • This happens because using fill=True changes the object that matplotlib draws.

    When no fill is used, lines are plotted:

    fig = plt.figure(figsize=(15, 7.5))
    ax = sbn.kdeplot(data=df, x="A", hue="B")
    print(ax.lines)
    # [<matplotlib.lines.Line2D object at 0x000001F365EF7848>, etc.]
    

    When you use fill=True, the curves are drawn as PolyCollection objects instead, so ax.lines is empty:

    fig = plt.figure(figsize=(15, 7.5))
    ax = sbn.kdeplot(data=df, x="A", hue="B", fill=True)
    print(ax.collections)
    # [<matplotlib.collections.PolyCollection object at 0x0000016EE13F39C8>, etc.]
    

    You could draw the kdeplot a second time, but with fill=False, so that you have access to the Line2D objects and can call get_data() on them.