python seaborn kdeplot seaborn-objects

Use seaborn object interface to plot overlapping density plots, added inside a for loop, each having its own color/label shown in a legend


Using the seaborn Python library, I am trying to draw several overlapping density plots in the same figure, and I want to color and label each line. With the seaborn objects interface I can draw the density plots within a for loop, but I cannot add a color/label to each one.

I understand that there are other ways, e.g., first creating a DataFrame with all the data and the corresponding labels and then passing it to so.Plot(). But I was wondering whether the code below (using the seaborn objects interface) could work with some modifications. Please advise.

Code:

Here I am setting color=s_n, where s_n is the number of samples that I drew from the normal distribution. I want to label each density plot with its number of samples (please also see the desired plot towards the end of the post).

import scipy.stats as st
import seaborn.objects as so

num_samples = 2000
normal_distr = st.norm(1,1)

sp = so.Plot()

for s_n in range(10,num_samples,400):
    sample_normal = normal_distr.rvs(s_n)
    sp = sp.add(so.Line(),so.KDE(),x=sample_normal,color=s_n)
sp.show()

The plot looks like this; the density lines are not colored or labeled separately.

KDE plot without individual color for each density

If I use seaborn's kdeplot directly, I get the desired plot (below). But I was wondering whether I can use seaborn objects instead of kdeplot.

Code using kdeplot:

import scipy.stats as st
import seaborn as sns
import matplotlib.pyplot as plt

num_samples = 2000
normal_distr = st.norm(1,1)

for s_n in range(10,num_samples,400):
    sample_normal = normal_distr.rvs(s_n)
    sns.kdeplot(x=sample_normal, label=s_n)    
plt.legend()

The (desired) plot:

KDE plot with individual color for each density


Solution

  • I think the trick here is to prepare your DataFrame so that you can forgo the loop and use the color kwarg as it's meant to be used, i.e. as the name of a column that identifies the group each row belongs to:

    import scipy.stats as st
    import seaborn.objects as so
    import pandas as pd
    
    num_samples = 2000
    normal_distr = st.norm(1,1)
    
    df = pd.concat([
        pd.DataFrame(
            {'sn': str(s_n),
            'values': normal_distr.rvs(s_n)}
            )
        for s_n in range(10,num_samples,400)
    ])
    

    The resulting DataFrame looks like this:

            sn    values
    0       10  0.976926
    1       10 -0.501831
    2       10  1.748071
    3       10  0.968493
    4       10  0.593531
    ...    ...       ...
    1605  1610  0.311484
    1606  1610  1.332424
    1607  1610  1.531519
    1608  1610  1.240953
    1609  1610 -0.793144
    

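    Then plotting can be done in a single line. Passing common_norm=False normalizes each density independently, so each curve integrates to 1 on its own instead of being scaled by its group's share of the observations: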

    so.Plot(df, x='values').add(so.Line(), so.KDE(common_norm=False), color='sn').show()
    

    Output:

    (image: KDE plot with one colored density line per sn value)
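
The long-form approach above can also be wrapped into two small helpers, which may be convenient if the sample sizes change often. This is only a sketch under some assumptions: the names `build_long_frame` and `plot_densities` are made up for illustration, NumPy's random generator stands in for `scipy.stats.norm(...).rvs`, and seaborn 0.12 or later is assumed for the objects interface.

```python
import numpy as np
import pandas as pd


def build_long_frame(sample_sizes, loc=1.0, scale=1.0, seed=0):
    """Stack one normal sample per size into a long-form DataFrame.

    Hypothetical helper: `sn` holds the sample size as a string,
    `values` holds the draws.
    """
    rng = np.random.default_rng(seed)
    frames = [
        pd.DataFrame({
            # A string label keeps the color mapping categorical,
            # so each sample size gets its own legend entry.
            "sn": str(n),
            "values": rng.normal(loc, scale, size=n),
        })
        for n in sample_sizes
    ]
    # ignore_index avoids the repeated 0..n-1 index from each frame.
    return pd.concat(frames, ignore_index=True)


def plot_densities(df):
    """Return a seaborn objects Plot with one KDE line per `sn` group."""
    import seaborn.objects as so  # imported here so the data step runs without seaborn

    return (
        so.Plot(df, x="values", color="sn")
        .add(so.Line(), so.KDE(common_norm=False))
        .label(color="Number of samples")
    )


df = build_long_frame(range(10, 2000, 400))
```

Calling `plot_densities(df).show()` should then draw the figure. The loop over sample sizes still exists, but it only builds the data; the plot itself is declared once, which is how the `color` mapping is designed to be used.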