Tags: python, seaborn, scatter-plot, trendline, seaborn-objects

How do you plot a trendline on labelled scatterplot points with the seaborn objects interface?


I'm using the seaborn.objects interface to label plot points. However, I can't add a trendline if the labels are present.

Passing text='label' to so.Plot() (as in the first example) and also chaining .add(so.Line(color='orange'), so.PolyFit()) does not render the labels and the trendline together.

  1. Is there any way of having both present on the one plot?

  2. Furthermore, how could I plot an x=y line on either of these plots?

Plot with labelled plot points (working):

import seaborn.objects as so
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np

np.random.seed(42)
num_points = 10
df = pd.DataFrame({'x': np.random.randint(1, 100, size=num_points),
                   'y': np.random.randint(1, 100, size=num_points),
                   'label' : [chr(i + 65) for i in range(num_points)]})

fig, ax = plt.subplots()
p = so.Plot(data=df,
            x='x',
            y='y',
            text='label'
            ).add(so.Dot(marker='o')).add(so.Text(halign='left'))
p.on(ax).show()


Plot with trendline (working):

fig, ax = plt.subplots()
p = so.Plot(data=df,
            x='x',
            y='y',
            ).add(so.Dot(marker='o')).add(so.Line(color='orange'), so.PolyFit())
p.on(ax).show()


However, a plot with code for both labelled plot points and trendline only displays the former:

fig, ax = plt.subplots()
p = so.Plot(data=df,
            x='x',
            y='y',
            text='label',
            ).add(so.Dot(marker='o')).add(so.Text(halign='left')).add(so.Line(color='orange'), so.PolyFit())
p.on(ax).show()



Solution

  • In the example you provided, text='label' is set in so.Plot(), so the text mapping is inherited by every layer, including the Line/PolyFit layer, and the trendline is then not drawn. As far as I can tell, this is because a stat such as PolyFit groups the data by every mapped variable, so each unique label becomes its own one-point group and there is nothing to fit. If you instead set the text='label' mapping in the so.Text() layer, the other layers are unaffected. In other words, it is a small change to your code, where you move text='label' further down:

    fig, ax = plt.subplots()
    p = (
        so.Plot(data=df, x='x', y='y')
        .add(so.Dot(marker='o'))
        .add(so.Text(halign='left'), text='label')  # set the text mapping on this layer only
        .add(so.Line(color='orange'), so.PolyFit())
    )
    p.on(ax).show()
    


    Regarding your second question about adding an identity (x=y) line to the plot, I am not sure whether seaborn.objects offers a way to do it, but since you're already using matplotlib, here's how I would do it:

    fig, ax = plt.subplots()
    p = (
        so.Plot(data=df, x='x', y='y')
        .add(so.Dot(marker='o'))
        .add(so.Text(halign='left'), text='label')
        .add(so.Line(color='orange'), so.PolyFit())
        .on(ax)
    )
    p_axes = p.plot()._figure.gca()  # draw the plot, then get its matplotlib Axes (note: _figure is a private attribute)
    p_axes.plot(p_axes.get_xlim(), p_axes.get_xlim())  # identity line: the same limits for x and y give y = x
    
