python  python-ggplot  plotnine

How to add a legend in ggplot/plotnine for multiple curves


Here is the sample code I am using to plot two curves. How can I add a legend to the plot? I saw a post suggesting that I put color inside aes, but that throws this exception:

plotnine.exceptions.PlotnineError: "Could not evaluate the 'color' mapping: 'red' (original error: name 'red' is not defined)"

from plotnine import *
import numpy as np
import pandas as pd

str_metric = 'metric'
metric = np.array([0.127, 0.1715, 0.19166667, 0.21583333, 0.24866667, 0.24216667, 0.24433333,
                   0.255, 0.291, 0.30966667, 0.32033333, 0.2415, 0.33833333, 0.30583333, 0.34433333])

metric2 = metric * 2

iterations2 = [i for i in range(len(metric))]


df = pd.DataFrame({'iterations': iterations2,
                   str_metric: metric,
                   str_metric + '2': metric2})

p = ggplot(df, aes(x='iterations')) + geom_smooth(aes(y=metric), color='blue', show_legend=True, method='lm', span=0.10, se=True,
                                                  level=0.80) + geom_smooth(aes(y=metric2), color='red', show_legend=True, method='lm', span=0.10, se=True, level=0.80)
ggsave(p, filename='stackoverflow.png', path='plots/')

Solution

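  • You are going about it the wrong way. A legend is drawn only for aesthetics that are mapped inside aes(), and aes() evaluates a string as a column name or expression, which is why aes(color='red') fails with "name 'red' is not defined". Plotnine works best with tidy data, i.e., each variable is a column and each observation is a row. Otherwise, you will likely end up fighting with the plotting system.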

    from plotnine import *
    import numpy as np
    import pandas as pd
    
    str_metric = 'metric'
    metric = np.array([0.127, 0.1715, 0.19166667, 0.21583333, 0.24866667, 0.24216667, 0.24433333,
                       0.255, 0.291, 0.30966667, 0.32033333, 0.2415, 0.33833333, 0.30583333, 0.34433333])
    
    metric2 = metric * 2
    
    iterations2 = [i for i in range(len(metric))]
    
    # tidy data
    df = pd.DataFrame({
        'iterations': np.hstack([iterations2, iterations2]),
        'value': np.hstack([metric, metric2]),
        'type': np.repeat(['metric', 'metric2'], len(iterations2))   
    })
    
    p = (ggplot(df, aes(x='iterations', y='value', color='type'))
         + geom_smooth(method='lm', span=0.10, se=True, level=0.80)
         # Then you can change the colour using a scale
        )
    

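
If the data already lives in a wide DataFrame like the one in the question, the tidy layout does not have to be assembled by hand with `np.hstack` and `np.repeat`. A sketch using `DataFrame.melt`, where the names `wide` and `tidy` are only illustrative:

```python
import numpy as np
import pandas as pd

metric = np.array([0.127, 0.1715, 0.19166667, 0.21583333, 0.24866667,
                   0.24216667, 0.24433333, 0.255, 0.291, 0.30966667,
                   0.32033333, 0.2415, 0.33833333, 0.30583333, 0.34433333])

# Wide layout, as in the question: one column per curve
wide = pd.DataFrame({
    'iterations': np.arange(len(metric)),
    'metric': metric,
    'metric2': metric * 2,
})

# Long (tidy) layout: the former column names become values of 'type'
tidy = wide.melt(id_vars='iterations',
                 value_vars=['metric', 'metric2'],
                 var_name='type',
                 value_name='value')
```

The resulting `tidy` frame has the same three columns as `df` in the answer, so it can be passed to `ggplot(tidy, aes(x='iterations', y='value', color='type'))` unchanged.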