I've also opened an issue in the seaborn git repo here. However, rather than a bug, it's quite possible that I'm making some basic error that I haven't figured out yet.
Question: The same color is assigned to two different values in the legend. How can I prevent this from happening as I scale up to more graphs that each contain only a subset of the values assigned to 'hue'?
The code to reproduce:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sna = pd.DataFrame(
    {
        'x': np.tile(range(10), 5),
        'y': list(range(10)) + list(range(2, 12)) + list(range(4, 14)) + list(range(6, 16)) + list(range(8, 18)),
        'id_for_hue': ['foo']*10 + ['bar']*10 + ['baz']*10 + ['bar']*10 + ['baz']*10,
        'id_for_graph': ['a']*30 + ['b']*20
    }
)
g = sns.FacetGrid(sna, col='id_for_graph')
g.map_dataframe(
    sns.lineplot,
    x='x',
    y='y',
    hue='id_for_hue'
)
g.add_legend()
On the other hand, if I sort the dataframe by 'id_for_hue', so that the values from the second plot appear at the top, the problem is resolved.
g = sns.FacetGrid(sna.sort_values(['id_for_hue']), col='id_for_graph')
g.map_dataframe(
    sns.lineplot,
    x='x',
    y='y',
    hue='id_for_hue'
)
g.add_legend()
Versions:
Maybe unsurprisingly, I should have read the documentation more closely. As mentioned in the lineplot() documentation, "Using relplot() is safer than using FacetGrid directly, as it ensures synchronization of the semantic mappings across facets". In fact, the FacetGrid documentation points out that "in most cases you will want to use figure-level functions (e.g. displot(), relplot()) to make the plots".
So, use relplot(), e.g.:
sns.relplot(
    data=sna,
    col='id_for_graph',
    kind='line',
    x='x',
    y='y',
    hue='id_for_hue'
)
Edit: Some additional information from mwaskom that might be helpful for others:
yes, this is exactly the kind of situation that the warning in the FacetGrid docstring is about:
relplot is the recommended approach here; if you really needed to make your plot by initializing the facet grid (and couldn't just tweak the object that relplot returns), you would have a few options:
- set a categorical dtype on your hue variable
- pass hue_order to lineplot
- pass a dict of colors to palette