python pandas seaborn legend hue

Seaborn kdeplot: inconsistent hue


I have a pandas dataframe which contains information about distances moved by men and women in different provinces. Apart from an id and the distance, there is also a column for their gender in numerical form (0=men, 1=women, 2=unknown), and a gender label for the legend ('gender_legend' with 'male' and 'female').

I'm trying to plot the relative densities for men and women for each province, and I observed some annoying behaviour: sometimes, the plot for men is drawn in blue and the one for women in orange and sometimes the other way around, with the legends sometimes starting with men and sometimes starting with women (see images). Does anybody have any idea why this is the case, and how to force seaborn to always use the same color for the same gender?

Additionally, if anyone knows how to remove the legend title (here: 'gender_legend'), I'd appreciate this, too. I've already unsuccessfully tried these options.

for province in provinces:
    fig, ax = plt.subplots()
    sns.kdeplot(data=df[(df['gender'] != 2) & (df['province'] == province)], x='distance', hue='gender_legend', ax=ax)
    ax.set(xlabel='Distance (km)', ylabel='density', title=province)
    plt.show()

Image 1: men=blue; Image 2: men=orange


Solution

  • # Sort once, before the loop, by the numeric gender code so 'male' rows (0) come first
    df = df.sort_values(by='gender', ascending=True)

    for province in provinces:
        fig, ax = plt.subplots()
        # Keep only rows with a known gender (0 or 1) in this province
        subset = df[(df['gender'] != 2) & (df['province'] == province)]
        # legend=False removes the whole legend
        sns.kdeplot(data=subset, x='distance', hue='gender_legend', ax=ax, legend=False)
        ax.set(xlabel='Distance (km)', ylabel='density', title=province)
        plt.show()
    

    Answer explanation:

    1. Seaborn assigns hue colors in the order in which the labels first appear in the data, so whichever gender is in the top row of the filtered dataframe gets blue and is listed first in the legend. That top row differs from province to province, which is why the colors swap. Sorting by gender puts the male rows first every time, so male is always drawn in blue.
    2. The answer you linked explains how to remove the legend title, not the legend itself. To remove the legend entirely, pass legend=False to kdeplot.