python seaborn jupyter

Superimposing plots in seaborn causes x-axis to misalign


I am having an issue trying to superimpose plots with seaborn. I am able to generate the two plots separately as follows:

fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))

sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)

sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax2,width=0.4)

The output looks like this:

[Image: plots not superimposed]

But when I try to superimpose both plots by assigning both to the same ax object, the x-axes no longer line up:

fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))

sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)

sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax1,width=0.4)

[Image: plots superimposed]

I am not able to identify why the x-axis of the line plot changes when superimposing both plots (the x-axes of both plots go from 0 to 0.069).

My goal is to superimpose both plots while keeping the same x-axis range.


Solution

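  • Seaborn's boxplot creates a categorical x-axis, with all boxes evenly spaced. Internally the x positions are numbered 0, 1, 2, ..., while the tick labels show the values from 0 to 0.069. The line plot uses the numeric values themselves, so on a shared axis its whole 0 to 0.069 range is squeezed in near position 0.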

    To combine a line plot with a boxplot, matplotlib's boxplot can be called directly, so that positions and widths can be set explicitly. With patch_artist=True, each box is drawn as a filled patch (instead of just lines), for which a facecolor can be given. manage_ticks=False prevents boxplot from changing the x ticks and the axis limits. Optionally, notch=True would accentuate the median a bit more, but depending on the data, the confidence interval might be too large and look weird.

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import seaborn as sns
    
    # Synthetic stand-in for the question's data: a random walk over pct_gc
    data1 = pd.DataFrame({'pct_gc': np.linspace(0, 0.069, 200), 'MSE': np.random.normal(0.02, 0.1, 200).cumsum()})
    # Bin pct_gc into 10 equal-width intervals; each interval gets one box
    data1['pct_range'] = pd.cut(data1['pct_gc'], 10)
    
    fig, ax1 = plt.subplots(ncols=1, figsize=(20, 7))
    
    sns.lineplot(data=data1, y='MSE', x='pct_gc', ax=ax1)
    
    # Draw each box at the numeric midpoint of its interval, on the line plot's own x scale
    for interval, color in zip(np.unique(data1['pct_range']), plt.cm.tab10.colors):
        ax1.boxplot(data1[data1['pct_range'] == interval]['MSE'],
                    positions=[interval.mid], widths=0.4 * interval.length,
                    patch_artist=True, boxprops={'facecolor': color},
                    notch=False, medianprops={'color':'yellow', 'linewidth':2},
                    manage_ticks=False)
    plt.show()
    

    [Image: example plot]
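
The two claims in this answer can be checked with a small script. It is only a sketch: the toy DataFrame and the helper names seaborn_box_positions and xlim_after_boxplot are made up for illustration and are not part of the question's data or of either library. The first helper reads back the x positions seaborn assigns to the boxes. The second compares the x limits after a matplotlib boxplot with and without manage_ticks. The exact limits depend on the matplotlib version, so none are quoted here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data: three distinct pct_gc values, three observations each
toy = pd.DataFrame({
    "pct_gc": [0.01] * 3 + [0.03] * 3 + [0.069] * 3,
    "MSE": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0],
})


def seaborn_box_positions(df):
    """Return the internal x tick positions seaborn assigns to the boxes."""
    fig, ax = plt.subplots()
    sns.boxplot(x="pct_gc", y="MSE", data=df, ax=ax)
    positions = [float(t) for t in ax.get_xticks()]
    plt.close(fig)
    return positions


def xlim_after_boxplot(manage_ticks):
    """Plot a line over 0..0.069, add one matplotlib box, return the x limits."""
    fig, ax = plt.subplots()
    ax.plot([0.0, 0.069], [0.0, 5.0])
    ax.boxplot([1.0, 2.0, 3.0, 4.0], positions=[0.03], widths=0.005,
               manage_ticks=manage_ticks)
    limits = ax.get_xlim()
    plt.close(fig)
    return limits


box_positions = seaborn_box_positions(toy)
print(box_positions)                           # category positions, not pct_gc values
print(xlim_after_boxplot(manage_ticks=False))  # stays close to the line's own range
print(xlim_after_boxplot(manage_ticks=True))   # widened around the box position
```

With manage_ticks=True, matplotlib extends the limits to at least half a unit on either side of each box position, which swamps data that only spans 0 to 0.069. That is why the answer switches it off.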