Tags: python, pandas, matplotlib, boxplot, twinx

Matplotlib: Boxplot and bar chart shifted when overlaid using twinx


When I create a boxplot and overlay a bar chart using twinx, the boxes appear shifted one position to the right relative to the bars.

This problem has been identified before (Python pandas plotting shift x-axis if twinx two y-axes), but the solution no longer seems to work. (I am using Matplotlib 3.1.0)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

li_str = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

df = pd.DataFrame([[i]+j[k] for i,j in {li_str[i]:np.random.randn(j,2).tolist() for i,j in \
    enumerate(np.random.randint(5, 15, len(li_str)))}.items() for k in range(len(j))]
    , columns=['A', 'B', 'C'])

fig, ax = plt.subplots(figsize=(16,6))
ax2 = ax.twinx()
df_gb = df.groupby('A').count()
p1 = df.boxplot(ax=ax, column='B', by='A', sym='')
p2 = df_gb['B'].plot(ax=ax2, kind='bar', figsize=(16,6)
    , colormap='Set2', alpha=0.3, secondary_y=True)
plt.ylim([0, 20])

The problematic chart

The output shows the boxes shifted one position to the right relative to the bars. The answerer of the previous post rightly pointed out that the tick locations of the bars are zero-based while the tick locations of the boxes are one-based, which causes the shift. However, the plt.bar() call that the answerer used to fix it now raises an error, because the x parameter has become mandatory. Even if x is provided, the call still fails because the 'left' parameter no longer exists.

df.boxplot(column='B', by='A')
plt.twinx()
plt.bar(left=plt.xticks()[0], height=df.groupby('A').count()['B'],
  align='center', alpha=0.3)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-186-e257461650c1> in <module>
     26 plt.twinx()
     27 plt.bar(left=plt.xticks()[0], height=df.groupby('A').count()['B'],
---> 28         align='center', alpha=0.3)

TypeError: bar() missing 1 required positional argument: 'x'

In addition, I would much prefer a fix using the object-oriented approach with reference to the axes, because I want to place the chart into an interactive ipywidget.

Here is the ideal chart:

Ideal chart

Many thanks.


Solution

  • df.boxplot draws the boxes at x = 1, 2, ..., N, whereas the pandas bar plot starts at x = 0. To line them up, draw the bars directly with ax2.bar and pass one-based x-values: range(1, len(df_gb['B'])+1).

    fig, ax = plt.subplots(figsize=(8, 4))
    ax2 = ax.twinx()
    df_gb = df.groupby('A').count()
    df.boxplot(column='B', by='A', ax=ax)
    ax2.bar(range(1, len(df_gb['B'])+1), height=df_gb['B'], align='center', alpha=0.3)
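
    For reference, here is a self-contained sketch of the same fix that can be run on its own. The sample data, the variable names (counts, positions, bar_centers) and the Agg backend are assumptions made for illustration and are not part of the question. It stays with the object-oriented API, so fig can be handed to whatever container displays it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: three groups with different numbers of rows.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": ["one"] * 5 + ["two"] * 8 + ["three"] * 6,
    "B": rng.normal(size=19),
})

fig, ax = plt.subplots(figsize=(8, 4))
ax2 = ax.twinx()

# groupby sorts the group keys, the same order boxplot(by=...) uses.
counts = df.groupby("A")["B"].count()

# The boxes are drawn at x = 1..N (one-based positions).
df.boxplot(column="B", by="A", ax=ax)

# Give the bars the same one-based positions so they sit under the boxes.
positions = np.arange(1, len(counts) + 1)
bars = ax2.bar(positions, counts.to_numpy(), align="center", alpha=0.3)

# Centre of each bar, to check the alignment numerically.
bar_centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
```

    With align='center', each bar is centred on its x-value, so bar_centers should come out as 1, 2, ..., N, matching the box positions.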
    
