Tags: python, pandas, dataframe, matplotlib, multi-index

Plotting vertical lines on a pandas line plot with a MultiIndex x-axis


I have a DataFrame whose index is a MultiIndex: level 0 is the date and level 1 is the rank. Rank starts at 0 and ends at 100, but the number of ranks in between varies from date to date, as in the sample below:

import pandas as pd

dx = pd.DataFrame({
    "date": [
        pd.to_datetime('2025-02-24'), pd.to_datetime('2025-02-24'), pd.to_datetime('2025-02-24'), pd.to_datetime('2025-02-24'),
        pd.to_datetime('2025-02-25'), pd.to_datetime('2025-02-25'), pd.to_datetime('2025-02-25'), 
        pd.to_datetime('2025-02-26'), pd.to_datetime('2025-02-26'), pd.to_datetime('2025-02-26'), pd.to_datetime('2025-02-26'), pd.to_datetime('2025-02-26')
    ],
    "rank": [0.0, 1.0, 2.0, 100.0, 0.0, 1.0, 100.0, 0.0, 1.0, 2.0, 3.0, 100.0],
    "value": [2.3, 2.5, 2.4, 2.36, 2.165, 2.54, 2.34, 2.12, 2.32, 2.43, 2.4, 2.3]
})

dx.set_index(["date", "rank"], inplace=True)

I want to plot this DataFrame, and dx.plot() works fine, creating a reasonable x-axis. However, I want to add a grid or vertical lines at every rank=1 and, in a different color, at every rank=100.

I tried this:


import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(30, 5))

dx.plot(ax=axes[0])
axes[0].tick_params('x', labelrotation=90)
xs = [x for x in dx.index if x[1]==0]

for xc in xs:
    axes[0].axvline(x=xc, color='blue', linestyle='-')

but I get this error:

ConversionError: Failed to convert value(s) to axis units: (Timestamp('2025-02-24 00:00:00'), 0.0)

I also want to show x labels only for rank=0, not for every point. Currently, setting the label rotation to 90 happens to produce that, but I am not sure it is the best way to guarantee it.

axes[0].tick_params('x', labelrotation=90)

So I am looking for two answers:

  1. How do I draw vertical lines at specific points with this type of MultiIndex?
  2. How do I ensure that only certain x labels show on the chart?

Solution

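  • The line is not drawn against the (date, rank) tuples themselves. Because the MultiIndex is neither numeric nor datetime-like, pandas plots the values against integer positions 0, 1, ..., len(dx) - 1 "under the hood" and uses the index only for the tick labels, which is why Axes.axvline cannot convert a tuple. For a line plot, matplotlib then picks a "reasonable" step for the ticks: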

    dx.plot(ax=axes[0])
    axes[0].get_xticks()
    
    # array([-2.,  0.,  2.,  4.,  6.,  8., 10., 12.])
    

    With a bar plot, you would get the more logical one tick per row:

    dx.plot.bar(ax=axes[0])
    axes[0].get_xticks()
    
    # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
    

    You can use Axes.set_xticks and Axes.set_xticklabels to fix this. E.g.,

    ticks = range(len(dx))
    
    # only label at rank 0
    labels = [f"{x[0].strftime('%Y-%m-%d')}, {int(x[1])}" 
              if x[1] == 0 else '' for x in dx.index]
    
    axes[0].set_xticks(ticks=ticks)
    axes[0].set_xticklabels(labels=labels, rotation=90)
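
As a quick check of the claim that the line is drawn against integer positions (which is why the ticks above are simply range(len(dx))), the sketch below rebuilds the sample frame and inspects the x data of the plotted line. It also looks up the position of a single (date, rank) entry with MultiIndex.get_loc, assuming the index is unique as it is here. The "Agg" backend and the variable names xdata and pos are choices made for this illustration, not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the sample frame with the same (date, rank) MultiIndex as dx.
dates = pd.to_datetime(["2025-02-24"] * 4 + ["2025-02-25"] * 3 + ["2025-02-26"] * 5)
ranks = [0.0, 1.0, 2.0, 100.0, 0.0, 1.0, 100.0, 0.0, 1.0, 2.0, 3.0, 100.0]
values = [2.3, 2.5, 2.4, 2.36, 2.165, 2.54, 2.34, 2.12, 2.32, 2.43, 2.4, 2.3]
dx = pd.DataFrame({"date": dates, "rank": ranks, "value": values})
dx = dx.set_index(["date", "rank"])

fig, ax = plt.subplots()
dx.plot(ax=ax)

# x data of the plotted line: integer positions, not (date, rank) tuples.
xdata = [int(x) for x in ax.get_lines()[0].get_xdata()]

# Integer position of one (date, rank) entry; this is a value axvline accepts.
pos = dx.index.get_loc((pd.Timestamp("2025-02-25"), 0.0))
ax.axvline(x=pos, color="blue", linestyle="-")
```

Passing pos instead of the tuple avoids the ConversionError from the question, because the x-axis only knows about the integer positions.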
    

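    It's easier to see now that Axes.axvline needs the integer positions of the matching index entries, not the index tuples. We can get them by comparing Index.get_level_values('rank') with the wanted rank, applying np.nonzero to the resulting boolean array, and then adding the lines in a loop: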

    import numpy as np
    
    def add_vlines(ax, rank, color):
        # positions (0-based row numbers) where the 'rank' level equals `rank`
        indices = np.nonzero(dx.index.get_level_values('rank') == rank)[0]
        for index in indices:
            ax.axvline(x=index, color=color, linestyle='dotted')
    
    add_vlines(axes[0], 1, 'blue')
    add_vlines(axes[0], 100, 'red')
    

    Output:

    [image: resulting plot]