python datetime matplotlib

matplotlib event.xdata out of timeseries range


I'm having an issue with matplotlib's event.xdata when plotting a pandas time series. I tried to reproduce the answer proposed in a closely related question, but I get very strange behavior.

Here's the code, adapted to Python 3, with a little more in the on_click() function:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import matplotlib.dates as mdates

def on_click(event):
    # event.xdata is None when the click lands outside the axes
    if event.inaxes is not None:
        # provide raw and converted x data
        print(f"{event.xdata} --> {mdates.num2date(event.xdata)}")
        # add a vertical line at clicked location
        line = ax.axvline(x=event.xdata)
        plt.draw()

t = pd.date_range('2015-11-01', '2016-01-06', freq='H')
y = np.random.normal(0, 1, t.size).cumsum()

df = pd.DataFrame({'Y':y}, index=t)

fig, ax = plt.subplots()
line = None
df.plot(ax=ax)
fig.canvas.mpl_connect('button_press_event', on_click)
plt.show()

When I run this, I get the following plot. The x axis shows the expected date range, Nov. 2015 to Jan. 2016. The cursor position shown in the window footer is also correct (here 2015-11-01 10:00), and the vertical lines land where I click:

[Image: GUI result of the matplotlib code]

However, the command-line output is as follows:

C:\Users\me\Documents\code\>python matplotlib_even.xdate_num2date.py
402189.6454115977 --> 1102-02-27 15:29:23.562039+00:00
402907.10400704964 --> 1104-02-15 02:29:46.209088+00:00

Those event.xdata values are clearly outside both the input data range and the x-axis range, so they are unusable later on (for example, to find the closest y value in the series).

So, does anyone know how I can get a correct xdata?


Solution

  • Something must have changed in the way matplotlib/pandas handles datetime info between the answer to the related question you linked and now. I cannot comment on why, but I found a solution to your problem.

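    I went digging through the code that shows the coordinates in the bottom-left status bar, and found that when you plot a time series, pandas patches the function that prints this info and replaces it with its own.

    From there, you can see that the float value has to be converted to a Period object: with pandas' time-series plotting, event.xdata is a period ordinal at the index frequency (for freq='H', hours since 1970-01-01), not the day count that mdates.num2date expects.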

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    
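    def on_click(event):
        # event.xdata is None when the click lands outside the axes
        if event.xdata is not None:
            # interpret the x value as an hourly period ordinal
            print(pd.Period(ordinal=int(event.xdata), freq='H'))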
    
    
    t = pd.date_range('2015-11-01', '2016-01-06', freq='H')
    y = np.random.normal(0, 1, t.size).cumsum()
    
    df = pd.DataFrame({'Y': y}, index=t)
    
    fig, ax = plt.subplots()
    df.plot(ax=ax)
    fig.canvas.mpl_connect('button_press_event', on_click)
    plt.show()
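
To go one step further toward what the question asks for (finding the closest y value), here is a minimal sketch that needs no GUI. It feeds the two raw values printed in the question through the same Period conversion and looks up the nearest row. The helper names xdata_to_timestamp and nearest_row are hypothetical, and pd.offsets.Hour() is used on the assumption that it is equivalent to the 'H' alias above (newer pandas versions spell that alias 'h').

```python
import numpy as np
import pandas as pd

# Offset object for hourly data; assumed equivalent to the 'H' alias above
HOURLY = pd.offsets.Hour()


def xdata_to_timestamp(xdata, freq=HOURLY):
    """Convert a pandas-plot x coordinate (a period ordinal) to a Timestamp.

    int() truncates, so the result is the start of the clicked hour.
    """
    return pd.Period(ordinal=int(xdata), freq=freq).to_timestamp()


def nearest_row(df, xdata, freq=HOURLY):
    """Return (timestamp, row) for the index entry closest to the clicked x."""
    ts = xdata_to_timestamp(xdata, freq)
    pos = df.index.get_indexer([ts], method="nearest")[0]
    return df.index[pos], df.iloc[pos]


# Same data as in the question
t = pd.date_range("2015-11-01", "2016-01-06", freq=HOURLY)
df = pd.DataFrame({"Y": np.random.normal(0, 1, t.size).cumsum()}, index=t)

# The two raw event.xdata values printed in the question
for raw in (402189.6454115977, 402907.10400704964):
    stamp, row = nearest_row(df, raw)
    print(f"{raw} --> {stamp}  Y={row['Y']:.3f}")
```

Both values now map to timestamps inside the plotted range (2015-11-18 21:00 and 2015-12-18 19:00). Inside on_click, you would call nearest_row(df, event.xdata) after checking that event.xdata is not None.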