python pandas matplotlib

Plot timedelta in matplotlib


I'm reading in data that has a year and a time (duration) for each row, and I want to plot the year on the x axis against the duration on the y axis. The y axis should be formatted as HH:MM:SS, but I can't figure out how to do it. Here's my code (the data is synthesized; the real data set is much larger).

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num

df = pd.DataFrame({'Year':[2010, 2011, 2012], 
                   'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])

fig, ax = plt.subplots()

myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)

ax.plot(df['Year'], df['Time'].dt.total_seconds())

plt.gcf().autofmt_xdate()

plt.show()

If I don't convert to total_seconds(), I get an error. With the conversion, the total-seconds values seem to be interpreted as days. I tried dividing total_seconds() by 24*60*60, but that gave me a message about a 0 date. I can't get date2num to work for me either.

I've checked previous similar questions, but the code in their answers no longer works.

Does anyone know how to plot Pandas timedeltas in matplotlib?


Solution

  • Convert the timedeltas to datetimes with pd.to_datetime(). Every value will be given a date in 1970, but that doesn't matter if all you want is to plot and display the time part. You also need to drop the .dt.total_seconds() call from ax.plot().

    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib.dates import DateFormatter
    
    df = pd.DataFrame({'Year': [2010, 2011, 2012],
                       'Time': ['2:19:15', '2:11:16', '2:20:17']})
    # Parse the strings as durations, then reinterpret them as datetimes
    # (each value ends up with a date in 1970).
    df['Time'] = pd.to_timedelta(df['Time'])
    df['Time'] = pd.to_datetime(df['Time'])
    fig, ax = plt.subplots()
    
    # Show only the time part on the y axis.
    myFmt = DateFormatter("%H:%M:%S")
    ax.yaxis.set_major_formatter(myFmt)
    
    # Plot the datetime column directly, without .dt.total_seconds().
    ax.plot(df['Year'], df['Time'])
    
    plt.gcf().autofmt_xdate()
    
    plt.show()
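
Depending on the pandas version, passing a timedelta column straight to pd.to_datetime() may raise a TypeError instead of converting it. If that happens, a minimal sketch of an equivalent conversion is to add the durations to an explicit epoch timestamp. The helper name durations_to_datetimes is hypothetical (it is not part of pandas), and the choice of 1970-01-01 as the anchor date is an assumption made to match the result described above.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter


def durations_to_datetimes(durations):
    """Anchor each duration to 1970-01-01 so it can be drawn on a date axis.

    Hypothetical helper for illustration; not a pandas function.
    """
    return pd.to_timedelta(durations) + pd.Timestamp("1970-01-01")


df = pd.DataFrame({'Year': [2010, 2011, 2012],
                   'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = durations_to_datetimes(df['Time'])

fig, ax = plt.subplots()
ax.plot(df['Year'], df['Time'])
# Set the formatter after plotting so only the time part is displayed.
ax.yaxis.set_major_formatter(DateFormatter("%H:%M:%S"))

if __name__ == "__main__":
    plt.show()
```

The plot should look the same as with pd.to_datetime(), since both approaches place the durations on 1 January 1970 and the formatter hides the date.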
    

    [Image: plot result]
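
An alternative that skips the detour through datetimes is to keep plotting the total seconds and only change how the tick labels are rendered. The sketch below uses matplotlib.ticker.FuncFormatter for this. The helper name format_hms is hypothetical, and it assumes the durations are non-negative. Unlike the %H directive, it does not wrap around after 24 hours.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def format_hms(seconds, pos=None):
    """Render a non-negative number of seconds as H:MM:SS.

    Hypothetical helper; the (value, pos) signature is what FuncFormatter expects.
    """
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


df = pd.DataFrame({'Year': [2010, 2011, 2012],
                   'Time': ['2:19:15', '2:11:16', '2:20:17']})
# Keep the y values numeric (seconds); only the labels are formatted.
seconds = pd.to_timedelta(df['Time']).dt.total_seconds()

fig, ax = plt.subplots()
ax.plot(df['Year'], seconds)
ax.yaxis.set_major_formatter(FuncFormatter(format_hms))

if __name__ == "__main__":
    plt.show()
```

With this approach the ticks fall on round numbers of seconds, so labels such as 2:13:20 (8000 seconds) are possible. If ticks on whole minutes are preferred, setting a locator such as matplotlib.ticker.MultipleLocator(300) on the y axis is one option.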