I'm working on creating a time series box plot that spans multiple years, and I want the x-axis to display only the month and year for the 1st of each month. I would like to limit the labels to every month, or maybe every couple of months.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import matplotlib.dates as mdates
import random
Dates = pd.date_range(start="2020-01-01", end="2020-02-15", freq='D').date
data = {'Date': [], 'Values': []}
for d in Dates:
    data['Date'].extend([d] * 10)
    data['Values'].extend(random.sample(range(1, 101), 10))
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
sns.boxplot(x='Date', y='Values', data=df)
plt.xticks(rotation=45)
plt.title('Time Series Box Plot')
plt.ylabel('Values')
plt.xlabel('Date')
plt.tight_layout()
plt.show()
I tried adding this to the end, but for some reason it makes the year 1970.
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()
You were almost there; you just need one more instruction:
import matplotlib.dates as mdates
mdates.set_epoch('2020-01-01T00:00:00')
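This happens because Matplotlib's date epoch (the date that corresponds to the number 0) defaults to 1970-01-01. Seaborn's boxplot treats the x values as categories and draws the boxes at positions 0, 1, 2, and so on, so the date locator and formatter read those positions as days since the epoch. Moving the epoch to the first date in your data makes position 0 correspond to 2020-01-01; note that this only lines up because the data has exactly one box per consecutive day.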
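Put the set_epoch call at the start of your code, before anything is plotted: Matplotlib raises a RuntimeError if the epoch is changed after dates have already been converted. For the same reason, if you are in Jupyter you will need to restart your kernel first.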
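If you would rather not change the global epoch, another option is to place the ticks yourself. Here is a minimal sketch, assuming the boxes are drawn in date order at positions 0, 1, 2, and so on (as with the data in the question). `month_start_ticks` is a hypothetical helper written for this answer, not part of Matplotlib or seaborn:

```python
from datetime import date, timedelta


def month_start_ticks(dates, every=1, fmt="%b %Y"):
    """Return (positions, labels) for the boxes that fall on the 1st of a month.

    `dates` must be the unique dates in the same order as the boxes on the
    categorical axis (box i sits at x = i). `every` keeps one label out of
    every `every` month starts, e.g. every=2 labels every other month.
    """
    # Remember the box index of each date that is the first day of a month.
    starts = [(i, d) for i, d in enumerate(dates) if d.day == 1]
    kept = starts[::every]
    positions = [i for i, _ in kept]
    labels = [d.strftime(fmt) for _, d in kept]
    return positions, labels


# Same range as in the question: 2020-01-01 to 2020-02-15, one box per day.
days = [date(2020, 1, 1) + timedelta(days=n) for n in range(46)]
positions, labels = month_start_ticks(days)
print(positions, labels)

# Assumed usage with the question's plot (needs pandas/seaborn/matplotlib):
#   dates = list(df['Date'].drop_duplicates().sort_values())
#   positions, labels = month_start_ticks(dates)
#   ax = sns.boxplot(x='Date', y='Values', data=df)
#   ax.set_xticks(positions)
#   ax.set_xticklabels(labels, rotation=45)
```

This version does not depend on the dates being consecutive, since it looks up the actual box index of each month start. Use it instead of the MonthLocator/DateFormatter lines, not together with them.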