I did a time series forecasting analysis with ExponentialSmoothing in Python, using statsmodels.tsa.holtwinters.
import matplotlib.pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing

model = ExponentialSmoothing(df, seasonal='mul', seasonal_periods=12).fit()
pred = model.predict(start=df.index[0], end=122)
plt.plot(df_fc.index, df_fc, label='Train')
plt.plot(pred.index, pred, label='Holt-Winters')
plt.legend(loc='best')
I want to get confidence intervals for the model's results, but I couldn't find any function for this in "statsmodels.tsa.holtwinters - ExponentialSmoothing". How do I do that?
According to this answer in a GitHub issue, you should be using the new ETSModel class rather than the old ExponentialSmoothing, which is still present for compatibility.
ETSModel includes more parameters and more functionality than ExponentialSmoothing.
To calculate confidence intervals, I suggest using the simulate method of ETSResults:
from statsmodels.tsa.exponential_smoothing.ets import ETSModel
import pandas as pd

# Build model.
ets_model = ETSModel(
    endog=y,  # y should be a pd.Series
    seasonal='mul',
    seasonal_periods=12,
)
ets_result = ets_model.fit()

# Simulate predictions.
n_steps_prediction = y.shape[0]
n_repetitions = 500
df_simul = ets_result.simulate(
    nsimulations=n_steps_prediction,
    repetitions=n_repetitions,
    anchor='start',
)

# Calculate confidence intervals (10% and 90% quantiles, i.e. an 80% interval).
upper_ci = df_simul.quantile(q=0.9, axis='columns')
lower_ci = df_simul.quantile(q=0.1, axis='columns')
Basically, by calling the simulate method you get a DataFrame with n_repetitions columns and n_steps_prediction rows (in this case, the same number of items as in your training data set y).
Then you calculate the confidence intervals with the DataFrame quantile method (remember the axis='columns' option).
You could also calculate other statistics from df_simul.
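To make the "other statistics" point concrete, here is a minimal sketch that uses a randomly generated stand-in for df_simul (the values and dates are made up, not the output of a fitted model). It only illustrates that every row-wise reduction over the simulated paths uses the same axis='columns' pattern.

```python
import numpy as np
import pandas as pd

# Stand-in for the output of ets_result.simulate(...): one row per time step,
# one column per simulated path. The values are random, not a real fit.
rng = np.random.default_rng(0)
n_steps, n_repetitions = 24, 500
df_simul = pd.DataFrame(
    100 + rng.normal(scale=5, size=(n_steps, n_repetitions)),
    index=pd.date_range("2020-01-01", periods=n_steps, freq="MS"),
)

# axis="columns" reduces across the simulated paths, giving one value per step.
summary = pd.DataFrame({
    "lower": df_simul.quantile(q=0.1, axis="columns"),
    "median": df_simul.median(axis="columns"),
    "mean": df_simul.mean(axis="columns"),
    "upper": df_simul.quantile(q=0.9, axis="columns"),
    "std": df_simul.std(axis="columns"),
})
print(summary.head())
```

Changing q to 0.025 and 0.975 would give a 95% interval instead of the 80% one.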
I also checked the source code: simulate is internally called by the forecast method to predict steps in the future. So you can also predict future steps and their confidence intervals with the same approach: just use anchor='end', so that the simulations start from the last step in y.
To be fair, there is also a more direct approach to calculating the confidence intervals: the get_prediction method (which uses simulate internally). But I do not really like its interface; it is not flexible enough for me, and I did not find a way to specify the desired confidence intervals. The approach with the simulate method is pretty easy to understand and very flexible, in my opinion.
If you want further details on how this kind of simulation is performed, read this chapter from the excellent Forecasting: Principles and Practice online book.
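To give a rough feel for the idea, here is a hand-rolled sketch for the simplest case, simple exponential smoothing with additive errors, where y_t = l_{t-1} + e_t and l_t = l_{t-1} + alpha * e_t. This is not the statsmodels implementation; the starting level, alpha, and sigma are made-up values, and the errors are assumed to be normally distributed.

```python
import numpy as np

# Made-up parameters: last fitted level, smoothing parameter, error std.
level, alpha, sigma = 100.0, 0.3, 5.0
horizon, n_repetitions = 12, 2000

rng = np.random.default_rng(0)
errors = rng.normal(scale=sigma, size=(n_repetitions, horizon))

# The level before step h is the starting level plus alpha times the sum of
# all earlier errors; the observation at step h adds that step's own error.
past_shocks = np.cumsum(errors, axis=1) - errors
paths = level + alpha * past_shocks + errors

# 80% interval per step, taken across the simulated paths.
lower, upper = np.quantile(paths, [0.1, 0.9], axis=0)
print(np.round(upper - lower, 2))
```

The interval widens with the horizon because each simulated path accumulates its earlier shocks in the level.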