How can I use features in statsforecast (e.g. moving average, lags, user defined function)?
fcst = StatsForecast(
m4_daily_train,
models = [(auto_arima,7)],
freq = 'D',
n_jobs = min(len(m4_daily_train.index.unique()),cpu_count())
)
Or is it possible to create the features on my own in a previous step in pandas and use then the total feature table in the fitting like...
df['lag1'] = df['y'].shift(1)
df['day'] = df['timestamp'].dt.day
fcst = StatsForecast(
df,
models = [(auto_arima,7)],
freq = 'D',
n_jobs = min(len(m4_daily_train.index.unique()),cpu_count())
)
You can use exogenous variables in the statsforecast
by passing the training dataset which includes unique_id
, ds
, y
, and exogenous
variables, and the testing dataset which includes unique_id
, ds
, and future exogenous
variable in the forecast
step. For more information about Exogenous Regressors, you can find from official documentation
from datasetsforecast.m5 import M5
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA
Y_df, X_df, *_ = M5.load('./data')
# Filter data
Y_ts = Y_df[Y_df['unique_id'] == 'FOODS_3_586_CA_3'].reset_index(drop = True)
X_ts = X_df[X_df['unique_id'] == 'FOODS_3_586_CA_3'].reset_index(drop = True)
X_ts = X_ts[['unique_id', 'ds', 'sell_price', 'snap_CA']]
X_ts['unique_id'] = X_ts.unique_id.astype(str)
# Extract dates for train and test set
dates = Y_df['ds'].unique()
dtrain = dates[:-28]
dtest = dates[-28:]
Y_train = Y_ts.query('ds in @dtrain')
Y_test = Y_ts.query('ds in @dtest')
X_train = X_ts.query('ds in @dtrain')
X_test = X_ts.query('ds in @dtest')
# Add exogenous regressors
train = Y_train.merge(X_ts, how = 'left', on = ['unique_id', 'ds'])
models = [AutoARIMA(season_length = 7)]
sf = StatsForecast(
models=models,
freq='D',
n_jobs=-1
)
fcst = sf.forecast(df=train, h=28, X_df=X_test, level=[95])
fcst = fcst.reset_index()
fcst.head()