I have two time series representing two independent periods of data observation. I would like to fit an autoregressive model to this data. In other words, I would like to perform two partial fits, or two sessions of incremental learning.
This is a simplified description of a not-unusual scenario which could also apply to batch fitting on large datasets.
How do I do this (in statsmodels or otherwise)? Bonus points if the solution can generalise to other time-series models like ARIMA.
In pseudocode, something like:
import statsmodels.api as sm
from statsmodels.tsa.ar_model import AutoReg
data = sm.datasets.sunspots.load_pandas().data['SUNACTIVITY']
data_1 = data[:len(data)//3]
data_2 = data[len(data)-len(data)//3:]
# This is the standard single fit usage
res = AutoReg(data_1, lags=12).fit()
res.aic
# This is more like what I would like to do (hypothetical API, partial_fit does not exist)
model = AutoReg(lags=12)
model.partial_fit(data_1)
model.partial_fit(data_2)
model.results.aic
Statsmodels does not directly provide a partial_fit-style method. As Kevin S mentioned, though, pmdarima has an ARIMA wrapper that offers this functionality through its update method. Per its documentation: "Update the model fit with additional observed endog/exog values."
Here is an example adapted to your code:
from pmdarima.arima import ARIMA
import statsmodels.api as sm
data = sm.datasets.sunspots.load_pandas().data['SUNACTIVITY']
data_1 = data[:len(data)//3]
data_2 = data[len(data)-len(data)//3:]
# Standard single fit; an AR(12) model is an ARIMA with order (12, 0, 0)
model = ARIMA(order=(12,0,0))
model.fit(data_1)
# update the fitted model parameters using the new observations
model.update(data_2)
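One caveat: as far as I understand it, update appends the new observations to the ones already seen, so the two periods are treated as one continuous series. If the periods are truly independent and a pure AR model is enough, a rough alternative is to fit by conditional least squares and accumulate the normal equations (X'X and X'y) one segment at a time, so that no lag ever reaches across the gap. The sketch below assumes only NumPy; IncrementalAR and its partial_fit method are hypothetical names invented for illustration, not part of statsmodels or pmdarima, and it uses simulated AR(2) data rather than the sunspots series. It returns coefficients only (no AIC or standard errors).

```python
import numpy as np


class IncrementalAR:
    """AR(p) with intercept, fitted by conditional least squares.

    Hypothetical helper for illustration; not a statsmodels or pmdarima API.
    """

    def __init__(self, lags):
        self.lags = lags
        k = lags + 1                 # intercept + one coefficient per lag
        self.xtx = np.zeros((k, k))  # running sum of X'X over all segments
        self.xty = np.zeros(k)       # running sum of X'y over all segments
        self.params = None

    def partial_fit(self, series):
        y = np.asarray(series, dtype=float)
        p, n = self.lags, len(y)
        if n <= p:
            raise ValueError("segment must be longer than the number of lags")
        # Each row is [1, y[t-1], ..., y[t-p]] for t = p, ..., n-1.
        # Rows are built within one segment, so lags never cross segments.
        X = np.column_stack(
            [np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)]
        )
        self.xtx += X.T @ X
        self.xty += X.T @ y[p:]
        # Solve the accumulated normal equations (lstsq tolerates rank deficiency).
        self.params = np.linalg.lstsq(self.xtx, self.xty, rcond=None)[0]
        return self


rng = np.random.default_rng(0)


def simulate_ar2(n):
    # Simulated AR(2): y[t] = 0.6*y[t-1] - 0.3*y[t-2] + noise
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()
    return y


data_1, data_2 = simulate_ar2(500), simulate_ar2(500)

model = IncrementalAR(lags=2)
model.partial_fit(data_1)
model.partial_fit(data_2)
print(model.params)  # intercept, then lag-1 and lag-2 coefficients
```

Because only sums are stored, the result does not depend on the order of the segments, and the same approach works for batch fitting on data too large to hold in memory. It does not extend to models with MA terms, where the likelihood is not a simple least-squares problem.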