python intervals prediction forecasting

Conformal prediction intervals for in-sample data with Nixtla


Going through the Nixtla documentation, I can't find any way to compute prediction intervals for in-sample predictions (on the training data), only for future predictions.

Here is an example of what I can already do, which only covers future predictions.

from statsforecast import StatsForecast
from statsforecast.models import SeasonalExponentialSmoothing, ADIDA, ARIMA
from statsforecast.utils import ConformalIntervals

# Create a list of models and instantiation parameters 
intervals = ConformalIntervals(h=24, n_windows=2)

models = [
    SeasonalExponentialSmoothing(season_length=24,alpha=0.1, prediction_intervals=intervals),
    ADIDA(prediction_intervals=intervals),
    ARIMA(order=(24,0,12), season_length=24, prediction_intervals=intervals),
]

sf = StatsForecast(
    df=train, 
    models=models, 
    freq='H', 
)

levels = [80, 90] # confidence levels of the prediction intervals 

forecasts = sf.forecast(h=24, level=levels)
forecasts = forecasts.reset_index()
forecasts.head()

My goal is to be able to do something like this:

 forecasts = sf.forecast(df_x, level=levels)

That way, I would get prediction intervals for the training set as well.


Solution

  • You can get the in-sample forecasts together with conformal prediction intervals through the forecast_fitted_values method:

    1. The models you select need to support in-sample fitted values. Of the models in your example, SeasonalExponentialSmoothing and ADIDA do not currently support them. The list of supported models is in the official documentation.
    2. Pass fitted=True in the forecast step.
    3. Call forecast_fitted_values afterwards to retrieve the in-sample forecasts with their conformal prediction intervals.

    import pandas as pd
    from statsforecast import StatsForecast
    from statsforecast.models import SeasonalExponentialSmoothing, ADIDA, ARIMA
    from statsforecast.utils import ConformalIntervals
    
    train = pd.read_csv('https://auto-arima-results.s3.amazonaws.com/M4-Hourly.csv')
    test = pd.read_csv('https://auto-arima-results.s3.amazonaws.com/M4-Hourly-test.csv').rename(columns={'y': 'y_test'})
    n_series = 1
    uids = train['unique_id'].unique()[:n_series] # select first n_series of the dataset
    train = train.query('unique_id in @uids')
    test = test.query('unique_id in @uids')
    
    # Create a list of models and instantiation parameters 
    intervals = ConformalIntervals(h=24, n_windows=2)
    
    models = [
        ARIMA(order=(24,0,12), season_length=24, prediction_intervals=intervals),
    ]
    
    sf = StatsForecast(
        df=train, 
        models=models, 
        freq='H', 
        n_jobs=-1
    )
    
    levels = [80, 90] # confidence levels of the prediction intervals 
    forecasts = sf.forecast(h=24, level=levels, fitted=True) # Add fitted=True to store in-sample predictions.
    insample_forecasts = sf.forecast_fitted_values() # Access insample predictions
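
For a model that does not expose in-sample intervals (point 1 above), one rough fallback is to build a band around its fitted values by hand. The sketch below is only an illustration of that idea in plain Python, not how statsforecast computes its intervals: the function name residual_quantile_band and the toy numbers are made up, and because the residuals come from the same data the model was fitted on, the resulting coverage is likely to be optimistic.

```python
import math


def residual_quantile_band(y, fitted, level):
    """Return (lo, hi) lists around `fitted` for a coverage `level` in percent.

    Hypothetical helper: the half-width is an empirical quantile of the
    absolute in-sample residuals |y - fitted|.
    """
    if len(y) != len(fitted) or not y:
        raise ValueError("y and fitted must be non-empty and of equal length")
    scores = sorted(abs(a - b) for a, b in zip(y, fitted))
    n = len(scores)
    # Conformal-style rank ceil((n + 1) * level / 100), capped at n.
    rank = min(n, math.ceil((n + 1) * level / 100))
    q = scores[rank - 1]
    lo = [f - q for f in fitted]
    hi = [f + q for f in fitted]
    return lo, hi


# Toy observed values and fitted values (invented for illustration only).
y = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.5, 15.0, 14.5, 16.0]
fitted = [10.5, 11.0, 11.25, 12.0, 13.0, 13.5, 14.0, 14.0, 15.0, 15.5]

lo80, hi80 = residual_quantile_band(y, fitted, level=80)
lo50, hi50 = residual_quantile_band(y, fitted, level=50)
```

With real data, y would be the observed column and fitted the model's in-sample point forecasts for the same timestamps; a higher level never gives a narrower band.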
    
