I'm struggling to understand the correct way to calculate 1-step forecast errors over a test set using the {fable}
package for R.
First, my understanding of a 1-step forecast error is that we: fit a model on the training data, forecast one step ahead, compare against the next observed value, then incorporate that observation (without re-estimating the model's parameters) and repeat across the test set.
I've had a look at this post: Does the forecast function within fable provide one-step forecasts? It aligns with the method discussed in FPP3, but I'm getting results that don't match my intuition, and the method fails when the test set is shorter than the seasonal period of the fitted models. Additionally, what do you do when a forecasting algorithm has no refit() method (such as the provided THETA() technique)?
Below is a code snippet with two ways that I believe should calculate 1-step forecast errors over a 24-month test set after a model has been fit. The first is based on the method in the post linked above; the second is a loop that I came up with.
Both methods produce accuracy measures and forecasts, but they differ by a not-insignificant amount. Why are they different, and which one is correct?
In general, if I have a trained model, make a prediction for t+1 and then observe y_{t+1}, it isn't clear to me how I should use the trained model together with the new observation to make a prediction for t+2.
# Init
library(tidyverse)
library(fable)
#> Loading required package: fabletools
# Prepare data
us_accidental_deaths <- as_tsibble(USAccDeaths)
deaths_train <- head(us_accidental_deaths, -24)
deaths_test <- tail(us_accidental_deaths, 24)
# Get models on training data
deaths_fits_0 <- deaths_train %>%
model(ets = ETS(value))
##############
# Normal way #
##############
# 'refit' without estimating new params/coefs on the new data
deaths_fits_1 <- deaths_fits_0 %>%
refit(deaths_test, reestimate = FALSE)
# what happens here if the test set is smaller than the seasonality windows?
# 1-step forecast accuracy on the test set?
deaths_fits_1 %>%
accuracy()
#> # A tibble: 1 x 10
#> .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 ets Training 30.4 142. 111. 0.334 1.24 0.428 0.416 -0.105
# The 1-step forecasts?
fitted(deaths_fits_1)
#> # A tsibble: 24 x 3 [1M]
#> # Key: .model [1]
#> .model index .fitted
#> <chr> <mth> <dbl>
#> 1 ets 1977 Jan 7770.
#> 2 ets 1977 Feb 6906.
#> 3 ets 1977 Mar 7680.
#> 4 ets 1977 Apr 8135.
#> 5 ets 1977 May 8964.
#> 6 ets 1977 Jun 9264.
#> 7 ets 1977 Jul 10457.
#> 8 ets 1977 Aug 9547.
#> 9 ets 1977 Sep 8520.
#> 10 ets 1977 Oct 8660.
#> # ... with 14 more rows
################
# Using a loop #
################
# Initialise fit
deaths_fits_2 <- deaths_fits_0
# List to store the 1-step forecasts in
test_forecasts_list <- vector("list", NROW(deaths_test))
# Begin loop
for (i in seq_len(NROW(deaths_test))) {
  # 1-step forecast
  one_step_fc <- deaths_fits_2 %>%
    forecast(h = 1)
  # store
  test_forecasts_list[[i]] <- one_step_fc
  # refit using the newly observed data point
  deaths_fits_2 <- deaths_fits_0 %>%
    refit(
      bind_rows(
        deaths_train,
        deaths_test %>%
          arrange(index) %>%
          slice_head(n = i)
      ),
      reestimate = FALSE
    )
}
test_forecasts_2 <- bind_rows(test_forecasts_list)
# 1-step forecast accuracy over test set
test_forecasts_2 %>%
accuracy(data=us_accidental_deaths)
#> # A tibble: 1 x 10
#> .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 ets Test 37.0 255. 194. 0.308 2.19 0.352 0.380 -0.305
# 1-step forecasts
test_forecasts_2
#> # A fable: 24 x 4 [1M]
#> # Key: .model [1]
#> .model index value .mean
#> <chr> <mth> <dist> <dbl>
#> 1 ets 1977 Jan N(7837, 110265) 7837.
#> 2 ets 1977 Feb N(7129, 95555) 7129.
#> 3 ets 1977 Mar N(7783, 93292) 7783.
#> 4 ets 1977 Apr N(7912, 91760) 7912.
#> 5 ets 1977 May N(8894, 89411) 8894.
#> 6 ets 1977 Jun N(9418, 87442) 9418.
#> 7 ets 1977 Jul N(10072, 85241) 10072.
#> 8 ets 1977 Aug N(9891, 89234) 9891.
#> 9 ets 1977 Sep N(8388, 93841) 8388.
#> 10 ets 1977 Oct N(8679, 91768) 8679.
#> # ... with 14 more rows
Created on 2022-05-31 by the reprex package (v2.0.1)
There are two groups of parameters in an ETS model: the initial states and the smoothing parameters. It rarely makes sense to take the original initial states and apply them to a different time period, so by default refit() re-estimates the initial states but keeps the smoothing parameters of the model provided.
In your first attempt, the initial states are re-estimated at the start of the test data, while in your loop they are re-estimated at every iteration.
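To see this concretely, here is a sketch reusing the `deaths_fits_0` and `deaths_fits_1` objects from the question. `tidy()` (from fabletools) extracts the estimated terms of each model in a mable; for ETS these include both the smoothing parameters and the initial states, so a join lines them up side by side:

```r
# A sketch, assuming the deaths_fits_0 / deaths_fits_1 objects defined above.
library(dplyr)

compare <- left_join(
  tidy(deaths_fits_0),   # estimated on the training data
  tidy(deaths_fits_1),   # refit(..., reestimate = FALSE) on the test data
  by = c(".model", "term"),
  suffix = c("_train", "_refit")
)
# The smoothing parameters (alpha, gamma, ...) are identical across the two
# columns; the initial states (l[0], s[0], ...) differ because they were
# re-estimated on the test window.
compare
```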
What you probably want is not to re-estimate anything, but to see how well your existing model does on the test set. You can do that by refitting the model to the training and test sets combined, then keeping only the fitted values for the test period. Like this:
# 'refit' without estimating new params/coefs on all data
deaths_fits_1 <- deaths_fits_0 %>%
refit(us_accidental_deaths, reinitialise = FALSE)
# 1-step forecasts on the test set
deaths_fits_1 %>%
fitted() %>%
tail(24)
That also avoids the problem of generating an error when the test set is smaller than the seasonal period.
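For completeness, here is a self-contained sketch of that approach, with a manual test-set RMSE and MAE computed from the 1-step fitted values (the `left_join`/`summarise` error computation at the end is my own addition for illustration, not part of fable's API):

```r
library(fable)
library(tsibble)
library(dplyr)

us_accidental_deaths <- as_tsibble(USAccDeaths)
deaths_train <- head(us_accidental_deaths, -24)

# Estimate the model on the training data only
fit <- deaths_train %>%
  model(ets = ETS(value))

# Refit to the full series, keeping both the smoothing parameters
# (reestimate = FALSE is the default) and the initial states
fit_all <- fit %>%
  refit(us_accidental_deaths, reinitialise = FALSE)

# The fitted values over the last 24 months are 1-step forecasts
one_step <- fitted(fit_all) %>% tail(24)

# Join against the observed test values and summarise the errors
us_accidental_deaths %>%
  tail(24) %>%
  left_join(one_step, by = "index") %>%
  mutate(error = value - .fitted) %>%
  as_tibble() %>%
  summarise(RMSE = sqrt(mean(error^2)), MAE = mean(abs(error)))
```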