Inconsistent forecast behavior when using lagged xregs in R fabletools

This is a question I opened as an issue but haven't heard from the package author, so I thought I would ask the question here. Thanks!

I am noticing some inconsistencies when forecasting with lagged xregs. Specifically, forecasts for h <= lag period. It seems like the historical data provided to the original model is not added to the new data before generating the forecast. In the example below I use the lag = 2 example from fpp3. The first forecast fc1 is identical to the one generated in the book. In the second forecast fc2 I augment the new_data by binding the historical advert data with the new advert data generated in insurance_future. When I do this I get a different forecast in fc2 vs fc1. It seems to me like the forecast in fc1 does not have access to the historical (xreg) data, so the TVaderts is treated as NA for the first two steps in the horizon. Is this correct? And if so, shouldn't that data be included as it is in fc2? This might be related to.

library(fpp3)
#> ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
#> ✓ tibble      3.1.2      ✓ tsibble     1.0.1 
#> ✓ dplyr       1.0.6      ✓ tsibbledata 0.3.0 
#> ✓ tidyr       1.1.3      ✓ feasts      0.2.1 
#> ✓ lubridate   1.7.10     ✓ fable       0.3.1 
#> ✓ ggplot2     3.3.3
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date()    masks base::date()
#> x dplyr::filter()      masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval()  masks lubridate::interval()
#> x dplyr::lag()         masks stats::lag()
#> x tsibble::setdiff()   masks base::setdiff()
#> x tsibble::union()     masks base::union()
library(fabletools)
library(fable)
library(dplyr)
library(tsibble)

fit <- insurance %>%
  # Restrict data so models use same fitting period
  # Estimate models
  model(
    lag2 = ARIMA(Quotes ~ pdq(d = 0) +
                   TVadverts + lag(TVadverts) +
                   lag(TVadverts, 2))
  )

insurance_future <- new_data(insurance, 20) %>%
  mutate(TVadverts = 8)

# Forecast as shown in https://otexts.com/fpp3/lagged-predictors.html
fc1 <- fit %>%
  forecast(insurance_future)

# Manually pre-pend historic advert data to future data to ensure presence of
# lagged regressors
fc2 <- fit %>% 
  forecast(bind_rows(select(insurance, -Quotes), insurance_future)) %>%
  filter_index(as.character(min(insurance_future$Month)) ~ .)

print(fc1)
#> # A fable: 20 x 5 [1M]
#> # Key:     .model [1]
#>    .model    Month      Quotes .mean TVadverts
#>    <chr>     <mth>      <dist> <dbl>     <dbl>
#>  1 lag2   2005 May N(13, 0.23)  13.0         8
#>  2 lag2   2005 Jun N(13, 0.59)  13.0         8
#>  3 lag2   2005 Jul N(13, 0.72)  13.2         8
#>  4 lag2   2005 Aug N(13, 0.72)  13.2         8
#>  5 lag2   2005 Sep N(13, 0.72)  13.2         8
#>  6 lag2   2005 Oct N(13, 0.72)  13.2         8
#>  7 lag2   2005 Nov N(13, 0.72)  13.2         8
#>  8 lag2   2005 Dec N(13, 0.72)  13.2         8
#>  9 lag2   2006 Jan N(13, 0.72)  13.2         8
#> 10 lag2   2006 Feb N(13, 0.72)  13.2         8
#> 11 lag2   2006 Mar N(13, 0.72)  13.2         8
#> 12 lag2   2006 Apr N(13, 0.72)  13.2         8
#> 13 lag2   2006 May N(13, 0.72)  13.2         8
#> 14 lag2   2006 Jun N(13, 0.72)  13.2         8
#> 15 lag2   2006 Jul N(13, 0.72)  13.2         8
#> 16 lag2   2006 Aug N(13, 0.72)  13.2         8
#> 17 lag2   2006 Sep N(13, 0.72)  13.2         8
#> 18 lag2   2006 Oct N(13, 0.72)  13.2         8
#> 19 lag2   2006 Nov N(13, 0.72)  13.2         8
#> 20 lag2   2006 Dec N(13, 0.72)  13.2         8
print(fc2)
#> # A fable: 20 x 5 [1M]
#> # Key:     .model [1]
#>    .model    Month      Quotes .mean TVadverts
#>    <chr>     <mth>      <dist> <dbl>     <dbl>
#>  1 lag2   2005 May N(14, 0.72)  13.5         8
#>  2 lag2   2005 Jun N(13, 0.72)  13.3         8
#>  3 lag2   2005 Jul N(13, 0.72)  13.2         8
#>  4 lag2   2005 Aug N(13, 0.72)  13.2         8
#>  5 lag2   2005 Sep N(13, 0.72)  13.2         8
#>  6 lag2   2005 Oct N(13, 0.72)  13.2         8
#>  7 lag2   2005 Nov N(13, 0.72)  13.2         8
#>  8 lag2   2005 Dec N(13, 0.72)  13.2         8
#>  9 lag2   2006 Jan N(13, 0.72)  13.2         8
#> 10 lag2   2006 Feb N(13, 0.72)  13.2         8
#> 11 lag2   2006 Mar N(13, 0.72)  13.2         8
#> 12 lag2   2006 Apr N(13, 0.72)  13.2         8
#> 13 lag2   2006 May N(13, 0.72)  13.2         8
#> 14 lag2   2006 Jun N(13, 0.72)  13.2         8
#> 15 lag2   2006 Jul N(13, 0.72)  13.2         8
#> 16 lag2   2006 Aug N(13, 0.72)  13.2         8
#> 17 lag2   2006 Sep N(13, 0.72)  13.2         8
#> 18 lag2   2006 Oct N(13, 0.72)  13.2         8
#> 19 lag2   2006 Nov N(13, 0.72)  13.2         8
#> 20 lag2   2006 Dec N(13, 0.72)  13.2         8

waldo::compare(fc1, fc2)
#> `old$Quotes[[1]]$mu`: 13.0
#> `new$Quotes[[1]]$mu`: 13.5
#> 
#> `old$Quotes[[1]]$sigma`: 0.5
#> `new$Quotes[[1]]$sigma`: 0.8
#> 
#> `old$Quotes[[2]]$mu`: 13.0
#> `new$Quotes[[2]]$mu`: 13.3
#> 
#> `old$Quotes[[2]]$sigma`: 0.77
#> `new$Quotes[[2]]$sigma`: 0.85
#> 
#> `old$.mean[1:5]`: 13.0 13.0 13.2 13.2 13.2
#> `new$.mean[1:5]`: 13.5 13.3 13.2 13.2 13.2

Curiously, when I create new lagged variables manually (rather than in the formula) the model results match the "base case" from fpp3 (fc1 in my example).

insurance_manlag <- insurance %>%
  mutate(TVadverts1 = lag(TVadverts),
         TVadverts2 = lag(TVadverts, 2))

fit <- insurance_manlag %>%
  # Restrict data so models use same fitting period
  # Estimate models
  model(
    lag2 = ARIMA(Quotes ~ pdq(d = 0) +
                   TVadverts + TVadverts1 + TVadverts2)
  )

insurance_man_future <- append_row(insurance, n = 20) %>%
  replace_na(replace = list(TVadverts = 8)) %>%
  mutate(TVadverts1 = lag(TVadverts),
         TVadverts2 = lag(TVadverts, 2)) %>%
  slice_tail(n = 20)

# Forecast as shown in https://otexts.com/fpp3/lagged-predictors.html
fc3 <- fit %>%
  forecast(insurance_man_future)

waldo::compare(fc1$Quotes, fc3$Quotes)
#> ✓ No differences
waldo::compare(fc2$Quotes, fc3$Quotes)
#> `old[[1]]$mu`: 13.5
#> `new[[1]]$mu`: 13.0
#> 
#> `old[[1]]$sigma`: 0.8
#> `new[[1]]$sigma`: 0.5
#> 
#> `old[[2]]$mu`: 13.3
#> `new[[2]]$mu`: 13.0
#> 
#> `old[[2]]$sigma`: 0.85
#> `new[[2]]$sigma`: 0.77

^{Created on 2021-06-02 by the reprex package (v2.0.0)}

This reproduction leads me to believe that fc1 is correct, rather than fc2. If so, what is occurring in fc2 that causes it to have a different forecast vs that in fc1 (and fc3)?

Solution

In {fable}, models that produce forecasts retain all information needed to produce forecasts. When using the recommended interface to get fc1 (as shown in the book), the model is holding onto the 2 most recent values of TVadverts. While they were not needed to estimate the model, they are required inputs to produce the first couple of forecasts.

When using the forecast() function with new_data, the intended behaviour is for the model to produce forecasts for each time point in new_data. I believe forecasting from time points not at the end of the series is not yet implemented, so I will change this to produce an error.

Generally speaking, when using the lag() function in a model formula there is no need to prepend the historical data. The models will store and recall the needed values for forecasting.