r time-series forecast

Time series, not able to forecast with multiple features, works with single features


When I create a time series forecast for a single feature, everything works fine:

library(fpp3)
library(tidyverse)

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption))
fit_consMR %>% 
  forecast(h = 4)

This result is returned:

# A fable: 4 x 4 [1Q]
# Key:     .model [1]
  .model Quarter   Consumption .mean
  <chr>    <qtr>        <dist> <dbl>
1 tslm   2019 Q3 N(0.74, 0.41) 0.742
2 tslm   2019 Q4 N(0.74, 0.41) 0.742
3 tslm   2020 Q1 N(0.74, 0.41) 0.742
4 tslm   2020 Q2 N(0.74, 0.41) 0.742

However, errors are returned when using more than one feature to forecast:

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>% 
  forecast(h = 4)

returns the error message:

Caused by error in `value[[3L]]()`:
! object 'Income' not found
  Unable to compute required variables from provided `new_data`.
  Does your model require extra variables to produce forecasts?

That's odd, because it worked for a single feature without needing new data, and it will calculate the multi-feature model report correctly:

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>% 
  report()
Series: Consumption 
Model: TSLM 

Residuals:
     Min       1Q   Median       3Q      Max 
-0.90555 -0.15821 -0.03608  0.13618  1.15471 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.253105   0.034470   7.343 5.71e-12 ***
Income        0.740583   0.040115  18.461  < 2e-16 ***
Production    0.047173   0.023142   2.038   0.0429 *  
Savings      -0.052890   0.002924 -18.088  < 2e-16 ***
Unemployment -0.174685   0.095511  -1.829   0.0689 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3102 on 193 degrees of freedom
Multiple R-squared: 0.7683, Adjusted R-squared: 0.7635
F-statistic:   160 on 4 and 193 DF, p-value: < 2.22e-16

It will also create accuracy measures for the multi-feature model:

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>% 
  glance()

returns:

# A tibble: 1 × 15
  .model r_squared adj_r_squared sigma2 statistic  p_value    df log_lik   AIC  AICc   BIC    CV deviance df.residual  rank
  <chr>      <dbl>         <dbl>  <dbl>     <dbl>    <dbl> <int>   <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>       <int> <int>
1 tslm       0.768         0.763 0.0962      160. 3.93e-60     5   -46.7 -457. -456. -437. 0.104     18.6         193     5

What are the steps to forecast with a multi-feature time series model?


Solution

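  • If the model includes other predictors, you also need to supply their future values; otherwise the model has no way of knowing what they should be. The intercept-only model TSLM(Consumption) has no predictors, which is why forecast(h = 4) works there, and report() and glance() only use the training data, which is why they work for both models.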

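    Below is some new data that carries forward the final values of us_change (2019 Q2) over the next four quarters, 2019 Q3 to 2020 Q2:

    library(fpp3)
    library(tidyverse)

    # Future predictor values: the last observed values (rounded), held constant.
    # The index is named Quarter to match us_change and starts one quarter
    # after the end of the training data.
    newdata <- tsibble(
      Quarter = yearquarter("2019 Q3") + 0:3,
      Income = rep(0.593, 4),
      Production = rep(-0.54, 4),
      Savings = rep(-4.26, 4),
      Unemployment = rep(-0.1, 4),
      index = Quarter
    )

    fit_consMR <- us_change |>
      model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
    # Pass the future predictor values instead of a horizon h
    fit_consMR |>
      forecast(new_data = newdata)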
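
    As an alternative to typing the future quarters by hand, tsibble's new_data() can generate the next rows of the index directly from us_change. The following is a minimal sketch under the same assumption as above (predictors held at their last observed, rounded values); the object names future and fc are only illustrative:

    ```r
    library(fpp3)

    fit_consMR <- us_change |>
      model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))

    # new_data() creates the 4 quarters that follow the end of us_change,
    # so the Quarter index always lines up with the training data.
    future <- new_data(us_change, 4) |>
      mutate(
        Income = 0.593,        # assumed constant at the last observed value
        Production = -0.54,
        Savings = -4.26,
        Unemployment = -0.1
      )

    fc <- fit_consMR |>
      forecast(new_data = future)
    fc
    ```

    Because the forecast horizon is taken from the rows of future, no h argument is needed.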
    

    There are two options for obtaining the future values: choose a reasonable static value for each explanatory variable, if you believe it will hold over the forecast period, or forecast the explanatory variables individually and use those forecasts.
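
    The second option could look roughly like the sketch below. It assumes each predictor is forecast with its own automatically selected ETS model, which is only an illustrative choice (any univariate model could be substituted), and it plugs in the point forecasts (.mean) only. The names fc_income, future_fc and so on are hypothetical.

    ```r
    library(fpp3)

    fit_consMR <- us_change |>
      model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))

    h <- 4

    # Forecast each predictor separately (ETS is an assumption made for
    # illustration, not a recommendation for these series).
    fc_income       <- us_change |> model(ETS(Income))       |> forecast(h = h)
    fc_production   <- us_change |> model(ETS(Production))   |> forecast(h = h)
    fc_savings      <- us_change |> model(ETS(Savings))      |> forecast(h = h)
    fc_unemployment <- us_change |> model(ETS(Unemployment)) |> forecast(h = h)

    # Assemble the point forecasts into the future predictor table.
    future_fc <- new_data(us_change, h) |>
      mutate(
        Income = fc_income$.mean,
        Production = fc_production$.mean,
        Savings = fc_savings$.mean,
        Unemployment = fc_unemployment$.mean
      )

    fc_cons <- fit_consMR |>
      forecast(new_data = future_fc)
    fc_cons
    ```

    Note that the resulting forecast distribution treats the supplied predictor values as known, so its intervals do not reflect the uncertainty in the predictor forecasts themselves.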