When I create a time series forecast for a single feature, everything works fine:
library(fpp3)
library(tidyverse)
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption))
fit_consMR %>%
  forecast(h = 4)
This result is returned:
# A fable: 4 x 4 [1Q]
# Key: .model [1]
.model Quarter Consumption .mean
<chr> <qtr> <dist> <dbl>
1 tslm 2019 Q3 N(0.74, 0.41) 0.742
2 tslm 2019 Q4 N(0.74, 0.41) 0.742
3 tslm 2020 Q1 N(0.74, 0.41) 0.742
4 tslm 2020 Q2 N(0.74, 0.41) 0.742
However, errors are returned when using more than one feature to forecast:
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>%
  forecast(h = 4)
returns the error message:
Caused by error in `value[[3L]]()`:
! object 'Income' not found
Unable to compute required variables from provided `new_data`.
Does your model require extra variables to produce forecasts?
That's odd, because it worked for a single feature without needing new data, and it will calculate the multi-feature model report correctly:
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>%
  report()
Series: Consumption
Model: TSLM
Residuals:
Min 1Q Median 3Q Max
-0.90555 -0.15821 -0.03608 0.13618 1.15471
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.253105 0.034470 7.343 5.71e-12 ***
Income 0.740583 0.040115 18.461 < 2e-16 ***
Production 0.047173 0.023142 2.038 0.0429 *
Savings -0.052890 0.002924 -18.088 < 2e-16 ***
Unemployment -0.174685 0.095511 -1.829 0.0689 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3102 on 193 degrees of freedom
Multiple R-squared: 0.7683, Adjusted R-squared: 0.7635
F-statistic: 160 on 4 and 193 DF, p-value: < 2.22e-16
It will also produce fit statistics for the multi-feature model with glance():
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>%
  glance()
returns:
# A tibble: 1 × 15
.model r_squared adj_r_squared sigma2 statistic p_value df log_lik AIC AICc BIC CV deviance df.residual rank
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 tslm 0.768 0.763 0.0962 160. 3.93e-60 5 -46.7 -457. -456. -437. 0.104 18.6 193 5
What are the steps to forecast a multi-feature time series model?
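If the model includes other predictors, you also need to supply their future values; otherwise the model has no way of knowing what they should be. The first call worked because TSLM(Consumption) has no predictors at all: it is an intercept-only model, so forecast(h = 4) only needs the future dates. report() and glance() work for the multi-feature model because they only use the data the model was fitted on.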
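Below is some new data that carries forward the final values of the us_change predictors. Since us_change ends in 2019 Q2, the new data covers the next four quarters, 2019 Q3 to 2020 Q2, and its index is named Quarter to match the training data:
library(fpp3)
library(tidyverse)
# Future predictor values for the four quarters after the sample ends
newdata <- tsibble(
  Quarter = yearquarter("2019 Q3") + 0:3,
  Income = rep(0.593, 4),
  Production = rep(-0.54, 4),
  Savings = rep(-4.26, 4),
  Unemployment = rep(-0.1, 4),
  index = Quarter
)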
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))
fit_consMR %>%
  forecast(new_data = newdata)
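Typing the future dates by hand makes it easy to get the index wrong. As a minimal sketch of an alternative, tsibble's new_data() can generate the next rows after the end of the series, and the predictors can then be filled in with mutate(). The names last_obs, future_x and fc are placeholders chosen for this example, and carrying the last observed value forward is only an assumption about the future:

```r
library(fpp3)

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))

# Last observed row of the training data (as a plain tibble)
last_obs <- us_change |>
  as_tibble() |>
  slice_tail(n = 1)

# new_data() creates the next 4 quarters with the same index as us_change;
# the predictor columns are then filled with the last observed values.
future_x <- new_data(us_change, n = 4) |>
  mutate(
    Income       = last_obs$Income,
    Production   = last_obs$Production,
    Savings      = last_obs$Savings,
    Unemployment = last_obs$Unemployment
  )

fc <- fit_consMR |>
  forecast(new_data = future_x)
fc
```

Because the dates come from the data itself, the forecasts start in the quarter immediately after the last observation.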
There are two options: choose a reasonable static value for each explanatory variable, if you believe it will hold over the forecast horizon, or forecast each variable individually and use those forecasts as the new data.
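The second option could look roughly like the sketch below. It reshapes the predictors to long form, fits one model per predictor, and pivots the point forecasts back into a wide tsibble to pass as new_data. Simple exponential smoothing via ETS() is an arbitrary choice made here for illustration, not a recommendation, and predictors_long, predictor_fc, future_x and fc are placeholder names. Note that this feeds in only the point forecasts (.mean), so the resulting intervals for Consumption do not reflect the uncertainty in the predictor forecasts.

```r
library(fpp3)

fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production + Savings + Unemployment))

# Long form: one series per predictor, keyed by its name
predictors_long <- us_change |>
  as_tibble() |>
  pivot_longer(
    c(Income, Production, Savings, Unemployment),
    names_to = "Predictor", values_to = "Value"
  ) |>
  as_tsibble(index = Quarter, key = Predictor)

# One model per predictor; simple exponential smoothing is an
# illustrative assumption, swap in whatever suits each series
predictor_fc <- predictors_long |>
  model(ses = ETS(Value ~ error("A") + trend("N") + season("N"))) |>
  forecast(h = 4)

# Back to wide form: one column of point forecasts per predictor
future_x <- predictor_fc |>
  as_tibble() |>
  select(Quarter, Predictor, .mean) |>
  pivot_wider(names_from = Predictor, values_from = .mean) |>
  as_tsibble(index = Quarter)

fc <- fit_consMR |>
  forecast(new_data = future_x)
fc
```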