Time series, how to fix the error: "Future data set is incomplete"


I am learning time series forecasting using Forecasting: Principles and Practice, 3rd edition.

The text includes a data set of accommodations:

library(fpp3)
library(tidyverse)
accommodations <- aus_accommodation

Set up the data set using cross-validation:

train <- accommodations %>%
  slice(-n()) %>% 
  stretch_tsibble(.init = 36, .step = 1)

Fit a simple model:

fit <- train %>%
  model(
    TSLM(CPI ~ trend() + season())
    ) 

Construct a forecast:

forecast1 <- fit %>% 
  forecast(h=1)

Measure accuracy:

final <- forecast1 %>% 
  fabletools::accuracy(accommodations)

This returns a warning: "The future dataset is incomplete, incomplete out-of-sample data will be treated as missing. 1 observation is missing at 2016 Q3"

But there is nothing missing in forecast1:

tail(forecast1)

Nor is there anything missing at the end of the final forecast:

tail(final)

However, it is true that there are no observations at Q3 2016 or later in the original data set:

tail(accommodations)

I've tried changing .init from 1 to 36, changing .step from 1 to 4, and dropping it completely. Every variation returns the same message that the future dataset is incomplete.

How can the error be fixed?


Solution

  • It is not an error; it is a warning, letting you know that you produced a forecast that could not be evaluated because the corresponding future observation is missing.

    Your stretched tsibble finishes in 2016 Q2, so your forecasts finish in 2016 Q3. The accommodations data also finish in 2016 Q2, so it can't compute the accuracy of the 2016 Q3 forecasts.
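
    One way to see this for yourself is to compare the last observed quarter with the end of the last training fold. This is only a rough sketch: it stretches the full data set rather than your sliced version, and the names last_observed, folds and last_fold_end are made up for illustration.

    ```r
    library(fpp3)

    accommodations <- aus_accommodation

    # Last quarter for which actual observations exist
    last_observed <- max(accommodations$Date)

    # Last quarter covered by any training fold when the full data are stretched
    folds <- accommodations %>%
      stretch_tsibble(.init = 36, .step = 1)
    last_fold_end <- max(folds$Date)

    # An h = 1 forecast from the final fold targets the quarter after
    # last_fold_end, which lies beyond last_observed, so accuracy() has
    # no actual value to compare it with
    last_observed
    last_fold_end
    last_fold_end + 1
    ```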

    The issue is that this data set contains 8 series, each of them finishing in 2016 Q2. So when you slice off the last row (using slice(-n())), you remove the last row of only one of the series, not the other 7.
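
    If you want to check this, a quick sketch along these lines counts the rows per State before and after the ungrouped slice. The names before and after are hypothetical, and as_tibble() is used only so that count() runs on an ordinary data frame.

    ```r
    library(fpp3)

    accommodations <- aus_accommodation

    # Rows per State in the original data
    before <- accommodations %>%
      as_tibble() %>%
      count(State)

    # Rows per State after dropping only the very last row of the whole table
    after <- accommodations %>%
      slice(-n()) %>%
      as_tibble() %>%
      count(State)

    # Side-by-side comparison: only one State should have lost a row
    before %>%
      rename(n_before = n) %>%
      mutate(n_after = after$n)
    ```

    Adding group_by(State) before the slice removes one row from every series instead.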

    You could remove the last row from each of the 8 series, using the following:

    train <- accommodations %>%
      group_by(State) %>%
      slice(-n()) %>%
      ungroup() %>%
      stretch_tsibble(.init = 36, .step = 1)
    

    And then you would avoid the warning.

    But it is simpler not to do any slicing, and just use

    train <- accommodations %>%
      stretch_tsibble(.init = 36, .step = 1)
    

    You will get the same result, along with the warning, because the 2016 Q3 forecasts that cannot be evaluated are simply treated as missing.

    If you really want to avoid the warning, you can use

    final <- forecast1 %>% 
      fabletools::accuracy(accommodations) %>%
      suppressWarnings()