Why doesn't the : operator work when passing variables to the fable::VAR function? I have a large number of variables. Separating them with , inside vars works just fine, but when I switch to the : separator, it breaks. Is there an alternative way to specify a large number of variables without listing each one individually with the , separator?
See the example code:
    library(fable)
    library(tidyverse)
    lung_deaths <- cbind(mdeaths, fdeaths) %>%
      as_tsibble(pivot_longer = FALSE)
    lung_deaths %>%
      model(VAR(vars(mdeaths:fdeaths) ~ AR(1))) %>%
      tidy()
This produces the following warning:
    Warning message:
    In mdeaths:fdeaths :
      numerical expression has 72 elements: only the first used
And the output makes no sense:
    # A tibble: 2 × 7
      .model                             term                   .response       estimate std.error statistic p.value
      <chr>                              <chr>                  <chr>              <dbl>     <dbl>     <dbl>   <dbl>
    1 VAR(vars(mdeaths:fdeaths) ~ AR(1)) lag(mdeaths:fdeaths,1) mdeaths:fdeaths     1.00  9.52e-17   1.05e16       0
    2 VAR(vars(mdeaths:fdeaths) ~ AR(1)) constant
Compare with the correct result using vars(mdeaths, fdeaths):
    # A tibble: 6 × 7
      .model                              term           .response estimate std.error statistic p.value
      <chr>                               <chr>          <chr>        <dbl>     <dbl>     <dbl>   <dbl>
    1 VAR(vars(mdeaths, fdeaths) ~ AR(1)) lag(mdeaths,1) mdeaths     0.872      0.362    2.41   0.0185
    2 VAR(vars(mdeaths, fdeaths) ~ AR(1)) lag(fdeaths,1) mdeaths    -0.281      0.871   -0.322  0.748
    3 VAR(vars(mdeaths, fdeaths) ~ AR(1)) constant       mdeaths   337.       126.       2.68   0.00925
    4 VAR(vars(mdeaths, fdeaths) ~ AR(1)) lag(mdeaths,1) fdeaths     0.299      0.150    2.00   0.0500
    5 VAR(vars(mdeaths, fdeaths) ~ AR(1)) lag(fdeaths,1) fdeaths     0.0253     0.361    0.0700 0.944
    6 VAR(vars(mdeaths, fdeaths) ~ AR(1)) constant       fdeaths    93.5       52.2      1.79   0.0776
Currently you need to construct the formula's left-hand side (lhs) yourself, which can be done programmatically. There are plans to support tidyselect-style selectors such as mdeaths:fdeaths, with a working version here: https://github.com/tidyverts/fabletools/pull/361
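The warning in the question is the one base R's `:` operator emits when its operands have more than one element, which suggests the expression inside vars is evaluated as ordinary R code instead of as a column range. A minimal sketch of that behaviour, using two made-up vectors (`a` and `b` are hypothetical stand-ins for data columns, not part of the original example):

```r
# Hypothetical stand-ins for two data columns
a <- c(3, 10, 20)
b <- c(6, 30, 40)

# Base R's `:` builds a sequence from the first element of each operand.
# By default it warns that the remaining elements are ignored; the warning
# is suppressed here only to keep the output clean.
res <- suppressWarnings(a:b)
res
#> [1] 3 4 5 6
```

In tidyselect contexts the same syntax means "every column from a through b", which is why the programmatic approach resolves the selection with tidyselect first.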
Here's how you can currently select multiple response variables with tidyselect syntax and construct the model formula from them programmatically.
    library(fable)
    #> Loading required package: fabletools
    library(tidyverse)
    lung_deaths <- cbind(mdeaths, fdeaths) %>%
      as_tsibble(pivot_longer = FALSE)
    # Select the columns you want with tidyselect syntax
    cols <- tidyselect::eval_select(expr(mdeaths:fdeaths), data = lung_deaths)
    # Create the formula with lhs and rhs
    fm <- rlang::new_formula(
      lhs = rlang::call2("vars", !!!syms(names(cols))),
      rhs = expr(AR(1))
    )
    # Use formula in model
    lung_deaths %>%
      model(VAR(fm)) %>%
      tidy()
    #> # A tibble: 6 × 7
    #>   .model  term           .response estimate std.error statistic p.value
    #>   <chr>   <chr>          <chr>        <dbl>     <dbl>     <dbl>   <dbl>
    #> 1 VAR(fm) lag(mdeaths,1) mdeaths     0.872      0.362    2.41   0.0185
    #> 2 VAR(fm) lag(fdeaths,1) mdeaths    -0.281      0.871   -0.322  0.748
    #> 3 VAR(fm) constant       mdeaths   337.       126.       2.68   0.00925
    #> 4 VAR(fm) lag(mdeaths,1) fdeaths     0.299      0.150    2.00   0.0500
    #> 5 VAR(fm) lag(fdeaths,1) fdeaths     0.0253     0.361    0.0700 0.944
    #> 6 VAR(fm) constant       fdeaths    93.5       52.2      1.79   0.0776
Created on 2023-12-30 with reprex v2.0.2
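If you want to check what the programmatic construction produces before fitting anything, you can inspect the two sides of the formula on their own. This is a small sketch that needs only rlang; the character vector `resp` is a hypothetical stand-in for `names(cols)` from the code above.

```r
library(rlang)

# Hypothetical vector of response names; in the code above these come
# from names(cols)
resp <- c("mdeaths", "fdeaths")

# Build vars(<name1>, <name2>, ...) ~ AR(1) from the names
fm <- new_formula(
  lhs = call2("vars", !!!syms(resp)),
  rhs = expr(AR(1))
)

# Inspect each side to confirm it matches the hand-written version
deparse(f_lhs(fm))
#> [1] "vars(mdeaths, fdeaths)"
deparse(f_rhs(fm))
#> [1] "AR(1)"
```

The left-hand side is the same call you would get from typing vars(mdeaths, fdeaths) by hand, so the approach scales to any number of selected columns.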