I'm looking for an automated way to declare the .spec part of aggregate_key, starting from a vector of strings containing the names of the variables linked to the different levels.
The following of course doesn't work, but everything I tried with adding !!as.name() or using do.call ended in failure.
levels <- paste( c("L1",'L2','L3'), collapse = '/')
mytsibble %>% aggregate_key(levels, value = sum(value))
fabletools::aggregate_key() supports tidyverse style !! operations for non-standard evaluation.
This allows you to construct the expression however you like, and use it within aggregate_key() using !!expression.
For example, using rlang::parse_expr() to convert a string into an expression:
library(fpp3)
tourism %>%
  aggregate_key(Purpose*(State/Region), Trips = sum(Trips))
#> # A tsibble: 34,000 x 5 [1Q]
#> # Key: Purpose, State, Region [425]
#> Quarter Purpose State Region Trips
#> <qtr> <chr*> <chr*> <chr*> <dbl>
#> 1 1998 Q1 <aggregated> <aggregated> <aggregated> 23182.
#> 2 1998 Q2 <aggregated> <aggregated> <aggregated> 20323.
#> 3 1998 Q3 <aggregated> <aggregated> <aggregated> 19827.
#> 4 1998 Q4 <aggregated> <aggregated> <aggregated> 20830.
#> 5 1999 Q1 <aggregated> <aggregated> <aggregated> 22087.
#> 6 1999 Q2 <aggregated> <aggregated> <aggregated> 21458.
#> 7 1999 Q3 <aggregated> <aggregated> <aggregated> 19914.
#> 8 1999 Q4 <aggregated> <aggregated> <aggregated> 20028.
#> 9 2000 Q1 <aggregated> <aggregated> <aggregated> 22339.
#> 10 2000 Q2 <aggregated> <aggregated> <aggregated> 19941.
#> # … with 33,990 more rows
levels <- rlang::parse_expr("Purpose*(State/Region)")
tourism %>%
  aggregate_key(.spec = !!levels, Trips = sum(Trips))
#> # A tsibble: 34,000 x 5 [1Q]
#> # Key: Purpose, State, Region [425]
#> Quarter Purpose State Region Trips
#> <qtr> <chr*> <chr*> <chr*> <dbl>
#> 1 1998 Q1 <aggregated> <aggregated> <aggregated> 23182.
#> 2 1998 Q2 <aggregated> <aggregated> <aggregated> 20323.
#> 3 1998 Q3 <aggregated> <aggregated> <aggregated> 19827.
#> 4 1998 Q4 <aggregated> <aggregated> <aggregated> 20830.
#> 5 1999 Q1 <aggregated> <aggregated> <aggregated> 22087.
#> 6 1999 Q2 <aggregated> <aggregated> <aggregated> 21458.
#> 7 1999 Q3 <aggregated> <aggregated> <aggregated> 19914.
#> 8 1999 Q4 <aggregated> <aggregated> <aggregated> 20028.
#> 9 2000 Q1 <aggregated> <aggregated> <aggregated> 22339.
#> 10 2000 Q2 <aggregated> <aggregated> <aggregated> 19941.
#> # … with 33,990 more rows
Created on 2022-07-29 by the reprex package (v2.0.1)
This would work with your example as:
levels <- rlang::parse_expr(paste( c("L1",'L2','L3'), collapse = '/'))
mytsibble %>% aggregate_key(!!levels, value = sum(value))
There are more robust ways to construct the expression (in case the variable names contain * or /); for example, you could use rlang::call2() with symbols and expressions.
library(rlang)
call2("*", sym("Purpose"), call2("/", sym("State"), sym("Region")))
#> Purpose * (State/Region)
Or equivalently (and more compactly) for your always nested example:
purrr::reduce(syms(c("L1",'L2','L3')), call2, .fn = "/")
#> L1/L2/L3
These expressions can then be used with aggregate_key() using !! once again.
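If you need this in more than one place, the construction can be wrapped in a small function. The following is only a sketch: nested_spec() is a hypothetical helper name (it is not part of fabletools or rlang), and the commented usage assumes mytsibble has key columns L1, L2, L3 and a measured column value, as in the question.

```r
library(rlang)

# Hypothetical helper (not part of fabletools): builds a nested
# specification such as L1/L2/L3 from a character vector of column names.
nested_spec <- function(vars) {
  # syms() turns each string into a symbol, so a column name is always
  # treated as a single name and is never parsed as code.
  Reduce(function(lhs, rhs) call2("/", lhs, rhs), syms(vars))
}

spec <- nested_spec(c("L1", "L2", "L3"))
spec
#> L1/L2/L3

# Usage, assuming the data described in the question:
# mytsibble %>% aggregate_key(!!spec, value = sum(value))
```

Base R's Reduce() is used here only to avoid the extra purrr dependency; it folds the symbols together in the same left-to-right order as the purrr::reduce() call above.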
Why didn't !!as.name(levels) work?
as.name() produces a name (in rlang/tidyverse this is known as a 'symbol'), not a compound expression. A name/symbol can be thought of as the name of an object, or the name of a variable in the data. Using !!as.name(levels) will try to produce an aggregation of a single column named "L1/L2/L3", not a nested hierarchy of the columns "L1", "L2", and "L3". For this, you need an expression that combines the three symbols with /.
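You can see the difference without any data by inspecting the two objects directly. This is a small illustrative check (the variable names spec_string, as_symbol and as_call are just placeholders):

```r
library(rlang)

spec_string <- "L1/L2/L3"

# as.name() wraps the whole string in one symbol.
as_symbol <- as.name(spec_string)
is.symbol(as_symbol)
#> [1] TRUE
all.vars(as_symbol)
#> [1] "L1/L2/L3"

# parse_expr() parses the string as R code, giving a call to `/`
# that refers to three separate variables.
as_call <- parse_expr(spec_string)
is.call(as_call)
#> [1] TRUE
all.vars(as_call)
#> [1] "L1" "L2" "L3"
```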