r dplyr rlang nse

How to create new variables based on an external named list/vector of computations in dplyr


Imagine I want to do the following operation:

library(dplyr)
mtcars %>%
    group_by(x = am) %>%
    summarize(y = sum(vs),
              q = sum(mpg),
              p = sum(mpg/vs))

which yields:

#> # A tibble: 2 × 4
#>       x     y     q     p
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     0     7  326.   Inf
#> 2     1     7  317.   Inf

However, I would like to do the groupings and the summary based on these two external vectors:

x_groups <- c("x" = "am")
y_now <- c("y" = "vs", "q" = "mpg", "p" = "mpg/vs")

How can I get the same result programmatically, using non-standard evaluation?


Solution

  • You can parse your strings into expressions. The group_by part is straightforward; for the summarize part, each parsed expression also has to be wrapped in sum(). You can do both with

    grpexpr <- rlang::parse_exprs(x_groups)
    sexpr <- rlang::parse_exprs(y_now) |> lapply(function(x) bquote(sum(.(x))))
    

    Since both results are named lists, you can splice them into the calls with !!!; the list names become the names of the output columns.

    mtcars %>%
      group_by(!!!grpexpr) %>%
      summarize(!!!sexpr)
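
For reference, here is a self-contained sketch of the same approach that can be run end to end. It assumes dplyr and rlang are installed, and it uses rlang::expr() with !! instead of bquote() to wrap each expression in sum(); that substitution is a stylistic choice for this sketch, not part of the answer above. The names grp_exprs, sum_exprs and result are made up for illustration.

```r
library(dplyr)
library(rlang)

# The external specifications from the question.
x_groups <- c("x" = "am")
y_now <- c("y" = "vs", "q" = "mpg", "p" = "mpg/vs")

# Parse each string into an expression; the vector names are carried over.
grp_exprs <- parse_exprs(x_groups)

# Wrap every parsed expression in sum(...); lapply() keeps the names.
sum_exprs <- lapply(parse_exprs(y_now), function(e) expr(sum(!!e)))

# Splice the named lists into group_by() and summarize().
result <- mtcars %>%
  group_by(!!!grp_exprs) %>%
  summarize(!!!sum_exprs)

print(result)
```

Note that parsing strings evaluates whatever code they contain, so this pattern is only appropriate when x_groups and y_now come from a trusted source.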