I am working with non-standard evaluation (NSE) in R. I have grouped and summarised a data frame using rlang, as explained here.
Following the same example, I am left with a data frame that looks like this:
  x y     q   p
1 0 7 325.8 Inf
2 1 7 317.1 Inf
The calculations that were done here are these:
mtcars %>%
  group_by(x = am) %>%
  summarize(y = sum(vs),
            q = sum(mpg),
            p = sum(mpg / vs))
Now imagine the original code were
mtcars %>%
  group_by(x = am) %>%
  summarize(y = sum(vs),
            q = sum(mpg),
            p = sum(mpg / vs)) %>%
  mutate(h = q / y)
instead.
How can I achieve this using NSE?
The problem I'm facing is that the data frame columns are now x, y, q and p, not the original mtcars column names (in this example). Therefore, if I try to use mutate with an external vector like y_delayed <- c("h" = "mpg / vs"),
it doesn't work because those columns no longer exist.
drpexpr <- rlang::parse_exprs(y_delayed)
mtcars%>%
transmute(!!!drpexpr)
Error in `transmute()`:
ℹ In argument: `h = mpg/vs`.
Caused by error:
! object 'mpg' not found
EDIT:
I cannot manually change y_delayed <- c("h" = "mpg / vs")
to y_delayed <- c("h" = "q / y").
ADDING SOME ADDITIONAL CONTEXT:
As explained in the link I provide at the beginning of the question, the groups and the summarised expressions are given in 2 separate vectors:
x_groups <- c("x" = "am")
y_now <- c("y" = "vs", "q" = "mpg", "p" = "mpg/vs")
I then use this code provided in an answer to that question:
grpexpr <- rlang::parse_exprs(x_groups)
sexpr <- rlang::parse_exprs(y_now) |> lapply(function(x) bquote(sum(.(x))))
mtcars %>%
  group_by(!!!grpexpr) %>%
  summarize(!!!sexpr)
to get to the dataframe I have with columns x, y, q and p.
The problem now is that I need an additional step, a dplyr mutate, whose column name(s) and calculations are specified in another external vector, y_delayed <- c("h" = "mpg / vs"),
that I cannot change.
It is called y_delayed because it has to happen after the aggregation step.
EDIT 2:
The final dataframe should look like this:
  x y     q   p        h
1 0 7 325.8 Inf 46.54286
2 1 7 317.1 Inf 45.30000
You need to replace the original variable names in the delayed expression (mpg, vs) with the new column names they were summarised into (q, y).
library(rlang)
library(dplyr)
# Make list of symbols to pass to substitute - need to swap names and values
yd_vars <- lapply(as.list(setNames(names(y_now), y_now)), as.name)
# Expressions to inject - lapply to vectorize if needed but also to autoname output
yd_expr <- lapply(y_delayed, \(x) do.call(substitute, list(parse_expr(x), yd_vars)))
Which results in:
$h
q/y
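To see why this works, here is a minimal base R sketch of the same substitution, assuming the y_now vector from the question. It uses str2lang() in place of rlang::parse_expr(), and the second expression is made up purely for illustration:

```r
# Mapping from the question: names are the new columns, values the old expressions
y_now <- c("y" = "vs", "q" = "mpg", "p" = "mpg/vs")

# Swap names and values, then turn each new name into a symbol:
# list(vs = y, mpg = q, `mpg/vs` = p)
yd_vars <- lapply(as.list(setNames(names(y_now), y_now)), as.name)

# The delayed expression only contains the symbols mpg and vs
expr <- str2lang("mpg / vs")
all.vars(expr)

# substitute() walks the expression and swaps each symbol it finds in yd_vars;
# the entry named "mpg/vs" is never used because no single symbol has that name
out <- do.call(substitute, list(expr, yd_vars))
deparse(out)   # "q/y"

# Hypothetical, more complex expression: function calls and constants are left alone
out2 <- do.call(substitute, list(str2lang("log(mpg) / (vs + 1)"), yd_vars))
deparse(out2)  # "log(q)/(y + 1)"
```

Because the replacement happens on the parsed expression rather than on text, only whole symbols are swapped.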
mtcars %>%
  group_by(!!!grpexpr) %>%
  summarize(!!!c(sexpr, yd_expr))
# A tibble: 2 × 5
      x     y     q     p     h
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     0     7  326.   Inf  46.5
2     1     7  317.   Inf  45.3
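If you would rather keep the delayed calculation as its own step after the aggregation, as in the mutate(h = q / y) version from the question, the same substituted expressions can probably be spliced into mutate() instead. This is a sketch that repeats the question's definitions so it runs on its own; res is just an illustrative name:

```r
library(dplyr)
library(rlang)

# Definitions taken from the question
x_groups  <- c("x" = "am")
y_now     <- c("y" = "vs", "q" = "mpg", "p" = "mpg/vs")
y_delayed <- c("h" = "mpg / vs")

grpexpr <- parse_exprs(x_groups)
sexpr   <- lapply(parse_exprs(y_now), function(x) bquote(sum(.(x))))

# Old name -> new name symbols, then substitute into the delayed expressions
yd_vars <- lapply(as.list(setNames(names(y_now), y_now)), as.name)
yd_expr <- lapply(y_delayed, \(x) do.call(substitute, list(parse_expr(x), yd_vars)))

# Aggregate first, then apply the delayed expressions in a separate mutate()
res <- mtcars %>%
  group_by(!!!grpexpr) %>%
  summarize(!!!sexpr) %>%
  mutate(!!!yd_expr)

res
```

The result should have the same columns as the single summarize() call; the difference is only that the delayed step is visibly separate from the aggregation.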
You could also do it by text substitution, but that is likely to be a more brittle approach:
yd_expr <- parse_exprs(setNames(
  stringr::str_replace_all(
    y_delayed,
    setNames(names(y_now), sprintf("\\b%s\\b", y_now))
  ),
  names(y_delayed)
))
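One way the text approach can go wrong is when a new column name is also one of the old variable names. The sketch below uses a hypothetical mapping (old_to_new is not from the question) and a plain gsub() loop as a stand-in for sequential pattern replacement, then compares it with symbol substitution:

```r
# Hypothetical mapping in which the new name "mpg" is also an old name
old_to_new <- c("vs" = "mpg", "mpg" = "q")
expr_text  <- "mpg / vs"

# Sequential text replacement: "vs" becomes "mpg" first,
# and then every "mpg" (including the one just inserted) becomes "q"
txt <- expr_text
for (old in names(old_to_new)) {
  txt <- gsub(sprintf("\\b%s\\b", old), old_to_new[[old]], txt, perl = TRUE)
}
txt            # "q / q"

# Symbol substitution: each symbol in the parsed expression is replaced once
env  <- lapply(as.list(old_to_new), as.name)
expr <- do.call(substitute, list(str2lang(expr_text), env))
deparse(expr)  # "q/mpg"
```

With the vectors in the question there is no such overlap, so both approaches give q / y there.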