I'm trying to get a grip on the metaprogramming methods in Advanced R, and since I'm not a programmer by background, it is taking some effort. I am trying to write functions that manipulate data frame columns without quoting them (tidyverse style). This is easy enough when I am actually using dplyr verbs, thanks to enquo() and !! or curly-curly {{ }}, but it is less intuitive when I want to do something inside a function akin to var <- c(df$colA, df$colB), which doesn't require a tidyverse verb.
library(tidyverse)
library(rlang)
df <- tibble(col1 = c("A", "B", "C"),
             col2 = c("D", "E", "F"),
             col3 = c("G", "H", "I"),
             col4 = c("J", "K", "L"))
> df
# A tibble: 3 × 4
col1 col2 col3 col4
<chr> <chr> <chr> <chr>
1 A D G J
2 B E H K
3 C F I L
Most of the time, the data frame(s) I'll be using have the same column names, so setting them as defaults will save typing. The function below works, but it seems excessively verbose given how much enquo(), expr(), !!, and eval_tidy() it takes for such a small example:
myFun3 <- function(df, var1 = col1, var2 = col2){
  var1 <- enquo(var1)
  var2 <- enquo(var2)
  var3 <- expr(c(!!var1, !!var2))
  #print(var3)
  out <- tibble(var_out = eval_tidy(var3, df))
  return(out)
}
> myFun3(df)
# A tibble: 6 × 1
var_out
<chr>
1 A
2 B
3 C
4 D
5 E
6 F
> myFun3(df, col3, col4) # For when I have column names that aren't my defaults
# A tibble: 6 × 1
var_out
<chr>
1 G
2 H
3 I
4 J
5 K
6 L
If I throw a print(var3) into the function and rerun it, I can see that the expression is c(~col1, ~col2), so I initially thought I could shorten the function like this:
myFun4 <- function(df, var1 = col1, var2 = col2){
  # ditched the enquo() and tried directly inserting parameter values with ensym()
  var3 <- expr(c(ensym(var1), ensym(var2)))
  print(var3)
  out <- tibble(var_out = eval_tidy(var3, df))
  return(out)
}
> myFun4(df)
c(ensym(var1), ensym(var2))
# A tibble: 2 × 1
var_out
<list>
1 <sym>
2 <sym>
As you can see above, this fails: expr() captures the ensym() calls verbatim instead of running them, so evaluating the expression just returns the two bare symbols, which are never looked up as columns of df. I was closer before, when I kept var1 and var2 as quosures.
Can myFun3() be written more concisely than I have done? I'm focused on learning how this works in rlang, but the problem I'm showing above is fundamentally base R (concatenating two columns of a data frame). In other circumstances I am writing functions that use dplyr verbs, so I am thinking that staying with tidy evaluation is appropriate here, but maybe I should be doing the above with base R NSE instead. Would that make a difference? Thank you for any clarity on my efforts above.
Here are a couple of options to get the ball rolling. You could turn the unquoted variable names into strings and then use df[[var1]] or something like that. That's what I do in myFun3() below.
library(tidyverse)
library(rlang)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
#> flatten_raw, invoke, splice
df <- tibble(col1 = c("A", "B", "C"),
             col2 = c("D", "E", "F"),
             col3 = c("G", "H", "I"),
             col4 = c("J", "K", "L"))
myFun3 <- function(df, var1 = col1, var2 = col2){
  var1 <- as_label(enquo(var1))
  var2 <- as_label(enquo(var2))
  out <- tibble(var_out = c(df[[var1]], df[[var2]]))
  return(out)
}
myFun3(df)
#> # A tibble: 6 × 1
#> var_out
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> 5 E
#> 6 F
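As a variation on the string approach, here is a minimal sketch that uses as_name() with ensym() instead of as_label() with enquo(). The rlang documentation describes as_name() as the function intended for turning a symbol into a string, while as_label() is aimed at producing display labels. The function name myFun3_name() is hypothetical and only used for this illustration.

```r
library(rlang)
library(tibble)

df <- tibble(col1 = c("A", "B", "C"),
             col2 = c("D", "E", "F"),
             col3 = c("G", "H", "I"),
             col4 = c("J", "K", "L"))

# myFun3_name() is a hypothetical name used only for this sketch.
myFun3_name <- function(df, var1 = col1, var2 = col2) {
  # ensym() captures the bare column name as a symbol and raises an error
  # if something other than a name or a string is supplied
  var1 <- as_name(ensym(var1))
  var2 <- as_name(ensym(var2))
  # From here on it is ordinary base R subsetting by column name
  tibble(var_out = c(df[[var1]], df[[var2]]))
}

myFun3_name(df)             # uses the default columns col1 and col2
myFun3_name(df, col3, col4) # uses col3 and col4 instead
```

The practical difference should be small for plain column names; the main gain is that passing an expression such as col1 + 1 fails early instead of being turned into a label that is not a column.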
You also don't have to leave the dplyr world, because for this particular problem you could use reframe() to make the new dataset. Whether this works as well in your intended real-world scenario, I'm not sure. That's what I do in myFun3b() below:
myFun3b <- function(df, var1 = col1, var2 = col2){
  out <- df %>% reframe(var_out = c({{var1}}, {{var2}}))
  return(out)
}
myFun3b(df)
#> # A tibble: 6 × 1
#> var_out
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> 5 E
#> 6 F
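You could also do it by applying the inject operator (!!) to an ensym() inside with(df, ...), as in myFun3c(). Note that the injection here is performed by tibble(), whose arguments support !!; base with() on its own does not understand !!.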
myFun3c <- function(df, var1 = col1, var2 = col2){
  out <- tibble(var_out = with(df, c(!!ensym(var1), !!ensym(var2))))
  return(out)
}
myFun3c(df)
#> # A tibble: 6 × 1
#> var_out
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> 5 E
#> 6 F
Created on 2023-08-16 with reprex v2.0.2
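Two further sketches that relate back to the question itself, offered as illustrations rather than as the definitive way to do this. The first shows what the attempted myFun4() seems to have been missing: inside expr(), a call like ensym(var1) is only run when it is prefixed with !!. The second is a base R version built on substitute() and eval(), for comparison with the tidy evaluation route. The names myFun4_fixed() and myFun_base() are hypothetical.

```r
library(rlang)
library(tibble)

df <- tibble(col1 = c("A", "B", "C"),
             col2 = c("D", "E", "F"),
             col3 = c("G", "H", "I"),
             col4 = c("J", "K", "L"))

# Hypothetical name: the question's myFun4() with `!!` added.
myFun4_fixed <- function(df, var1 = col1, var2 = col2) {
  # `!!` makes expr() run ensym() immediately and inject the resulting
  # symbol, so var3 holds the call c(col1, col2) rather than
  # c(ensym(var1), ensym(var2))
  var3 <- expr(c(!!ensym(var1), !!ensym(var2)))
  # eval_tidy() then looks the symbols up as columns of df
  tibble(var_out = eval_tidy(var3, df))
}

# Hypothetical name: a base R NSE sketch of the same idea.
myFun_base <- function(df, var1 = col1, var2 = col2) {
  # substitute() swaps var1 and var2 for the expressions the caller
  # supplied (or for the defaults), giving the call c(col1, col2)
  call <- substitute(c(var1, var2))
  # Evaluate with df as the data; anything not found in df is looked up
  # in the caller's environment
  data.frame(var_out = eval(call, df, parent.frame()))
}

myFun4_fixed(df)
myFun4_fixed(df, col3, col4)
myFun_base(df, col3, col4)
```

On the question of whether base R NSE makes a difference: for a small helper like this one, probably not much. The base version returns a plain data.frame here and does not use quosures, so tidy evaluation features such as the .data pronoun are not available in it; if the surrounding functions already use dplyr verbs, staying with rlang keeps everything consistent.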