r purrr rlang tidyeval

Using tidy evaluation (curly-curly) with purrr


How can I iterate over functions that use tidy evaluation (via rlang's curly-curly)?

Here's a basic example, where I attempt to iterate over column names using purrr::map():

library("dplyr")
library("purrr")

myfunc <- function(by) {
  mtcars %>% count({{ by }})
}

# passing unquoted arguments doesn't work
c(cyl, vs, am, gear) %>% map(~ myfunc(by = .x))
#> Error in eval(expr, envir, enclos): object 'cyl' not found

# passing quoted arguments also doesn't work
c("cyl", "vs", "am", "gear") %>% map(~ myfunc(by = .x))
#> Error in `map()`:
#> ℹ In index: 1.
#> Caused by error in `count()`:
#> ! Must group by variables found in `.data`.
#> ✖ Column `.x` is not found.

Created on 2024-03-18 with reprex v2.1.0

How can I get around this problem?


Solution

  • When you create a vector of unquoted symbols with c(cyl, vs, am, gear), R evaluates it immediately, before the result is piped into map(). None of cyl, vs, am or gear exists as an object in the calling environment, so the call fails.

    If you want to keep the unquoted style, you can explicitly tell R not to evaluate these arguments until later by capturing them with rlang::quos():

    library(dplyr)
    library(purrr)
    
    myfunc <- function(by) {
      mtcars %>% count({{ by }})
    }
    
    # capture the column names as quosures, then inject each one with !!
    rlang::quos(cyl, vs, am, gear) %>% map(~ myfunc(by = !!.x))
    #> [[1]]
    #>   cyl  n
    #> 1   4 11
    #> 2   6  7
    #> 3   8 14
    #> 
    #> [[2]]
    #>   vs  n
    #> 1  0 18
    #> 2  1 14
    #> 
    #> [[3]]
    #>   am  n
    #> 1  0 19
    #> 2  1 13
    #> 
    #> [[4]]
    #>   gear  n
    #> 1    3 15
    #> 2    4 12
    #> 3    5  5
    

    Created on 2024-03-18 with reprex v2.1.0

    Note that you also need !! for this to work. It injects the captured quosure into the call; without it, {{ by }} captures the expression .x itself, and count() looks for a column called .x.
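
    To make the capture-then-inject idea a little more concrete, here is a minimal sketch. It reuses myfunc() from above; the names cols and results are just illustrative. It also shows rlang::syms() as one possible way to start from strings while keeping the same !! pattern; this is an alternative I'm adding, not something the question requires.

    ```r
    library(dplyr)
    library(purrr)

    myfunc <- function(by) {
      mtcars %>% count({{ by }})
    }

    # quos() returns a list of quosures: the unevaluated expressions
    # together with the environment they were written in
    cols <- rlang::quos(cyl, vs, am, gear)
    map_chr(cols, rlang::as_label)

    # Starting from strings instead: syms() turns each name into a symbol,
    # and !! injects that symbol into the call before {{ by }} captures it
    results <- rlang::syms(c("cyl", "vs", "am", "gear")) %>%
      map(~ myfunc(by = !!.x))
    ```

    Either way, nothing is evaluated as a column until count() sees it inside the data frame.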

    A more idiomatic approach would be to use a character vector and to subset the .data pronoun like so:

    c("cyl", "vs", "am", "gear") %>% 
      map(~ myfunc(by = .data[[.x]]))
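
    As a self-contained sketch of the same idea, you could also name the character vector first so that each result is labelled by its column. The set_names() step and the name counts are my additions for illustration, not part of the original question.

    ```r
    library(dplyr)
    library(purrr)

    myfunc <- function(by) {
      mtcars %>% count({{ by }})
    }

    counts <- c("cyl", "vs", "am", "gear") %>%
      set_names() %>%                     # name each element after itself
      map(~ myfunc(by = .data[[.x]]))     # look the column up by string

    names(counts)   # one list element per column name
    counts$gear     # pull out a single result by name
    ```

    Because map() preserves names, the output list can be indexed by column name instead of by position.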
    

    You could also use across() combined with all_of():

    c("cyl", "vs", "am", "gear") %>% 
      map(~ myfunc(by = across(all_of(.x))))
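
    If you are on dplyr 1.1.0 or later, pick() is intended for this kind of column selection without applying a function, so a variant along these lines should behave the same way. This is a sketch that assumes a recent dplyr version; the name picked is just illustrative.

    ```r
    library(dplyr)   # assumes dplyr >= 1.1.0, where pick() was introduced
    library(purrr)

    myfunc <- function(by) {
      mtcars %>% count({{ by }})
    }

    # pick() selects columns with tidyselect syntax inside a data-masking verb;
    # all_of() makes it explicit that .x holds column names as strings
    picked <- c("cyl", "vs", "am", "gear") %>%
      map(~ myfunc(by = pick(all_of(.x))))
    ```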
    

    As you can see, there are several approaches you can take. They all fall under non-standard evaluation (NSE), which is a broad topic. To learn more, I'd recommend reading vignette("programming", package = "dplyr") and the set of articles on metaprogramming in the {rlang} documentation.