r, purrr, tidyeval, tidyselect

Tidy evaluation pattern for mapping over selected columns


outer_func uses tidy evaluation for col_rfs and dflt_flag, so a call such as col_rfs = dplyr::starts_with("col") is possible.

library(magrittr)

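outer_func <- function(dat, col_rfs, dflt_flag){
  
  # Resolve the tidyselect expression to the names of the selected columns
  col_rfs_str <- tidyselect::eval_select(rlang::enquo(col_rfs), dat) %>% names
  # Turn each name back into a symbol and forward dflt_flag with {{ }}
  freqs <- purrr::map(col_rfs_str, \(x) get_freq(dat, !!rlang::sym(x), {{dflt_flag}}))
  
  return(freqs)
  
}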

get_freq <- function(dat, rf, dflt_flag){
  
  freq <- dat %>%
    dplyr::group_by({{rf}}) %>% 
    dplyr::summarise(count = dplyr::n(), dr = mean({{dflt_flag}}), .groups = "drop")
  
  return(freq)
}

dat <- tibble::tibble(
  dflt = sample(c(0, 1), size = 1000, replace = TRUE),
  rf_1 = sample(c("a", "b"), size = 1000, replace = TRUE),
  rf_2 = sample(c(1, 2), size = 1000, replace = TRUE)
)

outer_func(dat, dplyr::starts_with("rf"), dflt)

What is a good pattern that avoids converting col_rfs to strings via tidyselect::eval_select(rlang::enquo(col_rfs), dat) %>% names and then back to unquoted columns via !!rlang::sym(x) when using purrr::map?

It is important that the inputs to outer_func and get_freq remain unquoted column references rather than strings.


Solution

  • You can refer to each column through the .data pronoun, .data[[x]], as in the following:

    outer_func <- function(dat, col_rfs, dflt_flag){
        
        dat |> 
            dplyr::select(all_of(col_rfs)) |> 
            names() |> 
            purrr::map(function(x) get_freq(dat, .data[[x]], {{dflt_flag}}))
        
    }
    

    This gives the same result as your function.

    About once a week I end up referring to the Programming with dplyr article.
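For reference, here is a self-contained sketch of the same .data[[x]] idea that can be run end to end. It is a variation, not the code above: it assumes that forwarding col_rfs with {{ }} inside dplyr::select() is acceptable in place of all_of(), which should also allow bare column names such as rf_1 to be passed. The set.seed() call and the purrr::set_names() step (which names the result list by column) are additions for illustration only.

```r
# Same summary as get_freq in the question: row count and mean flag per level.
get_freq <- function(dat, rf, dflt_flag) {
  dat |>
    dplyr::group_by({{ rf }}) |>
    dplyr::summarise(
      count = dplyr::n(),
      dr = mean({{ dflt_flag }}),
      .groups = "drop"
    )
}

# Assumption: col_rfs is forwarded with {{ }} rather than wrapped in all_of().
# Each selected column is then referenced through the .data pronoun.
outer_func <- function(dat, col_rfs, dflt_flag) {
  cols <- dat |>
    dplyr::select({{ col_rfs }}) |>
    names()

  cols |>
    purrr::set_names() |> # illustrative: name each result after its column
    purrr::map(\(x) get_freq(dat, .data[[x]], {{ dflt_flag }}))
}

set.seed(1) # illustrative: makes the sampled data reproducible
dat <- tibble::tibble(
  dflt = sample(c(0, 1), size = 1000, replace = TRUE),
  rf_1 = sample(c("a", "b"), size = 1000, replace = TRUE),
  rf_2 = sample(c(1, 2), size = 1000, replace = TRUE)
)

freqs <- outer_func(dat, dplyr::starts_with("rf"), dflt)
```

The names still pass through a character vector here, but the round trip through rlang::sym() and !! is gone: .data[[x]] looks the column up by name inside the data mask, so get_freq keeps its unquoted interface.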