Tags: r, dplyr, tidyeval

Using mutate inside a function to assign a new variable


I'm trying to better understand tidy evaluation and the use of rlang, but there's a specific use case I can't figure out. Let's say I want to run a linear mixed model and save the output of one predictor that interests me. Running this code outside of a function would look like this:

library(tidyverse)
library(lme4)
library(lmerTest)
library(broom.mixed)

tibble %>% 
      lmer(y ~ x*z + (1|ID), data = .) %>% 
      tidy() %>% 
      filter(term == "x:z_2") %>%
      select(term, estimate, p.value) %>% 
      mutate(var = "y") %>% 
      relocate(var)

I want to wrap this operation in a function where I can specify the dependent variable and the term by which I filter. I’ve managed to call the linear mixed model in this way:

between_group_lmm <- function(df, yvar) {
  ysym <- rlang::ensym(yvar)
  
  # Construct and inject the formula dynamically into lmer
  rlang::inject(
    lmer(!!ysym ~ time * group + (1 | sno), data = df)
  ) %>%
    tidy()
}

However, I don’t understand how to use tidy evaluation for the function arguments whose values I also need as strings. Additionally, I’d prefer to specify the filtering value (e.g., "x:z_2" in my example) without quotation marks, so that both the dependent variable and the filtering term can be passed consistently as bare names.

Desired outcome:

between_group_lmm(df, y, x:z_2)

How can I use tidy evaluation with rlang to handle both the dependent variable (in the model formula and as the string label in mutate) and the filtering term as function arguments?


Solution

  • You can use deparse(substitute(yvar)) to capture the expression as a string:

    between_group_lmm <- function(df, yvar, filter_var) {
      
      # Capture the response as a symbol so it can be injected into the formula
      ysym <- rlang::ensym(yvar)
      
      # Capture both arguments as the text the caller typed
      filter_var <- deparse(substitute(filter_var))
      
      yvar <- deparse(substitute(yvar))
      
      rlang::inject(lmer(!!ysym ~ time * group + (1 | sno), data = df)) %>%
        tidy() %>%
        filter(term == filter_var) %>%
        select(term, estimate, p.value) %>% 
        mutate(var = yvar) %>% 
        relocate(var)
    }
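
    To see what deparse(substitute()) captures on its own, here is a minimal base R sketch. The helper name capture_text is hypothetical and only for illustration; nothing in it needs lme4 or a data frame.

    ```r
    # Hypothetical helper, for illustration only:
    # return the source text of whatever the caller typed for `arg`.
    capture_text <- function(arg) {
      deparse(substitute(arg))
    }

    # The argument is never evaluated, so `y`, `time` and `group6`
    # do not have to exist as objects.
    capture_text(y)            # "y"
    capture_text(time:group6)  # "time:group6"

    # A string argument keeps its quotation marks in the captured text.
    capture_text("y")          # "\"y\""
    ```

    Note that deparse() can return more than one string for a very long expression, so this approach is best suited to short terms like the ones here.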
    

    Or, if you want to stick to rlang syntax, you can use enquo and as_label:

    between_group_lmm <- function(df, yvar, filter_var) {
      
      # Capture the response as a symbol so it can be injected into the formula
      ysym <- rlang::ensym(yvar)
      
      # Capture both arguments as quosures and convert them to text labels
      yvar <- rlang::enquo(yvar) %>% rlang::as_label()
      
      filter_var <- rlang::enquo(filter_var) %>% rlang::as_label()
      
      rlang::inject(lmer(!!ysym ~ time * group + (1 | sno), data = df)) %>%
        tidy() %>%
        filter(term == filter_var) %>%
        select(term, estimate, p.value) %>% 
        mutate(var = yvar) %>% 
        relocate(var)
    }
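
    The enquo() and as_label() step can be tried in isolation. This is a small sketch, assuming only that rlang is installed; label_of is a hypothetical helper name, not part of rlang.

    ```r
    library(rlang)

    # Hypothetical helper, for illustration only:
    # capture the argument as a quosure, then turn it into a text label.
    label_of <- function(arg) {
      as_label(enquo(arg))
    }

    # The expression is captured, not evaluated, so these names
    # do not need to exist as objects.
    label_of(y)            # "y"
    label_of(time:group6)  # "time:group6"
    ```

    as_label() is meant for producing short, readable labels, so it is a reasonable fit for short terms like these; for very long expressions it may not be a faithful copy of what was typed.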
    

    Both versions behave the same way. For example, if your data looks something like this:

    set.seed(1)
    
    tib <- tibble(sno = rep(1:10, 10),
                  time = rnorm(100, 10, 2), 
                  group = factor(sample(10, 100, TRUE)),
                  y = time * as.numeric(group)/180 + sno + rnorm(100))
    

    Then you can call

    between_group_lmm(tib, y, time:group6)
    #> # A tibble: 1 x 4
    #>   var   term        estimate p.value
    #>   <chr> <chr>          <dbl>   <dbl>
    #> 1 y     time:group6   0.0891   0.864
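
    Since the dependent variable is needed in two forms (a symbol for the formula and a string for the var column), another option is to capture it once with ensym() and derive the string from that symbol with as_name(). The sketch below only builds the formula and the label, without fitting a model; response_parts is a hypothetical helper name, and the right-hand side of the formula is copied from the functions above.

    ```r
    library(rlang)

    # Hypothetical helper, for illustration only:
    # capture the response once and return it in both forms.
    response_parts <- function(yvar) {
      ysym <- ensym(yvar)  # symbol, suitable for injecting into a formula
      list(
        formula = inject(!!ysym ~ time * group + (1 | sno)),
        label   = as_name(ysym)  # plain string, suitable for mutate(var = ...)
      )
    }

    parts <- response_parts(y)
    parts$label    # "y"
    parts$formula  # y ~ time * group + (1 | sno)

    # ensym() also accepts a string, so this gives the same label.
    response_parts("y")$label  # "y"
    ```

    With this approach the label does not pick up quotation marks when the caller passes "y" instead of y, whereas deparse(substitute(yvar)) would return the quoted text in that case.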