
Problem using function with "as.formula" in a loop in R


I've created a function that returns an ANOVA table. It uses formula() to build the formula that is passed to the oneway.test function.

A simplified version of the function is:

anova_table <- function(df, dv, group){
  
  # Capture the unquoted column names as strings
  dv_t <- deparse(substitute(dv))
  group_t <- deparse(substitute(group))
  
  # One-way test that does not assume equal variances
  anova <- oneway.test(formula = formula(paste(dv_t, "~", group_t)),
                       data = df,
                       var.equal = FALSE)
  return(anova)
}

It works fine when I use it outside a loop:

library(dplyr)    # %>%, mutate_if(), select()
library(ggplot2)  # mpg data set

data("mpg")
mpg <- mpg %>% mutate_if(is.character, as.factor)

anova_table(mpg, displ, drv)

However, I'd also like it to work inside a loop. When I try the following code, I get this error message: "Error in model.frame.default(formula = formula(paste(dv_t, "~", group_t)), : object is not a matrix"

I'm not sure what I'm doing wrong.

vars_sel <- mpg %>% select(where(is.numeric)) %>% names()
vars_sel <- dput(vars_sel)
vars_sel <- syms(vars_sel)

for(i in vars_sel){
  var <- sym(i)
  print(anova_table(mpg, var, drv))
}

Any help would be much appreciated!


Solution

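  • Because your function captures its arguments with deparse(substitute()), the var in your loop is taken literally: the formula becomes var ~ drv, so the function looks for a column called var in mpg, which doesn't exist. You can get round this by building and evaluating a call to your function in the loop, so that dv receives the symbol itself (displ, year, and so on) rather than the name of the loop variable: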

    for(i in vars_sel){
      a <- eval(as.call(list(anova_table, df = mpg, dv = i, group = quote(drv))))
      print(a)
    }
    #> 
    #>  One-way analysis of means (not assuming equal variances)
    #> 
    #> data:  displ and drv
    #> F = 143.9, num df = 2.000, denom df = 67.605, p-value < 2.2e-16
    #> 
    #> 
    #>  One-way analysis of means (not assuming equal variances)
    #> 
    #> data:  year and drv
    #> F = 0.59072, num df = 2.000, denom df = 67.876, p-value = 0.5567
    #> 
    #> 
    #>  One-way analysis of means (not assuming equal variances)
    #> 
    #> data:  cyl and drv
    #> F = 129.2, num df = 2.000, denom df = 82.862, p-value < 2.2e-16
    #> 
    #> 
    #>  One-way analysis of means (not assuming equal variances)
    #> 
    #> data:  cty and drv
    #> F = 89.54, num df = 2.000, denom df = 78.879, p-value < 2.2e-16
    #> 
    #> 
    #>  One-way analysis of means (not assuming equal variances)
    #> 
    #> data:  hwy and drv
    #> F = 127.14, num df = 2.000, denom df = 71.032, p-value < 2.2e-16
    

    Created on 2022-09-25 with reprex v2.0.2
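
If you'd rather not build calls by hand, one possible alternative is to have the function take the column names as strings and build the formula with reformulate(). The sketch below first shows what deparse(substitute()) actually captures, then loops over the numeric columns by name. The names show_captured() and anova_table_chr() are hypothetical and not part of the original code, and the sketch assumes ggplot2 is installed so that mpg is available.

```r
# Hypothetical helper: returns the argument exactly as it was typed by the caller
show_captured <- function(dv) deparse(substitute(dv))

var <- as.name("displ")
show_captured(var)    # "var": the name of the variable, not what it holds
show_captured(displ)  # "displ"

# Hypothetical string-based variant of anova_table()
anova_table_chr <- function(df, dv, group) {
  # reformulate("drv", response = "displ") builds the formula displ ~ drv
  f <- reformulate(group, response = dv)
  oneway.test(f, data = df, var.equal = FALSE)
}

# Assumes ggplot2 is installed; only its mpg data set is used here
mpg <- ggplot2::mpg

# Names of the numeric columns, using base R only
num_vars <- names(mpg)[vapply(mpg, is.numeric, logical(1))]

for (v in num_vars) {
  print(anova_table_chr(mpg, v, "drv"))
}
```

With strings there is nothing for substitute() to capture, so the loop variable can be passed straight through. The trade-off is that callers have to write anova_table_chr(mpg, "displ", "drv") rather than using unquoted column names.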