
Passing a variable/column name as an argument/parameter to a function


This is a beginner's question, but after researching and trying for a day, I seem to be unable to find the answer myself (and I apologize if this question is a duplicate, but I have not found a solution). I would like to pass the name of a variable (column in a dataframe) to a function. I suppose that the passed string is not simply a string, but a (more complex) object, which might be why the following minimal code fails:

dfprod <- data.frame("demand" = c(59000, 51000, 40000, 29000, 20000))

my_fun <- function(x_var) {
  mean(dfprod$x_var)
}

my_fun(demand)

I have tried numerous solutions, such as embracing {{ x_var }} and using eval, paste, and substitute, all to no avail. Could you point me in the right direction? The name demand, passed to the function, should be substituted for x_var inside the function.

When I run the code from above, R (4.3.3 on Windows 11 with RStudio) gives me:

Warning message:
In mean.default(dfprod$x_var) :
  argument is not numeric or logical: returning NA

The desired output is:

> my_fun(demand)
[1] 39800

Solution

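  • You can’t use $ with nonstandard evaluation in the way you’re attempting. $ does not evaluate its right-hand side, so dfprod$x_var looks for a column literally named x_var, finds none, and returns NULL, which is why mean() warns and returns NA. Instead, you can capture the argument unevaluated with substitute(), then use eval() and pass your dataframe as the evaluation environment: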

    my_fun <- function(x_var) {
      mean(eval(substitute(x_var), envir = dfprod))
    }
    
    my_fun(demand)
    # [1] 39800
    

    Or similarly, using rlang:

    library(rlang)
    
    my_fun <- function(x_var) {
      mean(eval_tidy(ensym(x_var), data = dfprod))
    }
    
    my_fun(demand)
    # [1] 39800
    

    Also, it’s generally better practice to explicitly pass the dataframe as a function argument (e.g., my_fun <- function(x_var, data)) than to rely on an object in the global environment.
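
A minimal sketch of that last suggestion, in base R only. The argument order (data first) and the second helper, my_fun_chr, are illustrative assumptions and not part of the original answer. The first version keeps the unquoted column name. The second takes the name as a string and uses [[ ]], which, unlike $, evaluates its argument.

```r
dfprod <- data.frame(demand = c(59000, 51000, 40000, 29000, 20000))

# Unquoted column name: capture the symbol, then evaluate it inside `data`.
# enclos = parent.frame() lets any non-column names resolve in the caller.
my_fun <- function(data, x_var) {
  mean(eval(substitute(x_var), envir = data, enclos = parent.frame()))
}

# Hypothetical string-based variant: [[ ]] looks up the column by its value.
my_fun_chr <- function(data, x_var) {
  mean(data[[x_var]])
}

my_fun(dfprod, demand)
# [1] 39800

my_fun_chr(dfprod, "demand")
# [1] 39800
```

The string-based variant is often easier to call from other functions or loops, since the column name can be held in an ordinary character variable.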