I'm reading Advanced R by Hadley and am testing the following code from the book:
subset2 = function(df, condition) {
  condition_call = eval(substitute(condition), df)
  df[condition_call, ]
}
df = data.frame(a = 1:10, b = 2:11)
condition = 3
subset2(df, a < condition)
Then I got the following error message:
Error in eval(substitute(condition), df) : object 'a' not found
I read the following explanation but don't quite understand it:
If eval() can’t find the variable inside the data frame (its second argument), it looks in the environment of subset2(). That’s obviously not what we want, so we need some way to tell eval() where to look if it can’t find the variables in the data frame.
As I see it, in eval(substitute(condition), df) the variable that cannot be found in the data frame is condition, so why is it the object 'a' that cannot be found?
On the other hand, why does the following code not produce an error?
subset2 = function(df, condition) {
  condition_call = eval(substitute(condition), df)
  df[condition_call, ]
}
df = data.frame(a = 1:10, b = 2:11)
y = 3
subset2(df, a < y)
This more stripped-down example may make it easier for you to see what's going on in Hadley's example. The first thing to note is that the symbol condition appears here in four different roles, each of which I've marked with a numbered comment.
## Role of symbol `condition`
f <- function(condition) {       #1 -- formal argument
  a <- 100
  condition + a                  #2 -- symbol bound to formal argument
}
condition <- 3 #3 -- symbol in global environment
f(condition = condition + a) #4 -- supplied argument (on RHS)
## Error in f(condition = condition + a) (from #1) : object 'a' not found
The other important thing to understand is that symbols in supplied arguments (here the right-hand side of condition = condition + a at #4) are searched for in the evaluation frame of the calling function. From Section 4.3.3, Argument Evaluation, of the R Language Definition:
One of the most important things to know about the evaluation of arguments to a function is that supplied arguments and default arguments are treated differently. The supplied arguments to a function are evaluated in the evaluation frame of the calling function. The default arguments to a function are evaluated in the evaluation frame of the function.
In the example above, f() is called from the top level, so the evaluation frame of the caller is the global environment, .GlobalEnv.
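The difference between supplied and default arguments described in that passage can be seen in a small sketch. The function g() and its default value are made up for illustration, and the code assumes a fresh session in which no global variable named a exists.

```r
# Hypothetical function: the default for `x` refers to `a`.
# Assumes a fresh session with no global variable named `a`.
g <- function(x = a * 2) {
  a <- 100   # local to g's evaluation frame
  x          # using x forces its evaluation
}

# Default argument: `a * 2` is evaluated in g's own frame, where a is 100.
default_result <- g()
print(default_result)  # 200

# Supplied argument: `a * 2` is evaluated in the caller's frame (here the
# global environment), where there is no `a`, so the call fails.
supplied_result <- tryCatch(
  g(x = a * 2),
  error = function(e) conditionMessage(e)
)
print(supplied_result)
```

The same expression, a * 2, succeeds as a default and fails as a supplied argument purely because of where R looks up a.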
Taking this step by step, here is what happens when you call f(condition = condition + a). During function evaluation, R comes across the expression condition + a in the function body (at #2). It searches for values of a and condition, and finds a locally assigned symbol a. It finds that the symbol condition is bound to the formal argument named condition (at #1). The value of that formal argument, supplied during the function call, is condition + a (at #4).
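To check that the formal argument really is bound to the unevaluated expression from #4, and that nothing goes wrong until the body actually uses it, here is a rough sketch. show_arg() is a made-up helper, not part of Hadley's example; it relies on substitute(), which returns the expression behind an argument without evaluating it.

```r
# Hypothetical helper: reports the expression bound to `condition`
# without ever evaluating it.
show_arg <- function(condition) {
  a <- 100
  deparse(substitute(condition))  # inspects the expression only
}

condition <- 3
bound_expr <- show_arg(condition = condition + a)
print(bound_expr)  # "condition + a"
```

No error is raised here even though a is not defined globally, because the supplied argument is never evaluated. The failure in f() only appears at #2, when the body asks for the value of condition.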
As noted in the R Language Definition, the values of the symbols in the supplied expression condition + a are searched for in the environment of the calling function, here the global environment. Since the global environment contains a variable named condition (assigned at #3) but no variable named a, R is unable to evaluate the expression condition + a (at #4), and fails with the error that you see.
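Applied back to your subset2(), the same reasoning suggests telling eval() to fall back on the caller's environment rather than on the function's own frame. Below is a minimal sketch of that idea using eval()'s third argument, enclos. The name subset2_fixed is made up here, and this is one possible fix rather than the only one.

```r
# Sketch: look up names in df first, then in the environment that
# called the function, never in the function's own frame.
subset2_fixed <- function(df, condition) {
  condition_call <- substitute(condition)            # the expression a < condition
  rows <- eval(condition_call, df, parent.frame())   # enclos = caller's frame
  df[rows, ]
}

df <- data.frame(a = 1:10, b = 2:11)
condition <- 3
result <- subset2_fixed(df, a < condition)
print(result)  # the two rows where a is 1 and 2
```

This also accounts for your second example. With a < y, eval() does not find y in df or in the function's frame, so it continues to the global environment and finds y there. With a < condition, it finds the function's own argument named condition first, and evaluating that argument in the global environment is what triggers the 'a' not found error.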