
Can you more clearly explain lazy evaluation in R function operators?


If I create a function as follows:

what_is_love <- function(f) {
  function(...) {
    cat('f is', f, '\n')
  }
}

And call it with lapply: funs <- lapply(c('love', 'cherry'), what_is_love)

I get unexpected output:

> funs[[1]]()
f is cherry
> funs[[2]]()
f is cherry

But note that this is not the case when you do not use lapply:

> f1 <- what_is_love('love')
> f2 <- what_is_love('cherry')
> f1()
f is love
> f2()
f is cherry

What gives?

I know that funs <- lapply(c('love', 'cherry'), what_is_love) can be written out more fully:

params <- c('love', 'cherry')
out <- vector('list', length(params))
for (i in seq_along(params)) {
  out[[i]] <- what_is_love(params[[i]])
}
out

But when I browse in, I see that both functions have their own environment:

Browse[1]> out[[1]]
function(...) {
    cat('f is', f, '\n')
  }
<environment: 0x109508478>
Browse[1]> out[[2]]
function(...) {
    cat('f is', f, '\n')
  }
<environment: 0x1094ff750>

But in each of those environments, f is the same...

Browse[1]> environment(out[[1]])$f
[1] "cherry"
Browse[1]> environment(out[[2]])$f
[1] "cherry"

I know the answer is "lazy evaluation", but I'm looking for a bit more depth... how does f end up re-assigned across both environments? Where does f come from? How does R lazy evaluation work under the hood in this example?


EDIT: I'm aware of the other question on lazy evaluation and functionals, but it just says the answer is "lazy evaluation" without explaining how the lazy evaluation actually works. I'm seeking greater depth.


Solution

  • When you do

    what_is_love <- function(f) {
      function(...) {
        cat('f is', f, '\n')
      }
    }
    

    the inner function is a closure whose enclosing environment contains f, but the catch is that an argument passed to a function remains a "promise" and is not evaluated until it is actually used. If you want to "capture" the current value of f, then you need to force the evaluation of the promise; you can use the force() function for this.

    what_is_love <- function(f) {
      force(f)
      function(...) {
        cat('f is', f, '\n')
      }
    }
    funs <- lapply(c('love', 'cherry'), what_is_love)
    
    funs[[1]]()
    # f is love 
    funs[[2]]()
    # f is cherry 
    

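    Without force(), f remains a promise inside both of the functions in your list. A promise does not hold a value; it holds the unevaluated argument expression together with the environment in which to evaluate it. Each closure has its own f, but both promises refer back to the same element lookup inside lapply. Nothing is evaluated until you call one of the functions, and by then lapply has finished iterating, so the lookup yields the last element, "cherry", for both.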
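
    The same promise behavior can be seen without lapply at all. The following is a minimal sketch, not part of the original question: make_reader, make_reader_forced and x are hypothetical names chosen for illustration. It shows that a promise is evaluated on first use, in the caller's environment, and that the result is then cached.

    ```r
    # Hypothetical helper: returns a closure that reads its (lazy) argument f.
    make_reader <- function(f) {
      function() f
    }

    # Same helper, but the promise is forced before the closure is returned.
    make_reader_forced <- function(f) {
      force(f)
      function() f
    }

    x <- "love"
    lazy   <- make_reader(x)         # f is still an unevaluated promise for `x`
    forced <- make_reader_forced(x)  # f was evaluated here, while x was "love"

    x <- "cherry"                    # change x before the first call

    lazy_result   <- lazy()          # promise evaluated now, so it sees "cherry"
    forced_result <- forced()        # value was captured earlier: "love"

    x <- "pie"
    lazy_again <- lazy()             # still "cherry": a promise is evaluated only once
    ```

    Once a promise has been evaluated, later changes to x no longer matter, which is why force() inside the outer function is enough to pin the value.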

    As @MartinMorgan pointed out, this behavior has changed in R 3.2.0. From the release notes:

    Higher order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures. This resolves PR#16093.
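
    Because of that change, the lapply example from the question should no longer misbehave on R 3.2.0 or later, but a plain for loop does no such forcing. The sketch below assumes R >= 3.2.0 and reuses the original what_is_love without force(); via_lapply, via_loop and word are hypothetical names used only for this illustration.

    ```r
    # Original definition from the question, without force().
    what_is_love <- function(f) {
      function(...) {
        cat('f is', f, '\n')
      }
    }

    # lapply() forces the argument on R >= 3.2.0, so each closure keeps its own value.
    via_lapply <- lapply(c('love', 'cherry'), what_is_love)

    # A for loop does not force anything: every promise refers to the loop
    # variable `word`, which holds the last element once the loop has finished.
    via_loop <- list()
    for (word in c('love', 'cherry')) {
      via_loop[[length(via_loop) + 1]] <- what_is_love(word)
    }

    via_lapply[[1]]()  # prints the first element on R >= 3.2.0
    via_loop[[1]]()    # prints the last element, because `word` has moved on
    ```

    Adding force(f) to what_is_love makes both variants behave the same way, so it is still worth keeping in function factories that may be called from a loop.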