I followed this article to create a crate function that calls a user function, which in turn calls another user function. I supplied both user functions to crate, and this seems to work: both show up in the printout of the crate function. But when I use the crate function in future_map, the inner user function cannot be found.
library(furrr)
future::plan(multisession, workers = 3)
inner_foo <- function(x) {
  x^2
}
outer_foo <- function(x) {
  inner_foo(x)
}
outer_crate <- carrier::crate(
  function(x) outer_foo(x),
  outer_foo = outer_foo,
  inner_foo = inner_foo
)
# this works
outer_crate(3)
# shows that inner_foo is packaged with outer_crate
outer_crate
# does not work as inner_foo not found
future_map(1:3, outer_crate,
           .options = furrr_options(globals = FALSE))
# works if inner_foo is manually supplied
future_map(1:3, outer_crate,
           .options = furrr_options(globals = "inner_foo"))
This problem only occurs when workers are set with plan; otherwise it works as expected.
This is a known issue. It happens because the inner_foo() function lives in R's global environment. The global environment is special, because its content is not carried along when exporting objects to parallel workers.
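To see where the lookup goes, here is a minimal base-R sketch. It assumes the code runs at the top level of a fresh session and that the codetools package (a recommended package that normally ships with R) is installed. The variable name needed is only for illustration.

```r
inner_foo <- function(x) x^2
outer_foo <- function(x) inner_foo(x)

# outer_foo resolves free names through its own enclosing environment.
# When the functions are defined at the top level, that is the global
# environment, not the environment of a crate that wraps outer_foo.
environmentName(environment(outer_foo))

# Names that outer_foo needs from outside its own body
needed <- codetools::findGlobals(outer_foo)
needed
```

Putting a copy of inner_foo into the crate does not change where outer_foo itself looks, which would explain why the crate printout lists inner_foo while the worker still cannot find it.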
The solution is to have inner_foo() live in the environment of the function that uses it, i.e. the environment of outer_foo().
There are two ways to achieve this. The first approach is:
outer_foo <- local({
  inner_foo <- function(x) {
    x^2
  }
  function(x) {
    inner_foo(x)
  }
})
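A rough way to check this without starting workers is a serialization round trip in base R. This is only a stand-in for what happens when a function is exported, not the exact mechanism that future uses. The name shipped is made up for the sketch.

```r
outer_foo <- local({
  inner_foo <- function(x) x^2
  function(x) inner_foo(x)
})

# Serialize and unserialize the closure. A non-global enclosing
# environment is written out together with its contents.
shipped <- unserialize(serialize(outer_foo, NULL))

# The copy carries its own inner_foo and no longer depends on the
# global environment of the session that created it.
ls(environment(shipped))
shipped(3)
```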
The second approach is to "fix it up" after creating the functions:
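inner_foo <- function(x) {
  x^2
}
outer_foo <- function(x) {
  inner_foo(x)
}
# give outer_foo its own enclosing environment (its parent is the old one)
environment(outer_foo) <- new.env(parent = environment(outer_foo))
# copy inner_foo into that environment so it travels with outer_foo
environment(outer_foo)$inner_foo <- inner_foo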
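If several helpers need to be attached, the two fix-up lines can be wrapped in a small function. The sketch below is hypothetical: with_helpers is not part of future, furrr, or carrier, and it assumes every helper is passed as a named argument.

```r
# Hypothetical helper: returns a copy of `fn` whose enclosing environment
# is a new environment holding the named helpers passed via `...`.
with_helpers <- function(fn, ...) {
  env <- new.env(parent = environment(fn))
  list2env(list(...), envir = env)
  environment(fn) <- env
  fn
}

inner_foo <- function(x) x^2
outer_foo <- function(x) inner_foo(x)
outer_foo <- with_helpers(outer_foo, inner_foo = inner_foo)

# Remove the original inner_foo to mimic a session where it was never
# defined; outer_foo still finds the copy in its own environment.
rm(inner_foo)
result <- outer_foo(3)
result
```

The patched outer_foo can then be passed to carrier::crate() on its own, since inner_foo now travels with it.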
Ideally, the Futureverse would do this automagically, but it's a complex problem with its own issues, so that's still on the roadmap.