
Separate scripts from .GlobalEnv: Source script that sources scripts


This question is similar to "Source script to separate environment in R, not the global environment", but with a key twist.

Consider a script that sources another script:

# main.R
source("funs.R")
x <- 1

# funs.R
hello <- function() {message("Hi")}

I want to source the script main.R and keep everything in a "local" environment, say env <- new.env(). Normally, one could call source("main.R", local = env) and expect everything to be in the env environment. However, that's not the case here: x is part of env, but the function hello is not! It is in .GlobalEnv.
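The behaviour described above can be reproduced with a small self-contained sketch. It writes the two scripts to a temporary directory and checks where each object ends up. The names demo_dir, x_in_env, hello_in_env and hello_in_global are illustrative only, and chdir = TRUE is an assumption added here so that the relative path "funs.R" resolves next to main.R.

```r
# Write the two scripts from the question to a scratch directory.
demo_dir <- tempfile("source-demo-")  # hypothetical scratch location
dir.create(demo_dir)
writeLines(c('source("funs.R")', "x <- 1"), file.path(demo_dir, "main.R"))
writeLines('hello <- function() {message("Hi")}', file.path(demo_dir, "funs.R"))

env <- new.env()
# chdir = TRUE temporarily switches the working directory to demo_dir,
# so the nested source("funs.R") can find its file.
source(file.path(demo_dir, "main.R"), local = env, chdir = TRUE)

# inherits = FALSE checks only the named environment, not its parents.
x_in_env        <- exists("x", envir = env, inherits = FALSE)
hello_in_env    <- exists("hello", envir = env, inherits = FALSE)
hello_in_global <- exists("hello", envir = globalenv(), inherits = FALSE)
print(c(x_in_env = x_in_env,
        hello_in_env = hello_in_env,
        hello_in_global = hello_in_global))

# Tidy up: remove the leaked function and the scratch files.
if (hello_in_global) rm("hello", envir = globalenv())
unlink(demo_dir, recursive = TRUE)
```

The nested source("funs.R") call does not pass local, so it falls back to the default local = FALSE and evaluates funs.R in .GlobalEnv, regardless of where main.R itself is being evaluated.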

Question: How can I source a script to a separate environment in R, even if that script itself sources other scripts, and without modifying the other scripts being sourced?

Thanks for helping, and let me know if I can clarify anything.

EDIT 1: Updated question to be explicit that the scripts being sourced cannot be modified (assume they are not under your control).


Solution

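  • You can use trace to inject code into functions, so you can force every source call to use local = TRUE. With local = TRUE, source evaluates in the environment it was called from, which for a nested call is the environment the outer script is being sourced into. Here I only override local when it is FALSE, in case a nested call to source deliberately targets another environment because of special logic of its own.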

    env <- new.env()
    
    # use !isTRUE if you want to support older R versions (<3.5.0)
    tracer <- quote(
      if (isFALSE(local)) {
        local <- TRUE
      }
    )
    
    trace(source, tracer, print = FALSE, where = .GlobalEnv)
    
    # if you're doing this inside a function, uncomment the next line
    #on.exit(untrace(source, where = .GlobalEnv))
    
    source("main.R", local = env)
    

    As mentioned in the code, if you wrap this logic in a function, consider using on.exit to make sure you untrace even if there are errors.

    EDIT: as mentioned in the comments, this could cause issues if some of the scripts you load assume there is one (global) environment where everything ends up. I suppose you could change the tracer to something like

    tracer <- quote(
      if (missing(local)) {
        local <- TRUE
      }
    )
    

    or maybe

    tracer <- quote(
      if (isFALSE(local)) {
        # fetch the specific environment you created
        local <- get("env", .GlobalEnv)
      }
    )
    

    The former assumes that if the script didn't specify local at all, it doesn't care which environment ends up holding everything. The latter assumes that source calls that didn't specify local, or set it to FALSE, want everything to end up in one environment, and modifies the logic to use your environment instead of the global one.
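The advice about wrapping this logic in a function with on.exit can be sketched as follows. This is a minimal sketch, not a definitive implementation: source_into is a hypothetical helper name, it reuses the first tracer from the answer, and chdir = TRUE is an assumption added so that relative paths inside the sourced script resolve next to it.

```r
# Hypothetical helper: the name and signature are illustrative only.
source_into <- function(file, env = new.env(parent = globalenv())) {
  # Same tracer as in the answer: only redirect calls that would
  # otherwise evaluate in .GlobalEnv.
  tracer <- quote(
    if (isFALSE(local)) {
      local <- TRUE
    }
  )
  trace(source, tracer, print = FALSE, where = .GlobalEnv)
  # Restore the original source() even if the script throws an error.
  on.exit(untrace(source, where = .GlobalEnv), add = TRUE)

  source(file, local = env, chdir = TRUE)
  invisible(env)
}

# Usage with the two scripts from the question, written to a scratch directory.
demo_dir <- tempfile("source-demo-")  # hypothetical scratch location
dir.create(demo_dir)
writeLines(c('source("funs.R")', "x <- 1"), file.path(demo_dir, "main.R"))
writeLines('hello <- function() {message("Hi")}', file.path(demo_dir, "funs.R"))

env <- source_into(file.path(demo_dir, "main.R"))
print(ls(env))

unlink(demo_dir, recursive = TRUE)
```

Creating the environment with parent = globalenv() keeps the helper's own local variables (such as tracer) out of the sourced scripts' lookup path. Note that the trace is applied to source for the whole session while the helper runs, so any other code calling source during that window is affected as well.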