r load workspace mclapply

Load different workspaces with the same variable names without overwriting existing objects


I have a pipeline that needs to load several .RData files. The files all contain the same variable names (say, ls() returns c("df1", "df2")), and since they are big, I decided to use mclapply(c("a.RData", "b.RData", "c.RData"), load, .GlobalEnv, mc.cores = parallel::detectCores()) to save time. However, because the names are identical, df1 and df2 get overwritten. Is there a way to solve this?

I was thinking:

  1. Can I change the variable names before loading the files into R? The .RData files come from other people's pipelines, so I can't ask them to change the variable names now. Is there a way to rename the variables outside R, or before loading?

  2. If the first option is impossible, how can I write code that detects when a variable is about to be overwritten because of a duplicated name and automatically renames it?


Solution

  • Load each file into a separate environment instead of the global environment, using the envir argument of load():

    # some script that outputs RData with x object
    x <- head(mtcars)
    save.image("temp.RData")
    
    # another script with different x value
    x <- 1:3
    
    # now load our RData into new separate environment
    e1 <- new.env(parent = baseenv())
    load("temp.RData", envir = e1)
    
    x
    # [1] 1 2 3
    
    e1$x
    #                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
    # Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    # Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    # Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    # Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    # Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    # Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
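
The same idea can be extended to several files at once. The following is only a minimal sketch, assuming each file holds objects named df1 and df2 as in the question. The helper load_as_list() and the demo files written to tempdir() are made up for illustration. Each file is loaded into its own fresh environment and returned as a named list, so nothing in the global environment is touched:

```r
# Hypothetical helper: load one .RData file into a fresh environment
# and return its contents as a named list.
load_as_list <- function(path) {
  e <- new.env(parent = emptyenv())
  load(path, envir = e)
  as.list(e, all.names = TRUE)
}

# Demo data (assumption): two files that both contain df1 and df2.
files <- file.path(tempdir(), c("a.RData", "b.RData"))
df1 <- head(mtcars, 2); df2 <- head(iris, 2)
save(df1, df2, file = files[1])
df1 <- tail(mtcars, 2); df2 <- tail(iris, 2)
save(df1, df2, file = files[2])
rm(df1, df2)

# One list element per file, named after the file.
# parallel::mclapply(files, load_as_list, mc.cores = ...) could be
# swapped in for lapply() on Unix-alikes.
workspaces <- lapply(files, load_as_list)
names(workspaces) <- tools::file_path_sans_ext(basename(files))

workspaces$a$df1   # df1 from a.RData
workspaces$b$df1   # df1 from b.RData, kept separately
```

With mclapply, each child process has to serialize its result back to the parent, so the speed-up for very large objects may be smaller than expected. If flat names in the global environment are preferred, something like list2env(setNames(workspaces$a, paste0("a_", names(workspaces$a))), envir = globalenv()) should copy the objects over as a_df1 and a_df2.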