I have a pipeline that needs to load several .RData files. These files all contain the same variable names (say, ls() returns c("df1", "df2")), and since the files are big, I decided to use mclapply(c("a.RData", "b.RData", "c.RData"), load, .GlobalEnv, mc.cores = parallel::detectCores()) to save time. However, because the names are identical, df1 and df2 will be overwritten each time another file is loaded. Is there a way to solve this?
I was thinking:
1. Can I change the variable names before loading the files into R? The .RData files come from other people's pipelines, so I can't ask them to change the names now. Is there a way to rename the variables in an .RData file outside R, or before loading it?
2. If the first option is impossible, how can I write code that detects when a variable is about to be overwritten because of a duplicated name, and renames it automatically?
Load into a separate environment:
# some script that outputs RData with x object
x <- head(mtcars)
save.image("temp.RData")
# another script with different x value
x <- 1:3
# now load our RData into new separate environment
e1 <- new.env(parent = baseenv())
load("temp.RData", envir = e1)
x
# [1] 1 2 3
e1$x
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
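
The same idea can probably be extended to several files loaded in parallel. The following is only a sketch: load_to_list is a hypothetical helper name, and the two temporary files stand in for the real a.RData, b.RData, and so on. Because mclapply forks worker processes, anything a worker loads into .GlobalEnv stays in that worker, so each worker has to return its objects to the parent session instead.

```r
# Hypothetical helper: load one .RData file into its own environment
# and return the objects it contained as a named list.
load_to_list <- function(path) {
  e <- new.env(parent = emptyenv())
  load(path, envir = e)
  as.list(e)
}

# Two throwaway files that both contain objects named df1 and df2
# (stand-ins for the real files from the other pipelines).
tmp <- tempfile(c("a_", "b_"), fileext = ".RData")
df1 <- head(mtcars); df2 <- head(iris)
save(df1, df2, file = tmp[1])
df1 <- tail(mtcars); df2 <- tail(iris)
save(df1, df2, file = tmp[2])
rm(df1, df2)

# Forking is not available on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
loaded <- parallel::mclapply(tmp, load_to_list, mc.cores = cores)
names(loaded) <- c("a", "b")  # one label per file (assumed labels)

loaded$a$df1  # df1 from the first file
loaded$b$df1  # df1 from the second file

# Optional: flatten into uniquely named elements such as a.df1, b.df2.
flat <- unlist(loaded, recursive = FALSE)
names(flat)
```

Nothing is overwritten because every file ends up in its own list element, and the file label acts as the prefix that would otherwise have to be added by renaming. Note that the loaded objects are serialized back from the workers to the parent, so for very large files the speed-up over a plain lapply may be smaller than expected.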