I wanted to load multiple RData files in one command, as explained by Johua, using:
> lapply(c(a_data, b_data, c_data, d_data), load, .GlobalEnv)
[[1]]
[1] "nRTC_Data"
[[2]]
[1] "RTA_Data"
[[3]]
[1] "RTC_Data"
[[4]]
[1] "RTA_Data"
> rm(a_data, b_data, c_data, d_data); ls()
[1] "nRTC_Data" "RTA_Data" "RTAC_data" "RTC_Data"
However, since my RData files are big and I found no time improvement between lapply() and multiple load() calls, I decided to use a multi-core approach like the following:
library(parallel)
mclapply(c(a_data, b_data, c_data, d_data),load,.GlobalEnv, mc.cores = parallel::detectCores())
This significantly improved the loading time and also returns the list:
[[1]]
[1] "nRTC_Data"
[[2]]
[1] "RTA_Data"
[[3]]
[1] "RTC_Data"
[[4]]
[1] "RTA_Data"
However, nothing is found in my workspace:
> rm(a_data, b_data, c_data, d_data); ls()
character(0)
I also tried replacing .GlobalEnv with environment(), but it still didn't work.
Does anyone have a clue?
FYI, you can reproduce this with the following commands:
> a = "aa";save(a, file = "aa.RData")
> b = "bb";save(b, file = "bb.RData")
> c = "cc";save(c, file = "cc.RData")
> d = "dd";save(d, file = "dd.RData")
> # lapply approach
> rm(list = ls())
> a = "aa.RData"; b = "bb.RData"; c = "cc.RData"; d = "dd.RData"
> lapply(c(a, b, c, d), load, .GlobalEnv); rm(a, b, c, d)
> # mclapply approach
> rm(list = ls())
> a = "aa.RData"; b = "bb.RData"; c = "cc.RData"; d = "dd.RData"
> mclapply(c(a, b, c, d), load, .GlobalEnv, mc.cores = parallel::detectCores()); rm(a, b, c, d)
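I think it's because mclapply forks separate child processes. Each child loads the file into its own copy of the workspace, and only the function's return value (here, the object names returned by load) is sent back to the parent, so your workspace stays empty. The workaround is to call mclapply with a function, myload, that loads the RData file and returns the loaded object. The difference from your lapply version is that the data ends up in the list returned by mclapply rather than in the global environment.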
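To see the forking effect on its own, separate from the myload workaround that follows, here is a minimal sketch. It assumes a Unix-alike system (mclapply cannot fork on Windows), and the names demo_env, set_flag, and flag are made up for illustration only.

```r
library(parallel)

# Hypothetical scratch environment standing in for the workspace
demo_env <- new.env()

# Assign into demo_env and report whether the assignment is visible
# from inside the process that made it
set_flag <- function(i) {
  assign("flag", i, envir = demo_env)
  exists("flag", envir = demo_env, inherits = FALSE)
}

# Forked: each child modifies its own copy of demo_env
# (mc.cores = 2L is an assumption; it requires a Unix-alike OS)
forked <- mclapply(1:2, set_flag, mc.cores = 2L)
in_parent_after_fork <- exists("flag", envir = demo_env, inherits = FALSE)

# Serial: lapply runs in the parent, so the assignment persists
serial <- lapply(1:2, set_flag)
in_parent_after_serial <- exists("flag", envir = demo_env, inherits = FALSE)
```

Each child sees its own assignment, but in_parent_after_fork should be FALSE while in_parent_after_serial is TRUE, which matches the empty ls() in the question. The myload workaround then looks like this: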
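library(parallel)

# Load one .RData file in the worker and return the loaded object itself,
# since objects assigned inside a forked child never reach the parent.
myload <- function(x){
  x <- load(x)   # load() returns the names of the objects it restored
  get(x)         # assumes the file holds exactly one object
}
a = "aa.RData"; b = "bb.RData"; c = "cc.RData"; d = "dd.RData"
res <- mclapply(c(a, b, c, d), myload, mc.cores = parallel::detectCores())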
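If you still want the objects in the workspace under their original names, one option is to return a named list from each worker and copy it into .GlobalEnv in the parent. This is only a sketch: load_as_list is a hypothetical helper, the example files are created in a temporary directory, and mc.cores = 2L is an arbitrary choice that falls back to 1 on Windows.

```r
library(parallel)

# Create four small example files in a temporary directory
tmp <- tempfile("rdata_demo"); dir.create(tmp)
vals <- c(a = "aa", b = "bb", c = "cc", d = "dd")
files <- file.path(tmp, paste0(vals, ".RData"))
for (i in seq_along(vals)) {
  e <- new.env()
  assign(names(vals)[i], vals[[i]], envir = e)
  save(list = names(vals)[i], file = files[i], envir = e)
}

# Hypothetical helper: load a file into a private environment and
# return everything it contained as a named list
load_as_list <- function(path) {
  env <- new.env()
  load(path, envir = env)
  as.list(env)
}

# Forking is unavailable on Windows, so use a single core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- mclapply(files, load_as_list, mc.cores = n_cores)

# Flatten the per-file lists and copy the objects into the workspace
loaded <- unlist(res, recursive = FALSE)
list2env(loaded, envir = .GlobalEnv)
```

Unlike myload, this also copes with files that hold more than one object. Note that the results still have to be serialized back from the children to the parent, so for very large objects part of the speed-up may be lost.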