r parallel-processing multicore torque mlr

R package mlr exhausts memory with multicore


I am trying to run a reproducible example with the mlr R package in parallel, and the solution I found is to use parallelStartMulticore. The project also uses packrat.

The code runs properly on workstations and small servers, but on an HPC cluster with the Torque batch system it exhausts the available memory. R threads seem to be spawned ad infinitum, which does not happen on regular Linux machines. I have tried switching to parallelStartSocket, which works fine, but then I cannot reproduce the results with RNG seeds.

Here is a minimal example:

library(mlr)
library(parallelMap)
M <- data.frame(x = runif(1e2), y = as.factor(rnorm(1e2) > 0))

# Example with random forest 
parallelStartMulticore(parallel::detectCores())
plyr::l_ply(
  seq(100), 
  function(x) {
    message("Iteration number: ", x)

    set.seed(1, "L'Ecuyer")
    tsk <- makeClassifTask(data = M, target = "y")

    num_ps <- makeParamSet(
      makeIntegerParam("ntree", lower = 10, upper = 50), 
      makeIntegerParam("nodesize", lower = 1, upper = 5)
    )
    ctrl <- makeTuneControlGrid(resolution = 2L, tune.threshold = TRUE)

    # define learner
    lrn <- makeLearner("classif.randomForest", predict.type = "prob")
    rdesc <- makeResampleDesc("CV", iters = 2L, stratify = TRUE)

    # Grid search in parallel
    res <- tuneParams(
      lrn, task = tsk, resampling = rdesc, par.set = num_ps, 
      measures = list(auc), control = ctrl)

    # Fit optimal params
    lrn.optim <- setHyperPars(lrn, par.vals = res$x)
    m <- train(lrn.optim, tsk)

    # Test set
    pred_rf <- predict(m, newdata = M)

    pred_rf
  }
)
parallelStop()

The HPC hardware is an HP Apollo 6000 System with ProLiant XL230a Gen9 server blades (64-bit) and Intel Xeon E5-2683 processors. I do not know whether the issue comes from the Torque batch system, the hardware, or a flaw in the code above. The sessionInfo() of the HPC:

R version 3.4.0 (2017-04-21)                                                                                                                                                       
Platform: x86_64-pc-linux-gnu (64-bit)                                                                                                                                             
Running under: CentOS Linux 7 (Core)                                                                                                                                               

Matrix products: default                                                                                                                                                           
BLAS/LAPACK: /cm/shared/apps/intel/parallel_studio_xe/2017/compilers_and_libraries_2017.0.098/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so                                          

locale:                                                                                                                                                                            
[1] C                                                                                                                                                                              

attached base packages:                                                                                                                                                            
[1] stats     graphics  grDevices utils     datasets  methods   base                                                                                                               

other attached packages:                                                                                                                                                           
[1] parallelMap_1.3   mlr_2.11          ParamHelpers_1.10 RLinuxModules_0.2

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14        splines_3.4.0       munsell_0.4.3      
 [4] colorspace_1.3-2    lattice_0.20-35     rlang_0.1.1        
 [7] plyr_1.8.4          tools_3.4.0         parallel_3.4.0     
[10] grid_3.4.0          packrat_0.4.8-1     checkmate_1.8.2    
[13] data.table_1.10.4   gtable_0.2.0        randomForest_4.6-12
[16] survival_2.41-3     lazyeval_0.2.0      tibble_1.3.1       
[19] Matrix_1.2-12       ggplot2_2.2.1       stringi_1.1.5      
[22] compiler_3.4.0      BBmisc_1.11         scales_0.4.1       
[25] backports_1.0.5  

Solution

  • The "multicore" parallelMap backend uses parallel::mcmapply, which should fork() a new child process for every evaluation inside tuneParams and kill that process as soon as the evaluation finishes. Depending on the tool you use to count memory usage and active processes, memory may be misreported, and child processes that are already dead (having lived for only a fraction of a second) may still be listed. It is also possible that, for some reason, finished processes are not being killed.
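
    To see the fork-per-evaluation behaviour in isolation, here is a minimal sketch using only the base parallel package (no mlr or parallelMap). The helper name report_pid is made up for illustration, and mc.preschedule = FALSE is my assumption to mimic "one child per evaluation"; I have not checked that parallelMap calls mcmapply with exactly these settings. It only works on Unix-alikes, since forking is not available on Windows.

```r
library(parallel)

# Hypothetical helper: returns the PID of the process that evaluated it.
report_pid <- function(i) Sys.getpid()

# PID of the master R session, for comparison.
parent_pid <- Sys.getpid()

# With mc.preschedule = FALSE, one child is forked per element of the input,
# with at most mc.cores children alive at the same time.
child_pids <- mcmapply(
  report_pid, seq_len(4),
  mc.cores = 2L, mc.preschedule = FALSE
)

message("Parent PID: ", parent_pid)
message("Child PIDs: ", paste(child_pids, collapse = ", "))
```

    Every reported PID should differ from the parent's, and by the time mcmapply returns those children should be gone. If a process monitor on the cluster still lists them long afterwards, that points at the cleanup of finished children rather than at mlr itself.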

    Possible problems: