
doMC: Only together with foreach?


I want to run an R script that uses recursive feature elimination (rfe) from the caret package on a computer cluster. Ideally it would run on multiple cores in parallel. In a coworker's script I found the doMC package being used. I read that this package is used together with the foreach package, but the script I got only loads the library and calls registerDoMC(5) on the line before the rfe command. There is not a single use of foreach in the whole script.

Will doMC do anything here, or does it only work together with foreach?

Is there a way to distribute the resource consuming rfe process on multiple cores?


Solution

  • Read the documentation:

    rfe can be used with "explicit parallelism", where different resamples (e.g. cross-validation group) can be split up and run on multiple machines or processors. By default, rfe will use a single processor on the host machine. As of version 4.99 of this package, the framework used for parallel processing uses the foreach package. To run the resamples in parallel, the code for rfe does not change; prior to the call to rfe, a parallel backend is registered with foreach (see the examples below).

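    So caret::rfe uses foreach internally. The registerDoMC(5) call before rfe therefore does have an effect: it registers a parallel backend with 5 workers, and rfe runs its resamples on that backend without any explicit foreach call in your script.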
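
To make the quoted documentation concrete, here is a minimal sketch of registering doMC before calling rfe. The toy data, the choice of lmFuncs, the subset sizes, and the 5-fold cross-validation are assumptions made for illustration; they are not taken from the coworker's script. doMC relies on forking, so this is meant for Unix-like systems.

```r
library(caret)
library(foreach)  # provides getDoParName() / getDoParWorkers()
library(doMC)     # multicore backend for foreach

# Register the backend once, before rfe is called.
registerDoMC(cores = 5)

# Hypothetical toy data: 10 predictors, outcome driven by the first two.
set.seed(1)
x <- as.data.frame(matrix(rnorm(200 * 10), ncol = 10))
y <- x$V1 + 0.5 * x$V2 + rnorm(200)

# allowParallel = TRUE is the default; it is spelled out here for clarity.
ctrl <- rfeControl(functions = lmFuncs,
                   method = "cv",
                   number = 5,
                   allowParallel = TRUE)

# The rfe call itself does not change; the resamples are handed to the
# registered backend through foreach inside caret.
fit <- rfe(x, y, sizes = c(2, 4, 6), rfeControl = ctrl)

getDoParName()     # name of the currently registered backend
getDoParWorkers()  # number of workers foreach will use
predictors(fit)    # variables kept in the selected subset
```

If getDoParWorkers() returns 1, no parallel backend is registered and rfe will run the resamples sequentially.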