r foreach domc

R foreach loop that contains a for loop: weird behavior


I am trying to build a parallel foreach loop using doMC, but I am seeing some odd behavior. The code looks like this:

for (file in files) {
  # do stuff
  for (extra in extras) {
    # do some heavy stuff
  }
}

I want to parallelize the outer loop but not the inner one. Does anyone know what's going on? I have used foreach and doMC in the past and never had this issue before.


Solution

  • It looks like you have a few things going on, but there is not enough here to be sure:

    If you are running this from RStudio, it may not work well. That is a stated limitation of doMC: its fork-based workers are not meant to be used from a GUI environment. Try running it from a plain 64-bit R session in a terminal.

    You need to load the package with library(doMC) or require(doMC), and you also need to register it as the parallel backend. Without a registered backend, %dopar% runs sequentially and issues a warning.

    registerDoMC(4) 
    

    The 4 tells it how many cores (worker processes) to use. If you pass nothing, it uses the value of options("cores") if that is set, and otherwise TRIES to use half of the cores it detects.
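
    To check what actually got registered, something like the sketch below may help. It assumes a Unix-like system (doMC relies on fork) with foreach and doMC installed; the worker count of 2 is just an example value.

    ```r
    library(foreach)
    library(doMC)

    # Register the doMC backend with 2 workers (2 is an arbitrary example value).
    registerDoMC(2)

    # Ask foreach which backend is registered and how many workers it will use.
    backend_name <- getDoParName()
    n_workers    <- getDoParWorkers()

    print(backend_name)
    print(n_workers)
    ```

    If the worker count printed here is not what you expect, the problem is in the registration step rather than in the loop itself.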

    Also, the code above is not complete. foreach takes its loop variable as a named argument (file = files, not file in files), so the appropriate format is:

    foreach(file = files) %dopar% { stuff to do }

    You must explicitly ask for parallel execution with the %dopar% operator. A plain for loop inside the %dopar% body runs sequentially within each worker, so only the outer loop is parallelized. If you want to use all cores in one part of your code and fewer in another, you need to set options that say how many cores each part may use. But if you tell an outer loop to use 4 cores and an inner loop to use 2, it may be slower than giving the outer loop 4 and letting it manage things itself. I am not 100% clear on how it handles the hand-offs, so experiment to see.
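
    Putting that together, a minimal sketch of a parallel outer loop with a sequential inner loop might look like this. The file names, the extras vector, and the arithmetic standing in for the "heavy stuff" are all made up for illustration; only foreach(), %dopar% and registerDoMC() come from the packages.

    ```r
    library(foreach)
    library(doMC)

    registerDoMC(2)                          # example worker count

    files  <- c("a.csv", "b.csv", "c.csv")   # hypothetical inputs
    extras <- 1:3                            # hypothetical inner-loop values

    # Outer loop: each file is handed to a forked worker.
    results <- foreach(file = files) %dopar% {
      total <- 0
      # Inner loop: an ordinary for loop, run sequentially inside the worker.
      for (extra in extras) {
        total <- total + nchar(file) * extra   # stand-in for the heavy work
      }
      # The last expression is what foreach collects for this iteration.
      list(file = file, total = total, pid = Sys.getpid())
    }

    str(results)
    ```

    One possible source of surprises (a guess, since the question does not say what goes wrong): variables assigned inside the %dopar% body live in the forked worker and do not show up in the main session, so anything you need has to be returned from the body as above.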

    To change the number of cores without passing a count to registerDoMC(), just add this line before registering:

    options(cores=2)
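
    As a rough sketch of how that option is picked up, assuming a Unix-like system and that registerDoMC() is called without a core count (per its help page, that is when options("cores") is consulted):

    ```r
    library(foreach)
    library(doMC)

    # Set the option first; 2 is an arbitrary example value.
    options(cores = 2)

    # No explicit core count here, so doMC falls back to options("cores").
    registerDoMC()

    n_workers <- getDoParWorkers()
    print(n_workers)
    ```

    If you already passed a number to registerDoMC(), that number takes precedence, so re-register with the new count instead.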

    I hope this helps!