Tags: r, foreach, parallel-processing, domc

Should I import the "parallel" package in R when using foreach?


I am using the foreach() function from the foreach R package for parallel computing. Besides that function, I think I also need the registerDoMC() function from the doMC package.

However, my DESCRIPTION file lists doMC (>= 1.3.0) and foreach (>= 1.4.1) in the Imports field, and when I run my code I get an error saying that the iter function cannot be found. So I also imported the iterators package.

There still seems to be a problem: as far as I can tell, foreach() relies on mclapply(), and this function exists in both the parallel and the multicore packages. I added both packages to the Imports field, but when I run search(), these warnings show up:

Warning messages:
1: replacing previous import ‘mclapply’ when loading ‘parallel’ 
2: replacing previous import ‘mcparallel’ when loading ‘parallel’ 
3: replacing previous import ‘pvec’ when loading ‘parallel’ 

This is pretty weird: even though I explicitly import both iterators and multicore, I still cannot use their functions after loading my own package. Instead, I have to explicitly run:

library(iterators)
library(multicore)

before I can use the function in my package that relies on parallel computing. Is there something wrong with how I wrote my package? Thank you so much!


Solution

  • If you add doMC to the "Depends" field of your DESCRIPTION file, the "cannot find the iter function" error should go away, and functions from foreach, iterators and doMC will be available whenever your package is loaded (doMC itself depends on foreach and iterators, so they get attached along with it), which seems to be what you want. The first chapter of Writing R Extensions discusses the differences between "Imports" and "Depends". Generally, "Imports" is preferable, because it avoids attaching packages to the user's search path when they are only needed inside your package, but "Depends" has its uses.

    Actually, the "cannot find the iter function" error that you saw is caused by a bug in the doMC package, and using "Depends" rather than "Imports" works around this bug. Your package should only have to import the packages that it uses directly, so unless you call iter or mclapply yourself, you shouldn't have to import iterators, parallel, or multicore. And since parallel has subsumed multicore, you should never import both parallel and multicore; importing at most one of them should avoid the "replacing previous import" warnings that you saw.

    I submitted a fix for the doMC bug to the package maintainer, so with the next version of doMC you should be able to import foreach and doMC into your own package without this error.
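
    To make the "import only what you use directly" advice concrete, here is a rough sketch of what the relevant package metadata might look like once a fixed doMC is available. The version constraints are simply copied from the question; the version of doMC that contains the fix is not stated here, so the doMC constraint would need to be raised to whatever that release turns out to be. The NAMESPACE lines assume your code calls only foreach(), %dopar% and registerDoMC().

    ```
    # --- DESCRIPTION (relevant field only) ---
    Imports:
        doMC (>= 1.3.0),
        foreach (>= 1.4.1)

    # --- NAMESPACE ---
    importFrom(foreach, foreach, "%dopar%")
    importFrom(doMC, registerDoMC)
    ```

    Note that iterators, parallel and multicore do not appear anywhere. Until the fix is released, the workaround described above is to move doMC from "Imports" to "Depends" instead.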