Tags: r, windows, parallel-processing, random-forest, partykit

How can the applyfun argument of cforest in the R package partykit be used for parallel processing on multiple cores on Windows?


The cforest function in the R partykit package has an argument applyfun, which the docs describe as "an optional lapply-style function with arguments function(X, FUN, ...)". I understand that mclapply does not run in parallel on Windows, and that parLapply, which has the form parLapply(cl = NULL, X, fun, ...), is the 'drop-in' parallel alternative to lapply there, so it seems to fit what the cforest documentation asks for.

Some rudimentary code to try this out:

library(partykit)
library(parallel)

nCores <- detectCores()
clust <- makeCluster(nCores)

data(iris)
rf_model = cforest(Species~., data=iris, applyfun=parLapply(clust))

gives

Error in cforest(Species ~ ., data = iris, applyfun = parLapply(clust)) : unused argument (applyfun = parLapply(clust))

How can the parallel functions be used correctly as input arguments within the cforest function on Windows?


Solution

  • Does the following work for you?

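    library(partykit)
    library(parallel)

    nCores <- detectCores()
    clust <- makeCluster(nCores)

    # Wrap parLapply so the cluster is bound and the signature matches
    # the lapply-style function(X, FUN, ...) that applyfun expects
    parLapplyClust <- function(X, FUN, ...) {
      parLapply(clust, X, FUN, ...)
    }

    data(iris)
    rf_model <- cforest(Species ~ ., data = iris, applyfun = parLapplyClust)

    # Release the worker processes once the forest is fitted
    stopCluster(clust)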
    

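    I think you need to pass a function to applyfun, not the result of a function call. parLapply(clust) is a call (and one that is missing its X and fun arguments), whereas parLapplyClust is itself a function with the lapply-style arguments (X, FUN, ...).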
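
To check the wrapper idea independently of cforest, here is a minimal sketch that uses only the parallel package. The helper name make_applyfun is hypothetical (it is not part of partykit or parallel), and the two-worker cluster is an arbitrary choice to keep the example light. It builds an lapply-style function bound to a cluster and compares its output with plain lapply:

```r
library(parallel)

# Hypothetical helper (not part of partykit or parallel): binds a cluster
# into an lapply-style function with the signature function(X, FUN, ...).
make_applyfun <- function(cl) {
  function(X, FUN, ...) parLapply(cl, X, FUN, ...)
}

clust <- makeCluster(2)  # two workers, an arbitrary choice for this sketch
applyfun <- make_applyfun(clust)

# Same call shape as lapply(), including extra arguments passed through ...
par_result <- applyfun(1:4, function(x, power) x^power, power = 2)
seq_result <- lapply(1:4, function(x, power) x^power, power = 2)

# Release the worker processes
stopCluster(clust)

# Compare the parallel and sequential results
identical(par_result, seq_result)
```

A function built this way can be passed as applyfun = applyfun, as long as the cluster is still running while cforest fits. If the "unused argument" error persists even with a proper function, one possibility worth checking is that a different cforest (for example the one from the party package) is being called; writing partykit::cforest(...) explicitly would rule that out.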