It is difficult to debug errors inside mclapply: when a single iteration fails, all values of the job scheduled on the same core are affected.
I prepared a simple example:
library(parallel)
library(dplyr)
data(iris)
## Parallel Version
parFun <- function(i){
  print(i)
  ## Generate a random subset of the iris data set
  daf <- iris[sample(1:nrow(iris), 10), ]
  ## Simulated bug in iteration 39: some internal function returned NULL
  if(i == 39){
    daf <- NULL
  }
  ## dplyr throws an error on NULL input; this needs an if test for NULL
  ## (group by the Species column, not by the string "Species")
  res <- daf %>% group_by(Species) %>% slice_min(order_by = Petal.Width, n = 2)
  return(res)
}
## Run the call, which triggers the following warning:
## Scheduled core 3 encountered error in user code, all values of the job will be affected
resList <- mclapply(1:50,parFun,mc.cores=12)
idx <- sapply(resList,function(x){is.null(nrow(x))})
## Depending on the number of cores, a whole sequence of iterations is affected
which(idx == TRUE)
How can I debug such code when it runs for several thousand iterations? How can I find the single i that causes the error?
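In your call to mclapply, wrap parFun in a tryCatch block. The error is then caught inside the iteration that raised it, so only that element of the result holds an error object and the other iterations on the same core keep their values. Code below: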
resList <- mclapply(
  1:50,
  function(iter) tryCatch(
    parFun(iter),
    error = function(e) e
  ),
  mc.cores = parallel::detectCores()
)
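
Once each iteration returns either its result or a condition object, the failing i can be picked out by class. Here is a minimal sketch of that step. It assumes a hypothetical toyFun that stands in for parFun (so it runs without dplyr) and an arbitrary choice of two cores:

```r
library(parallel)

## Hypothetical stand-in for parFun: fails only for i == 39
toyFun <- function(i) {
  if (i == 39) stop("input was NULL")
  i^2
}

## mclapply relies on forking; fall back to one core on Windows
nCores <- if (.Platform$OS.type == "windows") 1L else 2L

resList <- mclapply(
  1:50,
  function(iter) tryCatch(toyFun(iter), error = function(e) e),
  mc.cores = nCores
)

## Elements that are error conditions mark the failed iterations
failed <- which(vapply(resList, inherits, logical(1), what = "error"))
failed

## Error messages of the failed iterations
vapply(resList[failed], conditionMessage, character(1))
```

Here failed contains only 39, and the message tells you what went wrong. You can then call the function directly for that i (for example under debug()) in a normal, non-parallel session.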