R: How to get total CPU time with foreach?


I am trying to get the total CPU hours of code run in parallel (using foreach with the doParallel backend), but I'm not sure how to go about it. I have used proc.time(), but it just returns a difference in 'real' time. From what I have read about system.time(), it should do the same as proc.time(). How do I get the total CPU hours of R code run in parallel?


Solution

  • A simple trick is to return the measured runtime together with the computation result in a list. The example below uses system.time(), which reports the same user, system, and elapsed times as proc.time().

    NOTE: This is a modified example from my blog post, R with Parallel Computing from User Perspectives.

    # Toy example: record the runtime of each task inside foreach
    library(foreach)
    library(doParallel)
    
    # Number of physical cores on this machine
    cores <- detectCores(logical = FALSE)
    cl <- makeCluster(cores)
    registerDoParallel(cl)
    
    # The outer system.time() measures the master's wall-clock time for the whole loop
    system.time(
      # Without .combine, foreach returns a flat list with one element per task
      res.gather <- foreach(i = 1:cores) %dopar%
      {
        # The inner system.time() measures this task on its worker
        s.time <- system.time({
          set.seed(i)
          res <- matrix(runif(10^6), nrow = 1000, ncol = 1000)
          res <- exp(sqrt(res) * sqrt(res^3))
        })
        # Return the timing alongside the result
        list(result = res, runtime = s.time)
      }
    )
    
    # Shut down the worker processes
    stopCluster(cl)
    

    The runtime of each task is thus saved in res.gather and is easy to retrieve. Add the per-task runtimes up to get the total time used by the parallel program.

    > res.gather[[1]]$runtime
       user  system elapsed 
       0.42    0.04    0.48 
    > res.gather[[2]]$runtime
       user  system elapsed 
       0.42    0.03    0.47 
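    > res.gather[[1]]$runtime[3] + res.gather[[2]]$runtime[3]
    elapsed 
       0.95 
    

    In total, the two R worker sessions ran for 0.95 sec, not counting the wait time of the R master. Since elapsed is wall-clock time, the CPU time proper is the sum of the user and system columns: (0.42 + 0.04) + (0.42 + 0.03) = 0.91 sec.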
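
To get a figure in CPU hours, the per-task timings can be summed over every element of res.gather rather than indexed one by one. The sketch below is a minimal illustration, assuming each element has the list(result, runtime) shape used in the example. The names total_cpu_time and fake_gather are hypothetical and not part of foreach or doParallel; fake_gather is built with lapply() only so that the sketch runs without a cluster.

```r
# Hypothetical helper: sum the CPU time recorded by system.time()
# across a list of per-task results.
total_cpu_time <- function(results) {
  per_task <- vapply(results, function(r) {
    # Drop the proc_time class to work with a plain named numeric vector
    rt <- unclass(r$runtime)
    # CPU time = user + system, for the process itself and any children.
    # The child entries can be NA on some platforms, so NA is treated as 0.
    sum(rt[c("user.self", "sys.self", "user.child", "sys.child")],
        na.rm = TRUE)
  }, numeric(1))
  total <- sum(per_task)
  c(cpu_seconds = total, cpu_hours = total / 3600)
}

# Stand-in for res.gather with the same list(result, runtime) shape,
# built sequentially so no parallel backend is needed here.
fake_gather <- lapply(1:2, function(i) {
  s.time <- system.time({
    set.seed(i)
    res <- sum(exp(sqrt(runif(10^5))))
  })
  list(result = res, runtime = s.time)
})

cpu <- total_cpu_time(fake_gather)
print(cpu)
```

With the real cluster run, the equivalent call would be total_cpu_time(res.gather). Note that this counts only the work timed inside the loop body, not worker startup or data transfer.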