library(parallel)
m <- matrix(1:1200000, nrow = 300000)  # 300000 x 4
p <- matrix(21:32, nrow = 3)           # 3 x 4, same number of columns as m
# Use all pairings of i and j
i_vec <- rep(seq_len(ncol(m)), times = ncol(m))
j_vec <- rep(seq_len(ncol(m)), each = ncol(m))
system.time(mcmapply(i_vec, j_vec,
  FUN = function(i, j) {
    if (i <= j) return(0)
    # as.numeric() avoids integer overflow in the column sums
    sqrt(sum(as.numeric(m[, i])) * sum(as.numeric(m[, j])) *
         sum(as.numeric(p[, i])) * sum(as.numeric(p[, j])))
  }, mc.cores = 7))
system.time(mapply(i_vec, j_vec,
  FUN = function(i, j) {
    if (i <= j) return(0)
    sqrt(sum(as.numeric(m[, i])) * sum(as.numeric(m[, j])) *
         sum(as.numeric(p[, i])) * sum(as.numeric(p[, j])))
  }))
Running this calculation with seven cores in mcmapply yields
   user  system elapsed
  0.014   0.485   0.019
and with 1 core in mapply gives
   user  system elapsed
  0.008   0.000   0.008
and specifying 1 core for mcmapply gives
   user  system elapsed
  0.007   0.000   0.007
I can't figure out why the multi-core run is slower than the single-core one. Is it because the calculation is not computationally expensive enough?
When you parallelise code, there is always some overhead: forking the worker processes, shipping data to them, and collecting the results. With the very simple workload here, that overhead is larger than the work itself, so the parallel version comes out slower. If each call does a non-trivial amount of work, e.g. Sys.sleep(0.1), you should see the expected speedup from multi-core computation.
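For example, the following minimal sketch (the heavy() helper, the 28-element index vector, and the per-call Sys.sleep(0.1) are illustrative choices, not part of the original code) should make the effect visible on a Unix-alike system, where mcmapply can fork worker processes:

library(parallel)

# Each call now does 0.1 s of "work", so the work per call dwarfs the
# forking and result-collection overhead.
heavy <- function(i, j) { Sys.sleep(0.1); i + j }
idx <- 1:28

system.time(mapply(heavy, idx, idx))                  # roughly 2.8 s elapsed (28 * 0.1 s)
system.time(mcmapply(heavy, idx, idx, mc.cores = 7))  # roughly 0.4 s elapsed (28 / 7 * 0.1 s)

As the per-call work shrinks back towards what the original example does, the fixed overhead dominates again and mcmapply falls behind mapply.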