r performance loops for-loop coding-efficiency

What is the fastest R way to generate random variates of the Bates distribution?


Generating random numbers from the Bates distribution is a simple task. I need the mean of 1 million uniform draws, repeated 10,000 times:

bates1 = replicate(10000, mean(runif(1e+06,1,5)))
summary(bates1)

I waited forever for it to finish. I also tried for loops, to no avail (extremely slow).

Any way out of this?

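I tried the for loop below, with space for the results allocated before the loop:

library(lhs)  # provides randomLHS()

set.seed(999)
y <- numeric(10000)  # preallocate the vector of means
for (i in 1:10000) {
  x <- randomLHS(1e+6, 1)  # 1e6 Latin hypercube samples on (0, 1)
  x <- 1 + 4 * x           # rescale to (1, 5)
  y[i] <- mean(x)
}
summary(y)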


Solution

  • There are lots of ways to do parallel computation in R.

    As an example using the doParallel library on my work machine (a modest Surface Book 2):

    library(doParallel)
    
    registerDoParallel(7)
    
    # original version
    system.time({ replicate(10000, mean(runif(1e+06, 1, 5))) })
       user  system elapsed 
     319.70   20.36  340.39
    
    # parallel version 7 cores
    system.time({ times(10000) %dopar% mean(runif(1e+06, 1, 5)) })
       user  system elapsed 
       6.06    1.14  125.75 
    

    So around 2 minutes (126 s elapsed), as opposed to a bit over 5 and a half minutes (340 s): not exactly "forever", but long enough.
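
    doParallel is only one of the options. Another is the parallel package that ships with R. The following is a minimal sketch, assuming 2 workers and much smaller sizes than in the question so that it finishes quickly; n_rep, n_per and the seed are illustrative choices, not the settings behind the timings above:

    ```r
    library(parallel)  # ships with base R

    n_rep <- 200   # number of means to draw (10000 in the question)
    n_per <- 1e4   # uniforms averaged per mean (1e6 in the question)

    cl <- makeCluster(2)                  # 2 workers: an arbitrary choice
    clusterSetRNGStream(cl, iseed = 999)  # independent, reproducible RNG streams

    # Each task draws n_per Uniform(1, 5) values on a worker and returns their mean.
    bates_par <- parSapply(
      cl, seq_len(n_rep),
      function(i, n_per) mean(runif(n_per, 1, 5)),
      n_per = n_per
    )

    stopCluster(cl)  # always release the workers
    summary(bates_par)
    ```

    Setting n_rep and n_per back to 10000 and 1e6 should give the same kind of speed-up as the %dopar% version, though I have not timed this variant.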

    Some of these other answers may also help.
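
    If exact Bates draws are not strictly required, the central limit theorem offers a shortcut. The mean of n independent Uniform(a, b) values has mean (a + b)/2 and standard deviation (b - a)/sqrt(12 n), and for n = 1e6 its distribution is very close to normal. The sketch below uses that normal approximation, which is a swapped-in technique rather than a true Bates sampler; the variable names are illustrative:

    ```r
    a <- 1; b <- 5   # bounds used in the question
    n <- 1e6         # uniforms averaged per variate

    bates_mean <- (a + b) / 2            # theoretical mean of the average
    bates_sd   <- (b - a) / sqrt(12 * n) # theoretical sd of the average

    set.seed(999)
    # Normal approximation to the Bates distribution (not exact Bates draws).
    bates_approx <- rnorm(10000, mean = bates_mean, sd = bates_sd)
    summary(bates_approx)
    ```

    The same two moments also serve as a sanity check on the simulated values: the summary of the simulated means should centre on bates_mean with a spread of roughly bates_sd.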