r performance replicate

Is there a faster way to write code to draw many random samples from 1:N?


I'm drawing many sorted samples of size k (typically ~5) from 1:35. The code I am using is:

replicate(1000000, sort(sample(1:35, size = 5)) )

But this is rather slow. I assume that since I am using replicate, no parallelisation is happening. Is there a faster way to write this code?


Solution

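  • To the best of my knowledge, replicate is a wrapper for sapply, which is itself a wrapper for lapply, so it runs sequentially. There are plenty of ways to parallelise it. Note that the example below calls sample with replace = TRUE, so, unlike the code in the question, a row can contain repeated values; drop that argument to sample without replacement. A simple option might be: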

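     library(future.apply)
    #> Loading required package: future
     # Evaluate futures in 6 background R sessions
     plan(multisession, workers = 6L)
     i = seq(35L)
     # future_replicate returns one sample per column; t() turns them into rows
     system.time({ X = future_replicate(1e6, sample(i, 5L, replace = TRUE)) |> t() })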
    #>    user  system elapsed 
    #>   3.122   0.141   4.619
     head(X)
    #>      [,1] [,2] [,3] [,4] [,5]
    #> [1,]   26   18   31    9   35
    #> [2,]   25   10   10   24   16
    #> [3,]    9    9   35    4   25
    #> [4,]   27   32   26   15   27
    #> [5,]   13    2    6   26   24
    #> [6,]   26   23   12    7   26
     library(Rfast)
     rowSort(head(X))
    #>      [,1] [,2] [,3] [,4] [,5]
    #> [1,]    9   18   26   31   35
    #> [2,]   10   10   16   24   25
    #> [3,]    4    9    9   25   35
    #> [4,]   15   26   27   27   32
    #> [5,]    2    6   13   24   26
    #> [6,]    7   12   23   26   26
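
If you need sorted samples without replacement, as in the question, a base-R alternative that avoids one function call per sample might look like the sketch below. This is not the approach used above but a swapped-in technique (rejection sampling): draw everything with replacement in one call, sort each sample, and redraw only the samples that contain a repeated value. The helper name sorted_samples and its arguments are made up for illustration, and I have not benchmarked it against the future.apply version, so time it with system.time on your own machine.

```r
# Hypothetical helper (not from any package): n sorted samples of size k
# drawn without replacement from 1:N, returned as an n x k matrix.
sorted_samples <- function(n, N = 35L, k = 5L) {
  stopifnot(k >= 2L, k <= N)

  # Sort every column of a matrix in one go: order by column index first,
  # then by value within the column.
  sort_cols <- function(m) matrix(m[order(col(m), m)], nrow = nrow(m))

  # Draw all values at once, with replacement; one sample per column.
  x <- sort_cols(matrix(sample.int(N, n * k, replace = TRUE), nrow = k))

  repeat {
    # In a sorted column, a repeated value shows up as two equal neighbours.
    dup <- colSums(x[-1L, , drop = FALSE] == x[-k, , drop = FALSE]) > 0L
    if (!any(dup)) break
    # Redraw only the offending columns and sort them again.
    x[, dup] <- sort_cols(
      matrix(sample.int(N, sum(dup) * k, replace = TRUE), nrow = k)
    )
  }

  t(x)  # one sample per row, as in the output above
}

set.seed(42)
s <- sorted_samples(1e5)
stopifnot(all(s[, -1L] > s[, -5L]))  # every row is strictly increasing
```

Because each sample is redrawn independently until its values are all distinct, every 5-element subset of 1:35 should be equally likely, which matches what sort(sample(1:35, 5)) gives.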