I have a vector V containing 36 elements, 18 are "0" and 18 are "1". I would like to compute N random (not the first N) permutation of this vector.
I could do as follow:
library(combinat)
N <- 100 # or 200, 300, 500... max 1000
V <- c(rep(0, 18), rep(1, 18))
n <- factorial(36) # total number of unique possible permutations
p <- unique(permn(V))[sample(1:n, N)]
But I quickly run into the combinatorial explosion problem, as
sample(1:n, N)
returns Error in 1:n : result would be too long a vector
and
permn(V)
returns Error in vector("list", gamma(n + 1)) : vector size specified is too large
Is there another (better) way to do this?
For quickly generating many permutations, the Rfast
package has colShuffle
that will permute each column of a matrix. (It also has rowShuffle
for row-wise permutations, but that function currently works properly only for square matrices.)
For 10 samples of the OP's vector:
n <- 10 # number of samples
colShuffle(matrix(rep(0:1, 18), 36, n))
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 0 0 0 1 1 0 1 0 1 0
#> [2,] 1 0 1 1 1 0 0 0 1 0
#> [3,] 0 1 0 1 1 0 1 1 1 1
#> [4,] 1 0 1 0 0 0 1 1 1 1
#> [5,] 0 1 0 1 1 0 0 0 0 1
#> [6,] 0 0 1 1 0 1 1 1 1 1
#> [7,] 1 0 0 0 1 1 0 1 0 0
#> [8,] 0 1 1 1 1 1 0 1 1 0
#> [9,] 1 1 0 0 0 1 1 0 1 0
#> [10,] 1 0 0 0 0 0 1 0 0 0
#> [11,] 0 1 1 1 0 1 0 1 1 1
#> [12,] 1 1 0 0 0 0 1 1 1 1
#> [13,] 1 1 1 0 0 0 1 0 1 1
#> [14,] 1 1 0 1 0 1 0 0 1 1
#> [15,] 0 0 1 1 1 0 0 0 0 1
#> [16,] 1 0 0 0 1 1 0 0 0 0
#> [17,] 0 0 0 0 0 1 0 0 0 1
#> [18,] 1 1 0 1 0 1 1 0 0 1
#> [19,] 0 0 1 1 1 0 1 1 0 0
#> [20,] 0 1 0 1 0 1 0 1 0 0
#> [21,] 0 1 0 0 0 0 0 1 1 1
#> [22,] 1 0 1 1 0 1 1 0 0 1
#> [23,] 0 0 1 0 1 1 1 1 0 0
#> [24,] 0 1 1 0 0 1 0 0 0 1
#> [25,] 0 1 0 0 1 1 1 0 0 0
#> [26,] 1 0 0 0 1 0 1 0 1 1
#> [27,] 1 0 1 0 1 0 1 1 1 0
#> [28,] 1 1 1 1 0 0 1 0 0 1
#> [29,] 1 1 1 1 0 1 0 1 0 0
#> [30,] 0 0 1 0 0 0 1 1 0 1
#> [31,] 0 1 0 1 1 1 1 0 1 0
#> [32,] 1 1 0 1 1 1 0 1 0 1
#> [33,] 1 0 0 0 1 0 0 1 1 0
#> [34,] 1 1 1 0 0 0 0 1 1 0
#> [35,] 0 0 1 0 1 0 0 0 0 0
#> [36,] 0 0 1 1 1 1 0 1 1 0
Benchmarking:
n <- 1e3
microbenchmark::microbenchmark(
RfastCol = colShuffle(matrix(rep(0:1, 18), 36, n)),
RcppAlgos = permuteSample(0:1, freqs = c(18, 18), n = n),
arrangements = permutations(0:1, freq = c(18, 18), nsample = n)
)
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> RfastCol 577.8 708.30 852.679 753.80 800.30 9889.8 100
#> RcppAlgos 15150.7 15297.70 15484.137 15383.30 15590.70 17604.2 100
#> arrangements 89671.4 90162.85 91229.152 90733.90 91576.95 102990.4 100
NB1: RcppAlgos::permuteSample
samples the permutations without replacement.
NB2: Rfast::colShuffle
does not respect the global RNG (see comments).