The classic Fisher Yates looks something like this:
void shuffle1(std::vector<int>& vec)
{
int n = vec.size();
for (int i = n - 1; i > 0; --i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Yesterday, I implemented the iteration "backwards" by mistake:
void shuffle2(std::vector<int>& vec)
{
int n = vec.size();
for (int i = 1; i < n; ++i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Is this version in any way worse (or better) than the first? Does it skew the resulting probabilities?
Yes it's even distribution assuming rand()
is. We will prove this by showing that each input can generate each permutation with equal probability.
N=2 can be easily proven. We will draw it as a tree where the the children represent each string you can get by inserting the character after comma into the leftmost string.
0,1 //input where 0,1 represent indices
01 10 //output. Represents permutations of 01. It is clear that each one has equal probability
For N, we will have every permutations for N-1, and randomly swapping the last character for N
(N-1 0th permutation),N ..... (N-1 Ith permutation),N ________________________
/ \ / \ \
0th permutation of N 1st permutation.... (I*N)th permutation ((I*N)+1)th permutation .... (I*N)+(I-1)th permutation
This shitty induction should lead you to it having even distribution.
Example:
N=2:
0,1
01 10 // these are the permutations. Each one has equal probability
N=3:
0,1|2 // the | is used to separate characters that we will insert later
01,2 10,2 // 01, 10 are permutations from N-1, 2 is the new value
210 021 012 201 120 102 // these are the permutations, still equal probability
N=4: (curved to aid reading)
0,1|23
01,2|3 10,2|3
012,3 021,3 210,3 102,3 120,3 201,3
0123 0132 0321 3230 2013 2031 2310 3012
0213 0231 0312 3210 1203 1230 1302 3201
2103 2130 2301 3102 1023 1032 1320 3021
etc