In the example Josh gives of the flawed random method that generates a positive random number with a given upper bound n
, I don't understand the two of the flaws he states.
The method from the book is:
private static final Random rnd = new Random();
//Common but deeply flawed
static int random(int n) {
return Math.abs(rnd.nextInt()) % n;
}
Random.nextInt()
says Returns the next pseudorandom, uniformly distributed int value from this random number generator's sequence.
So shouldn't it be that if n is a small integer then the sequence will repeat itself, why does this only apply to powers of 2?Random.nextInt()
generates random integers that are uniformly distributed? (He provides a code snippet which clearly demonstrates this but I don't understand why this is the case, and how this is related to n being a power of 2).Question 1: if n is a small power of 2, the sequence of random numbers that are generated will repeat itself after a short period of time.
This is not a corollary of anything Josh is saying; rather, it is simply a known property of linear congruential generators. Wikipedia has the following to say:
A further problem of LCGs is that the lower-order bits of the generated sequence have a far shorter period than the sequence as a whole if m is set to a power of 2. In general, the n-th least significant digit in the base b representation of the output sequence, where bk = m for some integer k, repeats with at most period bn.
This is also noted in the Javadoc:
Linear congruential pseudo-random number generators such as the one implemented by this class are known to have short periods in the sequence of values of their low-order bits.
The other version of the function, Random.nextInt(int)
, works around this by using different bits in this case (emphasis mine):
The algorithm treats the case where n is a power of two specially: it returns the correct number of high-order bits from the underlying pseudo-random number generator.
This is a good reason to prefer Random.nextInt(int)
over using Random.nextInt()
and doing your own range transformation.
Question 2: Next he says that if n is not a power of 2, some numbers will be returned on average more frequently than others.
There are 232 distinct numbers that can be returned by nextInt()
. If you try to put them into n buckets by using % n
, and n isn't a power of 2, some buckets will have more numbers than others. This means that some outcomes will occur more frequently than others even though the original distribution was uniform.
Let's look at this using small numbers. Let's say nextInt()
returned four equiprobable outcomes, 0, 1, 2 and 3. Let's see what happens if we applied % 3
to them:
0 maps to 0
1 maps to 1
2 maps to 2
3 maps to 0
As you can see, the algorithm would return 0 twice as frequently as it would return each of 1 and 2.
This does not happen when n is a power of two, since one power of two is divisible by the other. Consider n=2
:
0 maps to 0
1 maps to 1
2 maps to 0
3 maps to 1
Here, 0 and 1 occur with the same frequency.
Additional resources
Here are some additional -- if only tangentially relevant -- resources related to LCGs: