pythonalgorithmdictionarylinear-probing

In Python Dictionaries, how does ( (j*5)+1 ) % 2**i cycle through all 2**i


I am researching how python implements dictionaries. One of the equations in the python dictionary implementation relates the pseudo random probing for an empty dictionary slot using the equation

j = ((j*5) + 1) % 2**i

which is explained here.

I have read this question, How are Python's Built In Dictionaries Implemented?, and basically understand how dictionaries are implemented.

What I don't understand is why/how the equation:

j = ((j*5) + 1) % 2**i   

cycles through all the remainders of 2**i. For instance, if i = 3 for a total starting size of 8. j goes through the cycle:

0
1
6
7
4
5
2
3
0

if the starting size is 16, it would go through the cycle:

0 1 6 15 12 13 2 11 8 9 14 7 4 5 10 3 0

This is very useful for probing all the slots in the dictionary. But why does it work ? Why does j = ((j*5)+1) work but not j = ((j*6)+1) or j = ((j*3)+1) both of which get stuck in smaller cycles.

I am hoping to get a more intuitive understanding of this than the equation just works and that's why they used it.


Solution

  • This is the same principle that pseudo-random number generators use, as Jasper hinted at, namely linear congruential generators. A linear congruential generator is a sequence that follows the relationship X_(n+1) = (a * X_n + c) mod m. From the wiki page,

    The period of a general LCG is at most m, and for some choices of factor a much less than that. The LCG will have a full period for all seed values if and only if:

    1. m and c are relatively prime.
    2. a - 1 is divisible by all prime factors of m.
    3. a - 1 is divisible by 4 if m is divisible by 4.

    It's clear to see that 5 is the smallest a to satisfy these requirements, namely

    1. 2^i and 1 are relatively prime.
    2. 4 is divisible by 2.
    3. 4 is divisible by 4.

    Also interestingly, 5 is not the only number that satisfies these conditions. 9 will also work. Taking m to be 16, using j=(9*j+1)%16 yields

    0 1 10 11 4 5 14 15 8 9 2 3 12 13 6 7
    

    The proof for these three conditions can be found in the original Hull-Dobell paper on page 5, along with a bunch of other PRNG-related theorems that also may be of interest.