algorithmhashprobing

Why does this hash table lookup probe like it does?


This code (extraced from an LZW compression program of unknown origin) finds an empty slot in a hash table of size 5021, indexed from 0 to 5020:

probe := <random 12-bit hash key>
// probe is initially 0 to 4095
repeat
  {
  if table[probe] is empty then return(probe);
  if probe == 0 then probe := -1 else dec(probe, 5021-probe);
  if probe < 0 then inc(probe, 5021);
  }

This isn't quite the typical linear or quadratic probing. Why probe like that? Is this a known probing algorithm, and where can I find out more about it?


Solution

  • The algorithm for calculating the new probe is simple despite its ugly looks:

    if probe == 0
        probe <= 5020
    else
        probe <= (2*probe) % 5021
    

    It is not quite clear why this function has been picked, but it really does go through all possible positions 1..5020 in a cyclic and seemingly random way (and 0 is sent back to the loop). (No, it doesnt, see the oops!) It should be noted that number 5021 is slightly magical in this context.

    This algorithm is actually a linear congruential generator (LCG). See http://en.wikipedia.org/wiki/Linear_congruential_generator

    OOPS: It is a LCG, but not one with the optimal period, because 2^1004 % 5021 == 1. There are five different cycles with the following members: (1, 2, 4, 8, ...), (3, 6, 12, 24, ...), (7, 14, 28, 56, ...), (9, 18, 36, 72, ...), and (11, 22, 44, 88, ...). Even more odd someone has chosen to use this algorithm. (Or then I analysed it wrong.)