
Hashing - M should be a power of two


I have heard that M should be a power of two in the Knuth multiplicative hash, and that, more generally, a power of two is always a good choice. Could somebody please explain in simple terms why this is more efficient?

Kind regards


Solution

  • For context, the general form of the Knuth multiplicative hash is this:

    h(K) = floor(M * ((A / w * K) mod 1))

    If w = 2^32 and M = 2^bits, then this simplifies to

    h(K) = A * K >> (32 - bits)
    

    This is convenient: it costs one multiplication and one shift. The trick is to postpone the division by w: multiply A by K in integer arithmetic, let the reduction mod w happen automatically (32-bit unsigned multiplication wraps around), then take the top `bits` bits of the result. That final shift corresponds to the division by w, the scaling by M, and the floor, all at once.

    But that trick relies on both w and M being powers of two. If M is not a power of two, mapping the intermediate result from [0 .. 2^32-1] into [0 .. M-1] takes another fixed-point multiplication instead of just a right shift. And since M would not divide 2^32, that mapping would also introduce a bias into the distribution: some buckets would receive one more of the 2^32 possible intermediate values than others.
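
To make the difference concrete, here is a minimal sketch in C of both cases described above. The function names `hash_pow2` and `hash_any_m` are made up for illustration, and the constant `KNUTH_A` (2^32 divided by the golden ratio, a commonly used choice for A) is an assumption, since the answer does not name a specific A.

```c
#include <stdint.h>

/* Assumed multiplier: 2^32 / golden ratio. The answer above does not fix A. */
#define KNUTH_A 2654435769u

/* Power-of-two table size M = 2^bits, with 1 <= bits <= 32 (hypothetical helper). */
static uint32_t hash_pow2(uint32_t key, unsigned bits)
{
    /* Unsigned 32-bit multiplication wraps around, which is the "mod w" step. */
    uint32_t product = (uint32_t)(KNUTH_A * key);
    /* Keeping the top `bits` bits divides by w, scales by M, and floors. */
    return product >> (32 - bits);
}

/* Arbitrary table size m (hypothetical helper): a second fixed-point multiply. */
static uint32_t hash_any_m(uint32_t key, uint32_t m)
{
    uint32_t product = (uint32_t)(KNUTH_A * key);
    /* Treat product as the fraction product / 2^32 and scale it by m. */
    return (uint32_t)(((uint64_t)product * m) >> 32);
}
```

When `m` is a power of two, `hash_any_m(key, 1u << bits)` gives the same result as `hash_pow2(key, bits)`, so the shift version is just the cheap special case. For any other `m`, the widening multiply is needed and the buckets cannot all receive exactly the same number of intermediate values.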