haskellhashhashmaphashableunordered-containers

Is this approach to dealing with hash collisions new/unique?


When dealing with hash maps, I have seen a few strategies to deal with hash collisions, but we have come up with something different. I was wondering if this is something new or not.

This version of a hash map only works if the hash and the data structures that will be hashed are salteable. (This is the case in hashable in Haskell, where we suggested implementing this approach.)

The idea is that, instead of storing a list or array in each cell of the hash map, you store a recursive hash map. The only difference in this recursive hash map is that you use a different salt. This way the hash collisions on one level of the hash map are most likely not hash collisions on the next level. As a result, insertion into such a hash map is no longer O(number of collisions on this hash) but O(number of levels that the collisions on this happen at recursively), which is most likely better.

A more detailed explanation and an implementation can be found here:

https://github.com/tibbe/unordered-containers/pull/217/files/58af4519ace34c5f7d3c1359907ff75e27b9cdb8#diff-ba23e0f18c79cb873ac5375367524cfaR114


Solution

  • Your idea seems to be effectively the same as the one suggested in the Fredman, Komlós & Szemerédi paper from 1984. As Wikipedia summarizes it:

    FKS Hashing makes use of a hash table with two levels in which the top level contains n buckets which each contain their own hash table.

    In contrast to your idea, the local hash maps aren't recursive, instead each of them chooses a salt that makes it a perfect hash. In practice, this will (as you say) usually be given already by the first salt you try, thus it's asymptotically constant-time.