data-structureshashhashtableclrsdouble-hashing

Average time complexity of open addressing


From CLRS book analysis:

11.6: Given an open-address hash table with load factor α=n/m<1 the expected number of probes in an unsuccessful search is at most 1/1-α assuming uniform hashing.

11.7: Inserting an element into an open-address hash table with load factor α requires at most 1/1-α probes on average, assuming uniform hashing.

11.8: Given an open-address hash table with load factor α<1, the expected number of probes in a successful search is at most (1/α)ln(1/1-α) assuming uniform hashing and assuming that each key in the table is equally likely to be searched for.

After reading the chapter I don't understand if the averege time complexity of insert and search is O(1/1-α) or O(1) or something else.

Looking for an explanation.


Solution

  • Let’s focus on the O(1 / (1 - α)) bit for now. If you have any fixed value of α (say, α = 0.75), then 1 / (1 - α) will be a constant, and the entire expression will be O(1). So in that sense, all of the above runtimes that depend purely on α can be interpreted as “these are all O(1) for any fixed value of α.”

    So why not just say “O(1)” in CLRS? The reason is that the more precise analyses in terms of α provide guidance about how the runtimes of the hash table will change as α changes. For example, suppose you want to raise α from 0.90 to 0.99. Knowing that the runtime is O(1 / (1 - α)) then tells you than you should expect to see a 10x slowdown in performance. However, moving α from 0.50 to 0.59 will have a markedly smaller impact on performance.