java, collections, hashmap, buckets

hash() implementation in Java


I'm going through the HashMap class in Java. My understanding is that the capacity of the hash table is 2 to the power of the number of buckets (a capacity of 16 means four buckets). When put(key, value) is called, key.hashCode() produces an integer, and the newly added (key, value) pair is placed in a bucket based on key.hashCode() % number of buckets. But the following is the actual implementation in the HashMap class:

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

From the above code, I'm not able to figure out how the fitting of the key.hashCode() value into buckets happens.
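
To make my assumption concrete, here is a rough sketch of the placement I had in mind; the helper name is mine, not HashMap code:

    // My assumption: the bucket index is simply hashCode() modulo the bucket count
    static int assumedBucket(Object key, int numberOfBuckets) {
        int h = (key == null) ? 0 : key.hashCode();
        return (h & 0x7FFFFFFF) % numberOfBuckets; // clear the sign bit so the index stays non-negative
    }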


Solution

  • That code does not "fit" the hashCode to buckets; it "just" spreads the hash code so that the upper bits also influence the bucket index. Here is the Javadoc of that method:

    Computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.

    The actual fitting to buckets is done in the getNode(int, Object) method:

    first = tab[(n - 1) & hash]
    

    where hash is the result of hash(Object) and n is the length of the table (the number of buckets, always a power of two).
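
    Putting the two steps together, here is a minimal, self-contained sketch; the class name and hash-code values are made up for illustration (they share the same low 16 bits but differ in the higher bits):

        public class HashMapBucketDemo {
            // Same spreading step as java.util.HashMap.hash(Object)
            static int spread(int h) {
                return h ^ (h >>> 16);
            }

            public static void main(String[] args) {
                int n = 16; // table length, always a power of two, so (n - 1) is the bit mask 0b1111
                int[] hashCodes = {0x00010005, 0x00020005, 0x00030005};

                for (int h : hashCodes) {
                    int withoutSpread = (n - 1) & h;         // only the low 4 bits matter -> all land in bucket 5
                    int withSpread    = (n - 1) & spread(h); // higher bits are XORed down -> buckets 4, 7 and 6
                    System.out.printf("hashCode=0x%08X  raw bucket=%d  spread bucket=%d%n",
                            h, withoutSpread, withSpread);
                }
            }
        }

    Note that (n - 1) & hash gives the same result as hash % n for a non-negative hash when n is a power of two; the masking is simply a cheaper way of doing the modulo the question assumed.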