javacollectionshashmapconcurrenthashmap

What is the difference between Segment of ConcurrentHashMap and buckets of HashMap theoretically?


I understand that in HashMap, the entries (Key, Value) are placed in buckets based on hash(Key.hashCode)--> The index that denotes the bucket location. In case an entry is already placed at that location, there is a linked list created and the new entry (if it has a different key --> via equals() method) is placed at the beginning of the linked list.

  1. Can I co-relate this concept with that of ConcurrentHashMap, but instead of Buckets, there are Segments upon which individual threads have a lock. And instead of Entries, there are HashEntry(ies). In similar fashion, a linked list is created and if the Key-Value pair being inserted is different, based on equals() of the key, it is placed at end of the linked list.
  2. Am I correct when I say that: put of CHM is not synchronized, thus any thread can access this method, this put method calculates hash value of the key passed to it and gets the segment index (kind of like buckets). Then for that segment only, it calls the put method. Now under Segment, the put method specifies that there will be a lock(), so that only one thread can alter data in a particular segment, thus concluding that if the concurrency level is 16 there shall be 16 threads and thus these threads will be able to PUT values only one segment at a time.

Solution

    1. A bucket is an individual slot in the map's array. This is the same with both HashMap and ConcurrentHashMap. Conceptually, the latter has its array broken into segments (each segment is an array of references), but that's it. Note that the CHM in Java 8 no longer has segments, it's all a single array.

    2. Yes, it's the scheme known as segmented locking. It reduces inter-thread contention, but does not eliminate it.