java, concurrenthashmap, phantom-reference

Why does Java's Cleaner use a linked list instead of a ConcurrentHashSet?


Under the hood, Java's Cleaner uses a doubly linked list to hold PhantomReferences until their referents become phantom reachable. When the Cleaner's daemon thread pops a PhantomReference off its ReferenceQueue and unlinks it from the list, the whole list must be locked (the removal code is synchronized on the list).
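For reference, a minimal usage example of the API in question; Resource and State are invented names for illustration:

```java
import java.lang.ref.Cleaner;

public class CleanerDemo {
    // One shared Cleaner: it owns the daemon thread and the internal linked list.
    private static final Cleaner CLEANER = Cleaner.create();

    static final class Resource implements AutoCloseable {
        // The cleanup action must not capture the Resource itself,
        // otherwise the referent could never become phantom reachable.
        private static final class State implements Runnable {
            @Override public void run() { System.out.println("releasing native handle"); }
        }

        private final Cleaner.Cleanable cleanable;

        Resource() {
            // register() links a new node into the Cleaner's internal list
            this.cleanable = CLEANER.register(this, new State());
        }

        @Override public void close() {
            cleanable.clean(); // runs State at most once and unlinks the node
        }
    }

    public static void main(String[] args) {
        try (Resource r = new Resource()) {
            // use r; close() unlinks the node and runs the cleanup deterministically
        }
    }
}
```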

In contrast, a ConcurrentHashMap could still offer O(1) access to PhantomReferences while also supporting concurrent modification: multiple cleaner threads could remove PhantomReferences (stored as keys) from the map at the same time.
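A minimal sketch of this hypothetical alternative, assuming the pending references were tracked in a concurrent set (ConcurrentCleanerSketch, PENDING, register, and cleanLoop are invented names, not JDK code):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCleanerSketch {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Hypothetical replacement for the doubly linked list: a concurrent
    // set that keeps each PhantomReference strongly reachable until cleanup.
    static final Set<Reference<?>> PENDING = ConcurrentHashMap.newKeySet();

    static void register(Object referent) {
        PENDING.add(new PhantomReference<>(referent, QUEUE));
    }

    // Several cleaner threads could run this loop at the same time:
    static void cleanLoop() throws InterruptedException {
        while (true) {
            Reference<?> ref = QUEUE.remove(); // blocks until enqueued by the GC
            PENDING.remove(ref);               // concurrent, no global lock
            // ...run the cleanup action associated with ref...
        }
    }
}
```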

Is there a reason Java went with the doubly linked list approach? Is it just that a Cleaner uses only a single daemon thread for cleanup, so the lack of concurrency doesn't matter? Or is there a deeper reason for the choice?


Solution

  • Performance with the doubly linked list is much better than with a ConcurrentHashMap, because each registered Cleanable is itself a node of the list - the cleaner works directly on node objects and never has to look anything up.

    Removing a node just means adjusting two pointers in the list (one in the previous node, one in the next node) and two pointers in the node itself (to mark it as removed); a simplified sketch follows at the end of this answer. The source code for this is 12 lines (https://github.com/openjdk/jdk21u/blob/master/src/java.base/share/classes/jdk/internal/ref/PhantomCleanable.java#L101).

    Compare these 12 lines with the work that ConcurrentHashMap has to do to remove a single entry (https://github.com/openjdk/jdk21u/blob/master/src/java.base/share/classes/java/util/concurrent/ConcurrentHashMap.java#L1110): more than 70 lines of considerably more complex code.

    Both operations are O(1), but that doesn't mean ConcurrentHashMap needs the same amount of time - it only means that in both data structures the runtime doesn't grow as they hold more data; the constant factors differ substantially.
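To make that concrete, here is a simplified sketch of the intrusive list removal in the spirit of PhantomCleanable; the pointer juggling loosely follows the linked JDK source, but this is a stripped-down illustration, not the real class:

```java
// Simplified sketch of the Cleaner's intrusive doubly linked list.
// Each registered Cleanable is its own list node, so removing it
// requires no lookup at all - the node is already in hand.
class CleanableNode {
    CleanableNode prev = this, next = this; // self-linked means "not in a list"

    // Insert this node right after the list head, guarded by the head's monitor.
    void insert(CleanableNode head) {
        synchronized (head) {
            prev = head;
            next = head.next;
            next.prev = this;
            head.next = this;
        }
    }

    // Remove this node: two pointer writes unlink it from its neighbors,
    // two more mark it as removed by making it self-linked again.
    boolean remove(CleanableNode head) {
        synchronized (head) {
            if (next != this) {
                next.prev = prev;
                prev.next = next;
                prev = this;
                next = this;
                return true;
            }
            return false; // already removed
        }
    }
}
```

Because the node is already in hand, removal involves no hashing, no bucket traversal, and no CAS retry loop; the only cost is a short synchronized block, which is cheap given that a single daemon thread does almost all of the cleaning.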