java optimization synchronization locking trove4j

What is the name of this locking technique?


I've got a gigantic Trove map and a method that I need to call very often from multiple threads. Most of the time this method returns true. The threads are doing heavy number crunching, and I noticed some contention due to the following method (it's just an example; my actual code is a bit different):

synchronized boolean containsSpecial() {
   return troveMap.contains(key);
}

Note that it's an "append only" map: once a key is added, it stays in there forever (which is important for what comes next, I think).

I noticed that by changing the above to:

boolean containsSpecial() {
    if ( troveMap.contains(key) ) {
        // most of the time (>90%) we shall pass here, dodging lock-acquisition
        return true;
    }
    synchronized (this) {
        return troveMap.contains(key);
    }
}

I get a 20% speedup on my number crunching (verified over many long runs).

Does this optimization look correct (knowing that once a key is there it shall stay there forever)?

What is the name for this technique?

EDIT

The code that updates the map is called way less often than the containsSpecial() method and looks like this (I've synchronized the entire method):

synchronized void addSpecialKeyValue( key, value ) {
    ....
}

Solution

  • This code is not correct.

    Trove doesn't handle concurrent use itself; it's like java.util.HashMap in that regard. So, as with HashMap, even seemingly innocent, read-only methods like containsKey() could throw a runtime exception or, worse, enter an infinite loop if another thread modifies the map concurrently. I don't know the internals of Trove, but with HashMap, rehashing when the load factor is exceeded, or removing entries, can cause failures in other threads that are only reading. Your unsynchronized first check runs without the lock, so it can overlap with a rehash triggered by addSpecialKeyValue(), even though keys are never removed.

    If the operation takes a significant amount of time compared to lock management, using a read-write lock to eliminate the serialization bottleneck will improve performance greatly. In the class documentation for ReentrantReadWriteLock, there are "Sample usages"; you can use the second example, for RWDictionary, as a guide.

    In this case, though, the map operations may be so fast that the locking overhead dominates. If so, you'll need to profile on the target system to see whether a synchronized block or a read-write lock is faster.

    Either way, the important point is that you can't safely remove all synchronization, or you'll have consistency and visibility problems.
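
To make the read-write lock suggestion concrete, here is a minimal sketch modeled loosely on the RWDictionary sample from the ReentrantReadWriteLock documentation. The class name SpecialKeys is hypothetical, the method signatures are guesses based on the question, and a java.util.HashMap stands in for the Trove map so the example is self-contained. Whether this beats a plain synchronized method is something only profiling on your system can tell.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper; a java.util.HashMap stands in for the Trove map (assumption).
final class SpecialKeys<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Lock readLock = rwLock.readLock();
    private final Lock writeLock = rwLock.writeLock();

    // Several threads may hold the read lock at the same time,
    // so lookups no longer serialize against each other.
    boolean containsSpecial(K key) {
        readLock.lock();
        try {
            return map.containsKey(key);
        } finally {
            readLock.unlock();
        }
    }

    // The write lock is exclusive: no reader runs while the map
    // is being modified (and possibly rehashed).
    void addSpecialKeyValue(K key, V value) {
        writeLock.lock();
        try {
            map.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }
}
```

Unlike the unsynchronized fast path in the question, every read here still acquires a lock, so a reader never sees the map in the middle of an update, and a write made before the write lock is released is visible to later readers.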