Java thread-safe write-only hashmap


My Java class has a HashMap field, and I run several threads that only write to that HashMap using put(). Each write stores a unique key (this is guaranteed by design).

Is the synchronized keyword on the class method that performs the writes sufficient for thread safety? My map is a plain HashMap, not a ConcurrentHashMap.


Solution

  • No, it is not sufficient to only synchronize the writes. Synchronization must be applied to both reads and writes to memory.

    Some other thread, somewhere, sometime, will need to read the map (otherwise, why have a map?), and that thread needs to be synchronized to correctly view the memory represented by the map. It also needs to be synchronized to avoid tripping over transient inconsistencies in the map state while the map is being updated.
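
    As a minimal sketch of what "synchronize both reads and writes" can look like, the class below guards a plain HashMap with one lock. The names GuardedMap and runDemo are hypothetical and not from the question; the only assumption is that every access to the map goes through these synchronized methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: every access to the plain HashMap uses the same lock ("this").
class GuardedMap {
    private final Map<String, Integer> map = new HashMap<>();

    // Writers synchronize on "this"...
    public synchronized void put(String key, int value) {
        map.put(key, value);
    }

    // ...and so do readers, which gives them visibility of earlier writes.
    public synchronized Integer get(String key) {
        return map.get(key);
    }

    public synchronized int size() {
        return map.size();
    }

    // Starts `writers` threads that each store `perWriter` unique keys,
    // waits for them to finish, and returns the final map size.
    static int runDemo(int writers, int perWriter) {
        GuardedMap shared = new GuardedMap();
        Thread[] threads = new Thread[writers];
        for (int t = 0; t < writers; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perWriter; i++) {
                    shared.put("w" + id + "-" + i, i); // key is unique per writer and index
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

    If hand-written locking is not required, Collections.synchronizedMap(new HashMap<>()) or a ConcurrentHashMap would likely serve the same purpose with less code.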

    To provide a hypothetical example, suppose Thread 1 writes to the HashMap, and the effects of that write are stored in CPU 1's level 1 cache only. Thread 2 becomes eligible to run a few seconds later and is resumed on CPU 2; it reads the HashMap, and the data comes from CPU 2's level 1 cache. It does not see the writes that Thread 1 made, because no memory barrier operation occurred between the write and the read in both the writing and the reading thread. Even if Thread 1 synchronizes the writes, so that their effects are flushed to main memory, Thread 2 will still not see them, because its read came from its own level 1 cache. Synchronizing only the writes therefore prevents nothing more than collisions between writes.

    Besides CPU caching, the JMM allows threads to cache data privately themselves; such data only has to be flushed to main memory at a memory barrier (synchronized, volatile with some special limitations, or completion of construction of an immutable object under the Java 5+ JMM).
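
    To illustrate one of those barriers, here is a rough sketch of a volatile flag used to make an ordinary write visible to another thread. The class VolatileHandoff and its members are hypothetical names chosen for this example; it relies on the rule that a volatile write happens-before a later volatile read of the same field.

```java
// Hypothetical example: a volatile flag makes an earlier plain write visible.
class VolatileHandoff {
    private int payload;            // plain field, written before the volatile write
    private volatile boolean ready; // the volatile write/read pair orders the accesses

    void publish(int value) {
        payload = value; // ordinary write
        ready = true;    // volatile write: a reader that sees true also sees payload
    }

    // Spins until the flag is set, then reads the payload.
    int awaitPayload() {
        while (!ready) {
            Thread.yield(); // busy-wait; acceptable for a tiny demo only
        }
        return payload;
    }

    // Runs one writer thread and reads the value from the calling thread.
    static int runDemo() {
        VolatileHandoff handoff = new VolatileHandoff();
        Thread writer = new Thread(() -> handoff.publish(42));
        writer.start();
        return handoff.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 42
    }
}
```

    Without the volatile modifier on ready, the reading thread would be allowed to spin forever or to observe a stale payload.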

    To fully understand this complex subject of threading you must research and study the Java Memory Model and its implications for sharing data between threads. You must understand the concepts of "happens-before" relationships and memory visibility to understand the complexities of sharing data in today's world of multicore CPUs with various levels of CPU caching.

    If you don't want to invest the time to understand the JMM, the simple rule is that two threads must somewhere/somehow synchronize on the same object between the writes and the reads for one thread to see the effects of the operations of the other. Period. Note that this doesn't mean that all writes and reads on an object must be synchronized, per se; it is legitimate to create and configure an object in one thread and then "publish" it to other threads, as long as the publishing thread and the fetching thread(s) synchronize on the same object for the handover.
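
    The "publish" case could be sketched as follows: one thread fills a plain HashMap without any locking, then hands it over under a lock that the fetching thread also uses. MapHandover, publish, fetch and runDemo are hypothetical names, and the sketch assumes nobody modifies the map after the handover.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical handover point: both sides synchronize on the same lock object.
class MapHandover {
    private final Object lock = new Object();
    private Map<String, Integer> published; // guarded by lock

    // Publishing thread: the map was built privately; hand it over under the lock.
    void publish(Map<String, Integer> fullyBuilt) {
        synchronized (lock) {
            published = fullyBuilt;
            lock.notifyAll();
        }
    }

    // Fetching thread: wait under the same lock until a map is available.
    Map<String, Integer> fetch() throws InterruptedException {
        synchronized (lock) {
            while (published == null) {
                lock.wait();
            }
            return published;
        }
    }

    // Builds a 100-entry map in one thread and reads its size in the caller.
    static int runDemo() {
        MapHandover handover = new MapHandover();
        Thread builder = new Thread(() -> {
            Map<String, Integer> built = new HashMap<>();
            for (int i = 0; i < 100; i++) {
                built.put("k" + i, i); // unsynchronized: only this thread sees the map so far
            }
            handover.publish(built); // assumption: the map is never modified after this call
        });
        builder.start();
        try {
            Map<String, Integer> received = handover.fetch();
            return received.size(); // safe without a lock because no writes follow the handover
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the map", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 100
    }
}
```

    If any thread keeps writing to the map after the handover, this shortcut no longer applies and every read and write has to be synchronized again.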