javamultithreadinglistenerrace-conditionhappens-before

Java happens-before relationship for listener implementation


Consider the following part of a Java application

final List<String> strings = new CopyOnWriteArrayList<>();
volatile Consumer<? super String> listener;

void add(String s) {
    strings.add(s);
    Consumer<? super String> l = listener;
    if (l != null) {
        l.accept(s);
    }
}

The add(String s) method may be invoked from multiple threads and adds s to a CopyOnWriteArrayList (could also be any other thread-safe Collection) and if there is a listener, it is invoked with s as argument. Note that listener is volatile.

At some point in time any thread invokes the following setupListener() method once.

void setupListener() {
    listener = System.out::println;
    strings.forEach(System.out::println);
}

The setupListener() method sets a listener and also handles the previously missed invocations of the listener by iterating over the added strings.

My question is, assuming that the setupListener() method is called once at some point in time, is it possible with this code to miss a strings entry, i.e. a strings entry might not be printed due to a race condition?

Note that I'm aware that strings entries might be printed twice (or not in a particular order) in case of a race condition. This is allowed and I don't care about that. It is only important to not miss any. Also note that this question is not about synchronization or other types of locking, I can't change the upper code and I don't invoke it. I can only set the listener.

Regarding the question, I think that no strings entries will be missed for the following reason: Assume that a String s has been missed. This implies that listener was read as null in the add method, otherwise it would have been printed. However, since

s must be printed when iterating over the strings, since strings contains s. This violates the assumption that s has been missed.

Is this argument correct?


Solution

  • I'm pretty sure you're safe. But your argument is missing some important details.

    There is a problem in your statement:

    to a CopyOnWriteArrayList (could also be any other thread-safe Collection)

    This indicates a misunderstanding of how things work. Thread-safe is not a black and white situation. You can't say 'this collection is thread safe' and 'this collection is not thread safe'. thread safety lies not in objects/classes, but in actions. Some actions can be thread safe when others aren't. For example, trivially, this:

    if (!map.containsKey(key)) {
      map.put(key, calculate(key));
    }
    

    is not thread safe. No matter how thread safe your underlying map reference is, for obvious reasons.

    What's 'thread safe' with COWList is modifying the list while another thread is iterating over it, due to its fundamental property (the 'copy on write' part, thus, ensuring all iterators are impervious to their underlying list being modified, because that never happens). What also happens to be thread safe is 2 threads attempting to simultaneously add to it.

    The add() code of COWList specifically acquires a lock that is internal to the COWList instance. Straight from the source:

            synchronized (lock) {
                Object[] es = getArray();
                int len = es.length;
                es = Arrays.copyOf(es, len + 1);
                es[len] = e;
                setArray(es);
                return true;
            }
    

    A Happens-Before relationship is established between a synchronized block exiting, and another thread entering one on the same lock object later (HB is what is required for any changes to a field done by code A to definitely be observable by code B; without an HB relationship a JVM is free to make that field write visible or not - either behaviour is acceptable, therefore, if your code depends on that, you've written a bug. A nasty, very very difficult to test one!)

    With that lock in place, the act of 'call add from 2 threads' is thread safe.

    It's a little trickier when investigating iterator(), which is what for (String s : strings) ends up calling:

        public Iterator<E> iterator() {
            return new COWIterator<E>(getArray(), 0);
        }
        /** The array, accessed only via getArray/setArray. */
        private transient volatile Object[] array;
    
        /**
         * Gets the array.  Non-private so as to also be accessible
         * from CopyOnWriteArraySet class.
         */
        final Object[] getArray() {
            return array;
        }
    

    Uhoh! No locking on lock, therefore, any HB established by add does not interact at all with the process of obtaining an iterator. You don't get the benefits of synchronized except in regards to interactions with other threads that also synchronized on the same thing. The volatile on the array is what saves you here, and only because COWList specifically never writes to arrays (it copies the entire array and updates the reference, and it is that reference that is volatile, not the array contents itself).

    volatile also establishes Happens-Before.