.netmultithreadingconcurrencythread-safetyimmutability

ReadOnlyCollection<T> Thread Safety


The documentation for ReadOnlyCollection(of T) states that:

A ReadOnlyCollection(Of T) can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

My question regards the bolded part:

  1. why is enumerating through a collection intrinsically not thread-safe
  2. what are the possible implications, and
  3. what are commonly-used workarounds?

Solution

  • C# has a pretty good an okay-ish collection model, but the ReadOnlyCollection class is one of the most unfortunately conceived (or named) classes in the entire model. It should have been more appropriately called a readonly list, not a readonly collection.

    Now, to get to your question, it is just a read-only wrapper of an IList supplied at construction-time. (That's why it is called "read-only" instead of "immutable".) The code which constructed the ReadOnlyCollection might keep a reference to the original mutable list, and may modify that list, with all the consequences that this will have for multithreaded access.

    Enumerating through the collection would have been thread-safe if the collection was immutable; but since it just a read-only wrapper of a possibly mutable list, it can only be as thread safe as a mutable list, which is not at all thread safe.

    Enumerating through a mutable collection is intrinsically not thread-safe for two sets of reasons:

    1. The same reasons that even in a single-threaded scenario you cannot modify a collection while enumerating it. (If you try this, you will get a ConcurrentModificationException in Java, InvalidOperationException in C#.) This is because the enumerator maintains some state, (for example, an index along an internal array,) but this state may be rendered invalid by a mutation to the original collection. (For example, a shortening of the array may leave that index pointing past the end of the array.) If you only have one thread, you can make sure that you do not mutate it while enumerating it, but in a multi-threaded scenario one thread may be enumerating the collection while another thread may be altering it, and you have no control over that.

    2. The same reasons why anything mutable (and non-atomic) is intrinsically not thread-safe. (And a collection is a complex thing, so it is certainly not atomic, the way a plain integer is atomic.) When one thread is reading data and another thread is writing the same data, the data appears corrupt to the reading thread. The simplest example of this is modifying a 64-bit integer on a 32-bit architecture, where such integers are non-atomic. A thread modifying the 64-bit integer may be pre-empted by another thread after the first half of the integer has been written, but before the second half of the integer has been written. The pre-empting thread will read a half-written integer, which will be garbage. The effects of this are much more pronounced when we are talking about entire arrays instead of just integers.

    As for the workaround that you asked, well, you can either use locking, or you can go with the lock-nothing (or lock-as-little-as-possible) principle and work on a real copy of the original mutable list instead of a wrapper around it.