c++ multithreading atomic memory-barriers stdatomic

C++ atomics acquire/release and RMW - can one acquire load sync with multiple release RMWs?

Threads A, B, C are doing separate work (no synchronizing is required between them). Once all three complete, thread D will combine their results. So D depends on the completion of A, B and C.

int a = 0;
int b = 0;
int c = 0;
std::atomic_int D_dependencies{ 3 };

thread A:

a = 1;
D_dependencies.fetch_sub(1, std::memory_order_release);

thread B:

b = 1;
D_dependencies.fetch_sub(1, std::memory_order_release);

thread C:

c = 1;
D_dependencies.fetch_sub(1, std::memory_order_release);

thread D:

if(D_dependencies.load(std::memory_order_acquire) == 0)
{
    assert(a + b + c == 3);
}

My understanding is that RMW operations like fetch_sub form a "release sequence" and so the load in thread D should observe all writes if it loads 0 from the atomic variable.
Am I correct?

Solution

Yes, that's correct.

There are three overlapping release-sequences, so the acquire-load syncs-with all three of the release-RMWs. The RMWs include release so they can each head their own release-sequence as well as being part of a longer sequence. (acq_rel or seq_cst also include release and would work here.)
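For reference, the snippets from the question can be assembled into one compilable function. This is only a sketch: the function name run_once, the use of std::thread, and the spin-wait in D (the question does a single check instead of a loop) are assumptions added for illustration. A passing run does not prove anything about the memory model; the guarantee comes from the release-sequence rules described above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the A/B/C/D pattern once and returns a + b + c as seen by D.
int run_once() {
    int a = 0, b = 0, c = 0;            // plain (non-atomic) results
    std::atomic_int D_dependencies{3};  // number of workers that have not finished

    // Each worker writes its result, then decrements with release semantics.
    std::thread A([&] { a = 1; D_dependencies.fetch_sub(1, std::memory_order_release); });
    std::thread B([&] { b = 1; D_dependencies.fetch_sub(1, std::memory_order_release); });
    std::thread C([&] { c = 1; D_dependencies.fetch_sub(1, std::memory_order_release); });

    int sum = 0;
    std::thread D([&] {
        // Assumption: D spins until the acquire load observes 0.
        // The value 0 was written by the last RMW in the chain, which belongs to
        // the release sequence headed by each of the three fetch_sub operations.
        while (D_dependencies.load(std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
        sum = a + b + c;  // safe: all three writes happen-before this read
    });

    A.join();
    B.join();
    C.join();
    D.join();  // join makes D's write to sum visible to the caller
    return sum;
}
```

Calling run_once() from main and asserting that the result is 3 corresponds to the assert in the question.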

The standard's guarantee applies whenever its conditions are met: a release store (including the store half of an RMW), followed by zero or more RMW operations (of any memory_order), followed by an acquire operation that reads the value written by the release operation or by a later RMW in that chain. The acquire operation then syncs-with the original release operation.
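The "of any memory_order" part can be illustrated with a smaller sketch in which one of the decrements is relaxed. The names relaxed_link_demo, payload, pending, producer, bystander and reader are hypothetical and not from the question. The bystander thread publishes nothing, so its RMW does not need release, yet it still continues the producer's release sequence.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value of payload that the reader sees after observing 0.
int relaxed_link_demo() {
    int payload = 0;             // plain data published by the producer
    std::atomic_int pending{2};  // two decrements are needed to reach 0

    std::thread producer([&] {
        payload = 42;
        // Release RMW: heads a release sequence that covers the write to payload.
        pending.fetch_sub(1, std::memory_order_release);
    });
    std::thread bystander([&] {
        // Relaxed RMW: has no data of its own to publish. If it runs after the
        // producer's fetch_sub, it is still part of the producer's release sequence.
        pending.fetch_sub(1, std::memory_order_relaxed);
    });

    int seen = 0;
    std::thread reader([&] {
        // Whichever decrement wrote the 0, it is either the producer's release RMW
        // or an RMW later in its release sequence, so this acquire load
        // synchronizes with the producer.
        while (pending.load(std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    bystander.join();
    reader.join();
    return seen;
}
```

If the bystander also wrote non-atomic data that the reader needed, its fetch_sub would have to be release (or stronger) as well, as in the question's code.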

In the formalism of the standard, each release operation heads its own release sequence, and thus you can have overlapping release sequences. (I think; I didn't double-check the standard's wording.)
It also works to think about a chain of RMWs as one release sequence, and acquire operations syncing with every release-or-stronger operation in the chain.

A pure store breaks a release sequence, but you don't have those on D_dependencies.

Related questions:
What does "release sequence" mean?
Why release sequence can only contain read-modify-write but not pure write (and see my comments there)