What is the effect of read-modify-write intermediates using std::memory_order_acq_rel on a release sequence?

The answer to the question "Acqrel memory order with 3 threads" states that the thread performing the final acquire-load will synchronize with each thread in a release sequence whose intermediate RMW uses a release ordering, and that it will not synchronize with any thread whose intermediate RMW uses a relaxed ordering.

My question is, what happens if a std::memory_order_acq_rel ordering is used for an intermediate RMW? Will the thread using acq_rel-RMW synchronize with all prior release-store(origin)/release-RMW in the sequence, without otherwise disrupting or terminating the sequence; and will the final thread performing acquire-load still synchronize with all threads in the sequence that used one of release-store(origin)/release-RMW/acq_rel-RMW? Further, if every intermediate operation was acq_rel-RMW, does that basically imply seq_cst ordering for all threads participating in the release sequence?

Solution

In your first paragraph, with relaxed RMWs: an acquire load that reads a value from the sequence will still sync with the release-store that headed the release sequence. But of course it won't sync with any of the relaxed RMW ops, because they weren't release ops.
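The relaxed-RMW case can be sketched as follows. This is a minimal illustration, not code from the question: the names `run_relaxed_rmw_demo`, `payload_a` and `flag` are made up for the example. Thread B's relaxed RMW continues the release sequence headed by thread A's release-store, so a reader that sees B's value still synchronizes with A.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: release-store (A) -> relaxed RMW (B) -> acquire-load (C).
int run_relaxed_rmw_demo() {
    int payload_a = 0;            // plain (non-atomic) data published by thread A
    std::atomic<int> flag{0};
    int seen = 0;

    std::thread a([&] {
        payload_a = 42;
        flag.store(1, std::memory_order_release);      // heads the release sequence
    });
    std::thread b([&] {
        // Wait for A's store so the RMW lands after it in the modification order.
        while (flag.load(std::memory_order_relaxed) == 0) std::this_thread::yield();
        // Relaxed RMW: part of A's release sequence, but not a release operation
        // itself, so plain writes made by B beforehand would NOT be published.
        flag.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread c([&] {
        // Reads the value written by B's RMW, which belongs to A's release sequence.
        while (flag.load(std::memory_order_acquire) != 2) std::this_thread::yield();
        assert(payload_a == 42);  // guaranteed: C synchronizes with A's release-store
        seen = payload_a;
    });

    a.join();
    b.join();
    c.join();
    return seen;
}
```

Thread C deliberately reads only data written by A; reading plain data written by B before its relaxed RMW would be a data race.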

If any of the RMWs are release or acq_rel, an acquire load or acquire RMW that reads a value from the sequence will sync with all the earlier release and acq_rel ops, because each of them heads its own release sequence as well as forming part of a longer one.
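A small sketch of the acq_rel case, assuming a hypothetical completion counter `done` and a per-thread `payload` array (neither appears in the question). Every `fetch_add` with acq_rel is a release operation, and the later RMWs extend its release sequence, so a reader that observes the final count synchronizes with every writer.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical demo: several acq_rel RMWs followed by one acquire-load.
int sum_after_acq_rel_chain() {
    constexpr int kWriters = 3;
    std::array<int, kWriters> payload{};   // plain data, one slot per writer
    std::atomic<int> done{0};
    int sum = 0;

    std::vector<std::thread> writers;
    for (int i = 0; i < kWriters; ++i) {
        writers.emplace_back([&, i] {
            payload[i] = i + 1;                            // plain write
            done.fetch_add(1, std::memory_order_acq_rel);  // publishes payload[i]
        });
    }
    std::thread reader([&] {
        // Seeing the final count means reading from the last RMW in the chain,
        // which is in the release sequence of every earlier acq_rel RMW.
        while (done.load(std::memory_order_acquire) != kWriters) std::this_thread::yield();
        for (int v : payload) sum += v;                    // all writes are visible
        assert(sum == 6);
    });

    for (auto& t : writers) t.join();
    reader.join();
    return sum;
}
```

The acquire half of each acq_rel RMW additionally lets a later writer see the payloads of the writers that precede it in the modification order, although this sketch does not rely on that.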

"Further, if every intermediate operation was acq_rel-RMW, does that basically imply seq_cst ordering for all threads participating in the release sequence?"

In what sense? The behaviour of the whole program still isn't necessarily explainable as some interleaving of program-order if it does any non-seq_cst ops on other shared variables, unless maybe it uses the RMW sequencing to only have one thread at a time doing something.

acq_rel atomic RMWs are pretty strongly ordered, and usually the difference between them and seq_cst isn't important, but they are weaker. For example, StoreLoad reordering between the store side of an acq_rel RMW and a later load (even a seq_cst one) is possible, just like StoreStore reordering between a seq_cst RMW and a later relaxed store, as demonstrated in "For purposes of ordering, is atomic read-modify-write one operation or two?"
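One way to probe the gap between acq_rel and seq_cst RMWs is a store-buffer litmus test. The sketch below is an illustration under assumptions: `store_buffer_round` and `check_seq_cst_rounds` are hypothetical helper names. With seq_cst RMWs the outcome `r1 == 0 && r2 == 0` is forbidden; with acq_rel RMWs the C++ memory model allows it, although whether it ever shows up depends on the compiler and hardware (on x86, for instance, atomic RMWs are full barriers, so it is unlikely to be observed there).

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// One round of a store-buffer litmus test. rmw_order is the ordering used
// for the RMWs; the later loads are always seq_cst.
std::pair<int, int> store_buffer_round(std::memory_order rmw_order) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.exchange(1, rmw_order);                 // RMW, then a later load
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.exchange(1, rmw_order);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();
    return {r1, r2};
}

// With seq_cst RMWs, all four operations are in one total order, so at
// least one thread must see the other's exchange.
void check_seq_cst_rounds(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        std::pair<int, int> r = store_buffer_round(std::memory_order_seq_cst);
        assert(r.first == 1 || r.second == 1);
    }
}
```

Calling `store_buffer_round(std::memory_order_acq_rel)` in a loop and counting `{0, 0}` results is a quick experiment, but never observing that outcome on one machine does not show that it is forbidden.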

If you mean only in terms of operations on that one variable, that's already guaranteed for relaxed RMWs: the value they load must be the one just before their store in the mod order for that variable, and threads can't disagree on the mod order for a single variable. (And local reordering of accesses to the same variable is forbidden by the coherence rules: https://eel.is/c++draft/intro.races#19)
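The single-variable guarantee is what makes a relaxed counter work. A minimal sketch, with the hypothetical helper `relaxed_counter_total`: each relaxed `fetch_add` reads the value immediately before its own store in the modification order, so no increment is lost even though the RMWs order nothing else.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical demo: many threads increment one atomic with relaxed RMWs.
int relaxed_counter_total(int threads, int iters) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                // Atomicity of the RMW does not depend on the memory order.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();   // join() synchronizes with thread completion
    return counter.load(std::memory_order_relaxed);
}
```

For example, `relaxed_counter_total(4, 10000)` returns 40000 on every run.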