The answer to this question, Acqrel memory order with 3 threads, states that the thread performing the final acquire-load will synchronize with each thread in a release sequence if the intermediate RMW uses a release ordering. And it will not synchronize with any threads where the intermediate RMW uses a relaxed ordering.
My question is, what happens if a std::memory_order_acq_rel ordering is used for an intermediate RMW? Will the thread using acq_rel-RMW synchronize with all prior release-store (origin) / release-RMW operations in the sequence, without otherwise disrupting or terminating the sequence; and will the final thread performing the acquire-load still synchronize with all threads in the sequence that used one of release-store (origin) / release-RMW / acq_rel-RMW? Further, if every intermediate operation was acq_rel-RMW, does that basically imply seq_cst ordering for all threads participating in the release sequence?
In your first paragraph, with relaxed RMWs: an acquire load will still sync with the release-store that headed the release sequence. But of course not any of the RMW ops because they weren't release ops.
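A minimal sketch of that relaxed-RMW case, assuming three threads and hypothetical names (`payload`, `flag`, `relaxed_rmw_demo`) that are not from the question. Thread B waits for A's store before doing its RMW so that the RMW is guaranteed to follow the release store in the modification order of `flag`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Illustrative only: all names here are hypothetical.
int relaxed_rmw_demo() {
    int payload = 0;               // plain non-atomic data published by thread A
    std::atomic<int> flag{0};
    int seen = -1;

    std::thread a([&] {
        payload = 42;
        flag.store(1, std::memory_order_release);      // heads the release sequence
    });
    std::thread b([&] {
        // Wait for A's store so the RMW below comes after it in the modification order.
        while (flag.load(std::memory_order_relaxed) != 1) std::this_thread::yield();
        flag.fetch_add(1, std::memory_order_relaxed);  // relaxed RMW: continues A's sequence
    });
    std::thread c([&] {
        // The value 2 was written by B's RMW, which belongs to A's release sequence.
        while (flag.load(std::memory_order_acquire) != 2) std::this_thread::yield();
        seen = payload;  // OK: A's release store synchronizes-with this acquire load
    });

    a.join(); b.join(); c.join();
    return seen;
}
```

Calling `relaxed_rmw_demo()` returns 42. Note that C gets no ordering with respect to anything B wrote before its relaxed RMW; only A's writes are published here.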
If any of the RMWs are release or acq_rel, an acquire load or RMW will sync with all the release and acq_rel ops, because they each head their own release sequence as well as forming part of a longer one.
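As a sketch of the acq_rel case, again with three threads and hypothetical names (`from_a`, `from_b`, `flag`, `acq_rel_rmw_demo`): B's RMW acquires what A published and also publishes B's own earlier write, and the final acquire load then sees both.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Illustrative only: all names here are hypothetical.
// Returns {value of from_a as seen by B, from_a + from_b as seen by C}.
std::pair<int, int> acq_rel_rmw_demo() {
    int from_a = 0, from_b = 0;    // plain non-atomic data
    std::atomic<int> flag{0};
    int b_saw = -1, c_sum = -1;

    std::thread a([&] {
        from_a = 40;
        flag.store(1, std::memory_order_release);      // origin of the release sequence
    });
    std::thread b([&] {
        from_b = 2;                                    // written before B's RMW
        while (flag.load(std::memory_order_relaxed) != 1) std::this_thread::yield();
        // Acquire side: syncs with A's store. Release side: publishes from_b.
        flag.fetch_add(1, std::memory_order_acq_rel);
        b_saw = from_a;                                // OK thanks to the acquire side
    });
    std::thread c([&] {
        // Reading 2 syncs with B's RMW directly, and with A via the release sequence.
        while (flag.load(std::memory_order_acquire) != 2) std::this_thread::yield();
        c_sum = from_a + from_b;
    });

    a.join(); b.join(); c.join();
    return {b_saw, c_sum};
}
```

Calling `acq_rel_rmw_demo()` returns the pair {40, 42}. With a plain `release` RMW in B, C would still see both writes, but B's read of `from_a` would no longer be ordered after A's write.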
Further, if every intermediate operation was acq_rel-RMW, does that basically imply seq_cst ordering for all threads participating in the release sequence?
In what sense? The behaviour of the whole program still isn't necessarily explainable as some interleaving of program order if it does any non-seq_cst ops on other shared variables, unless maybe it uses the RMW sequencing to only have one thread at a time doing something.
acq_rel atomic RMWs are pretty strongly ordered, and usually the difference between them and seq_cst isn't important, but they are weaker. For example, StoreLoad reordering between the store side of an acq_rel RMW and a later load (even a seq_cst one) is possible. Just like StoreStore reordering between a seq_cst RMW and a later relaxed store, as demonstrated in "For purposes of ordering, is atomic read-modify-write one operation or two?"
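The StoreLoad point can be sketched as a store-buffer style litmus test. This is an illustration with hypothetical names (`x`, `y`, `store_buffer_litmus`), not a reproduction from the linked Q&A. Because the exchanges are only acq_rel, nothing in the memory model puts them in the single total order of seq_cst operations, so the outcome where both loads read 0 is not ruled out. Whether a given compiler and CPU ever produce it is a separate matter; on x86, for example, atomic RMWs are full barriers, so you would not expect to observe it there.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Illustrative only: all names here are hypothetical.
// Returns {r1, r2}; the C++ memory model permits {0, 0} for this program.
std::pair<int, int> store_buffer_litmus() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.exchange(1, std::memory_order_acq_rel);   // store side may become visible late
        r1 = y.load(std::memory_order_seq_cst);     // later load, even seq_cst
    });
    std::thread t2([&] {
        y.exchange(1, std::memory_order_acq_rel);
        r2 = x.load(std::memory_order_seq_cst);
    });

    t1.join(); t2.join();
    return {r1, r2};
}
```

Changing both exchanges to `std::memory_order_seq_cst` forbids the {0, 0} outcome, since all four operations then have to fit into one total order.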
If you mean only in terms of operations on that one variable, that's already guaranteed for relaxed RMWs: the value they load must be the one just before their store in the mod order for that variable, and threads can't disagree on the mod order for a single variable. (And local reordering of accesses to the same variable is forbidden by the coherence rules: https://eel.is/c++draft/intro.races#19)
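A small sketch of that single-variable guarantee, assuming a hypothetical ticket counter (`relaxed_rmw_tickets_unique` and its parameters are made up for illustration). Each relaxed `fetch_add` reads the value immediately preceding its own write in the modification order, so no two calls can return the same ticket.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Illustrative only: all names here are hypothetical.
// Returns true if every ticket 0 .. n_threads*per_thread-1 was handed out exactly once.
bool relaxed_rmw_tickets_unique(int n_threads, int per_thread) {
    const int total = n_threads * per_thread;
    std::atomic<int> counter{0};
    std::vector<char> taken(total, 0);   // one slot per ticket

    std::vector<std::thread> pool;
    for (int t = 0; t < n_threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                int ticket = counter.fetch_add(1, std::memory_order_relaxed);
                taken[ticket] = 1;       // distinct tickets mean distinct elements: no race
            }
        });
    }
    for (auto& th : pool) th.join();

    // total RMWs filled total slots, so all slots set means no ticket was duplicated.
    for (char c : taken) {
        if (c != 1) return false;
    }
    return counter.load(std::memory_order_relaxed) == total;
}
```

For example, `relaxed_rmw_tickets_unique(4, 1000)` returns true. The relaxed ordering only limits what the tickets imply about other memory; it does not weaken the atomicity of the RMW itself.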