Consider this program:
-- Initially --
std::atomic<int> x{0};
int y{0};
-- Thread 1 --
y = 1; // A
x.store(1, std::memory_order_release); // B
x.store(3, std::memory_order_relaxed); // C
-- Thread 2 --
if (x.load(std::memory_order_acquire) == 3) // D
print(y); // E
Under the C++11 memory model, if the program prints anything then it prints 1.
In the C++20 memory model, release sequences were changed to exclude writes performed by the same thread. How does that affect this program? Could it now have a data race and print either 0 or 1?
This code appears in P0982R1: Weaken Release Sequences, which I believe is the paper that resulted in the changes to the definition of release sequences in C++20. In the paper's version of the example, a third thread also stores to x, which disrupts the release sequence in a counter-intuitive way. That motivates the need to weaken the release sequence definition.
From reading the paper, my understanding is that with the C++20 changes, C no longer forms part of the release sequence headed by B, because C is not a read-modify-write operation. D reads the value written by C, so B does not synchronize with D (and C, being a relaxed store, cannot synchronize with anything). Thus there is no happens-before relation between A and E.
Since B and C are stores to the same atomic variable and all threads must agree on the modification order of that variable, does the C++20 memory model allow us to infer anything about whether A happens-before E?
Your understanding is correct; the program has a data race, so its behavior is undefined rather than limited to printing 0 or 1. The store of 3 does not form part of any release sequence, so D is not synchronized with any release store. There is thus no way to establish any happens-before relationship between any two operations from the two different threads, and in particular, no happens-before between A and E.
I think the only thing you can infer from the load of 3 is that D definitely does not happen before C; if it did, then D would be obliged to load a value that was strictly earlier in the modification order of x [read-write coherence, intro.races p17]. That means in particular that E does not happen before A: since D is sequenced before E and A is sequenced before C, E happening before A would imply that D happens before C.
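For contrast with the racy original, here is a minimal sketch of one way the happens-before edge could be restored: making C itself a release store, so that D synchronizes with C directly when it reads 3. The function name run_fixed_once and the variable seen are hypothetical names introduced only for this illustration; they do not come from the paper or the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the two threads once and returns the value of y
// that Thread 2 read, or -1 if Thread 2 did not observe x == 3.
int run_fixed_once() {
    std::atomic<int> x{0};
    int y = 0;
    int seen = -1;

    std::thread t1([&] {
        y = 1;                                  // A
        x.store(1, std::memory_order_release);  // B
        x.store(3, std::memory_order_release);  // C: now a release store
    });
    std::thread t2([&] {
        if (x.load(std::memory_order_acquire) == 3)  // D
            seen = y;                                // E
    });

    t1.join();
    t2.join();

    // If D read 3, then C synchronizes with D, so A happens before E
    // and E must read 1. Otherwise E was never executed.
    assert(seen == -1 || seen == 1);
    return seen;
}
```

With this change, A is sequenced before C, C synchronizes with D, and D is sequenced before E, so there is no data race. Another option, under the C++20 wording, would be to make C an atomic read-modify-write operation (for example an exchange), since those still extend the release sequence headed by B.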
The modification order would come into play if you were to load from x again in Thread 2 somewhere after D. Then you would be guaranteed to load the value 3 again. That follows from read-read coherence [intro.races p16]. Your second load is not allowed to observe anything that preceded 3 in the modification order, so it cannot load the values 0 or 1. This would apply even if all the loads and stores in both threads were relaxed.
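The read-read coherence guarantee can be sketched as follows, using only relaxed operations and dropping the non-atomic y so that no data race is involved. The function name second_load_is_coherent and the local variable names are hypothetical, chosen for this illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns true if the coherence guarantee held in this run.
// According to read-read coherence it should hold in every run.
bool second_load_is_coherent() {
    std::atomic<int> x{0};
    bool ok = true;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        x.store(3, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        int first = x.load(std::memory_order_relaxed);
        int second = x.load(std::memory_order_relaxed);
        // The first load is sequenced before the second, so the second may not
        // observe a value earlier in the modification order of x (0, 1, 3).
        // In particular, once 3 has been seen, 0 and 1 can no longer be loaded.
        if (first == 3)
            ok = (second == 3);
    });

    t1.join();
    t2.join();
    return ok;
}
```

A single run will often see 0 or 3 for both loads; the point is only that no run is permitted to return false. Note that this says nothing about other variables: coherence orders accesses to x alone and does not create a happens-before edge between the threads.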