Consider the following code:
#include <atomic>
#include <thread>
#include <cassert>
#include <memory>
int i = 0;
std::atomic_int a{0};
int main()
{
std::thread thr1{[]
{
i = 1; // A
a.store(1, std::memory_order::release); // B
}};
std::thread thr2{[]
{
while (a.load(std::memory_order::relaxed) != 1); // C
a.store(2, std::memory_order::release); // D
}};
std::thread thr3{[]
{
while (a.load(std::memory_order::acquire) != 2); // E
assert(i == 1); // F
}};
thr1.join();
thr2.join();
thr3.join();
}
My assumption is that the assert may or may not fail and the behavior here is undefined.
Although we have the happens-before relationships such as A->B, C->D, D->E, E->F, we don't have such a relationship for B->C because of the relaxed load in C.
On the other hand, https://en.cppreference.com/w/cpp/atomic/memory_order says that
All memory writes (including non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B. That is, once the atomic load is completed, thread B is guaranteed to see everything thread A wrote to memory. This promise only holds if B actually returns the value that A stored, or a value from later in the release sequence.
But we can't say that B->D is a release sequence headed by B because there are no read-modify-write operations on a
whatsoever, so this paragraph doesn't work here.
Am I right in my understanding?
You are correct. Either the assertion succeeds, or the behavior is undefined (though in practice, that means the assertion should fail).
There is an intuitive way to explain it and a formal way.
In an acquire/release memory model, every thread views the memory changes done by other threads as a timeline, and observes some point on that timeline. Threads can disagree over the value that objects hold because they each see their own time point for other threads' modification timeline.
While thr3
can only reach assert(i == 1);
once thr2
has completed all its work due to the busy-wait in thr3
, at the point of the assertion, thr3
can still have an outdated view of thr1
(where i
is set).
This outdated view is older than the one thr2
has.
Note that thr1
has to complete all of its work before thr2
can do its work, and thr2
has to complete all of its work before thr3
can do its work.
However, this restriction on the sequence of events doesn't restrict what memory operations become visible between threads.
Even if thr2
did a.load(std::memory_order::acquire)
, this wouldn't impose that thr3
has to have a more recent view of thr1
.
i = 1;
and assert(i == 1);
are possibly a data race.
Either the assertion succeeds, or the behavior is undefined.
The problem is that it is unspecified whether a.store(1, std::memory_order::release);
synchronizes with a.load(std::memory_order::acquire)
.
The condition for this is found in [atomics.order] p2
An atomic operation A that performs a release operation on an atomic object M synchronizes with an atomic operation B that performs an acquire operation on M and takes its value from any side effect in the release sequence headed by A.
If the load
takes its value from the store(1)
at some point, then a happens before relationship is formed because i = 1
is sequenced before that store(1)
and store(1)
synchronizes with load()
in thr3
.
However, there is no guarantee that this happens ([intro.races] p14):
The value of an atomic object M, as determined by evaluation B, is the value stored by some unspecified side effect A that modifies M, where B does not happen before A.
A plausible sequence of events is that thr2
completes all of its work before thr3
has started.
thr3
then loads its value for a
from the side effect a.store(2, std::memory_order::release); // D
, so it never takes its value from the store
in thr1
.
In that event, you have a data race (between i = 1
and i == 1
), and the behavior is undefined.