This snippet is from page 19 of the slides for Herb Sutter's Atomic Weapons talk.
If I understand this correctly, Herb is saying that for the assert() in thread 3 to succeed, the code has to follow sequential consistency. So the following code will not fail the assertion.
    int g{0};                     // normal int
    std::atomic<int> x{0}, y{0};  // atomics

    void thread1() {
        g = 1;
        x.store(1, std::memory_order_seq_cst);
    }

    void thread2() {
        if (x.load(std::memory_order_seq_cst) == 1)
            y.store(1, std::memory_order_seq_cst);
    }

    void thread3() {
        if (y.load(std::memory_order_seq_cst) == 1)
            assert(g == 1);
    }
But wouldn't this also work if release/acquire were used instead, as follows?
    int g{0};                     // normal int
    std::atomic<int> x{0}, y{0};  // atomics

    void thread1() {
        g = 1;  // A
        x.store(1, std::memory_order_release);
    }

    void thread2() {
        if (x.load(std::memory_order_acquire) == 1)
            y.store(1, std::memory_order_release);
    }

    void thread3() {
        if (y.load(std::memory_order_acquire) == 1)
            assert(g == 1);  // B
    }
Q1 - Doesn't the fact that //A simply-happens-before //B ensure that the assertion is valid, since no other thread writes to g?
Q2 - Am I misunderstanding the purport of the slide, or is something wrong on the slide?
The example is important, but you're right that the heading and discussion should be updated now that the C++11 "acq/rel" meanings are well established.
This example can't be affected by "ordinary acq/rel" vs. "SC acq/rel", because the key difference between the two is that the former allows store-load reordering and the latter disallows it, as I covered a few slides earlier. This example has the other three combinations, with one thread each for store-store, load-store, and load-load; but it doesn't have store-load, so it's immune to the difference between those two acq/rel flavors.
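For contrast, the store-load case that the example lacks can be sketched with the classic store-buffering litmus test. This is a minimal illustration, not something from the slide: the helper names both_read_zero and count_both_zero are hypothetical. With seq_cst on every operation, the outcome where both threads read 0 is forbidden; with release stores and acquire loads it is permitted, although a given run may never actually show it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One round of the store-buffering test: each thread stores to one
// atomic and then loads the other (a store followed by a load).
// Returns true if both loads observed 0.
bool both_read_zero(std::memory_order store_order,
                    std::memory_order load_order) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;  // written by one thread each, read after join

    std::thread t1([&] {
        x.store(1, store_order);
        r1 = y.load(load_order);
    });
    std::thread t2([&] {
        y.store(1, store_order);
        r2 = x.load(load_order);
    });
    t1.join();
    t2.join();

    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often both threads read 0.
int count_both_zero(int rounds,
                    std::memory_order store_order,
                    std::memory_order load_order) {
    int hits = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero(store_order, load_order)) {
            ++hits;
        }
    }
    return hits;
}

// Example use:
//   assert(count_both_zero(200, std::memory_order_seq_cst,
//                          std::memory_order_seq_cst) == 0);
```

Calling count_both_zero with memory_order_release and memory_order_acquire may return a nonzero count on some hardware, which is exactly the store-load reordering that separates the two flavors. The three-thread example in the question contains no such store-then-load pair in any single thread.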
The main point I was aiming for in this slide (whose second half covers a total-store-order example) is that for a memory model to be usable, it can't talk only about individual threads' operations in isolation and still cover all cases. To have a consistent memory model a programmer can reason about, each thread's operations also have to work in an SC manner both transitively (when daisy-chained together like this) and globally (for a total store order, which was the other example on this slide); otherwise we can violate causality and humans can't write a coherent program.
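The "global" requirement is commonly illustrated with the independent-reads-of-independent-writes (IRIW) pattern. The following is a hedged sketch of that pattern, not a reproduction of the slide's second example; Observed, run_iriw_once, readers_disagree, and count_disagreements are hypothetical names. Two writers store to different atomics and two readers load them in opposite orders. With seq_cst everywhere, the readers cannot disagree about which store happened first.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values seen by the two reader threads in one round.
struct Observed {
    int r1, r2;  // reader A: x first, then y
    int r3, r4;  // reader B: y first, then x
};

// One round of IRIW with sequentially consistent operations.
Observed run_iriw_once() {
    std::atomic<int> x{0}, y{0};
    Observed o{-1, -1, -1, -1};

    std::thread writer_x([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread writer_y([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread reader_a([&] {
        o.r1 = x.load(std::memory_order_seq_cst);
        o.r2 = y.load(std::memory_order_seq_cst);
    });
    std::thread reader_b([&] {
        o.r3 = y.load(std::memory_order_seq_cst);
        o.r4 = x.load(std::memory_order_seq_cst);
    });

    writer_x.join();
    writer_y.join();
    reader_a.join();
    reader_b.join();
    return o;
}

// True if reader A saw "x before y" while reader B saw "y before x".
bool readers_disagree(const Observed& o) {
    return o.r1 == 1 && o.r2 == 0 && o.r3 == 1 && o.r4 == 0;
}

// Repeats the round and counts disagreements between the readers.
int count_disagreements(int rounds) {
    int hits = 0;
    for (int i = 0; i < rounds; ++i) {
        if (readers_disagree(run_iriw_once())) {
            ++hits;
        }
    }
    return hits;
}
```

Because all seq_cst operations take part in a single total order, count_disagreements should always return 0 here. If the stores were release and the loads acquire, the standard would permit the disagreeing outcome, since nothing then forces the two readers to agree on one order for the two independent stores.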
I think the reason I mentioned "acq/rel" in the slide title is that during the development of the C++ memory model there were many proposals for more relaxed rules than we standardized, including ideas about just letting individual pairs of acq/rel be enough without a transitive property, and those models are too relaxed to be useful. But I agree that since "acq/rel" is now well established to mean the m_o_acq and m_o_rel in the standard, I should stick to those meanings of "acq/rel" to avoid confusion (they were still very new at the time I gave this talk and were still being implemented by compilers, but this was 2012 and I still should not have made them the issue here).
I've updated the slide title for the next time I give the talk. Thanks!