Given this pseudocode, with a global atomic int a initialized to 0:
Thread 1:
// ... some code here (X) ...
a.store(1, relaxed);
futex_wake(&a);
Thread 2:
if (futex_wait(&a, 1) == woken_up) {
assert(a.load(relaxed) == 1);
// ... some code here (Y) ...
}
Ignoring the possibility of spurious wakeups, can we deduce from the code above that the code X synchronizes-with Y? Basically, it boils down to whether futex itself is meant to achieve acquire/release semantics across a wait that is woken up.
A bit of context: TSAN does not understand the futex system call (e.g. see here, here).
Now, normally, when using futex to implement a mutex, semaphore or some other synchronization primitive, one also has an atomic variable that is loaded with acquire ordering by the "locking" side and stored with release ordering by the "unlocking" side. (Above, I'm deliberately using relaxed semantics instead.)
That acquire/release pairing is enough to achieve synchronization and be formally correct, and it is recognized by TSAN (which does not report anything for locks implemented this way, e.g. QBasicMutex in Qt).
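To make the usual pattern concrete, here is a minimal sketch of the acquire/release hand-off described above. It is portable C++11 and does not call futex at all: a yield loop stands in for futex_wait, and the names payload and run_handoff are hypothetical, chosen only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; not taken from any real library.
std::atomic<int> a{0};
int payload = 0;  // plain, non-atomic data written by "X" and read by "Y"

// Runs one producer/consumer hand-off and returns what the consumer saw.
int run_handoff() {
    a.store(0, std::memory_order_relaxed);
    payload = 0;
    int seen = -1;

    std::thread consumer([&] {
        // Acquire load: once it reads 1, every write sequenced before the
        // matching release store is visible to this thread.
        while (a.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();  // stand-in for blocking in futex_wait
        }
        assert(a.load(std::memory_order_relaxed) == 1);
        seen = payload;  // "Y": safe only because of the acquire/release pair
    });

    payload = 42;                           // "X"
    a.store(1, std::memory_order_release);  // publish; futex_wake would follow
    consumer.join();
    return seen;
}
```

With the release store and acquire load in place, the read of payload is ordered after the write, so no data race exists for TSAN to report. Switching both operations to relaxed, as in the pseudocode above, removes that guarantee as far as the C++ memory model is concerned.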
This question is mostly about the suggestion offered in the forum post linked above to mark futex operations themselves with acquire/release semantics. Would such tagging be correct?
(I know that the C++ abstract machine does not know anything about futex. It does not even know anything about pthreads, but TSAN does, and knows that e.g. code that happens before a pthread_create also happens before the code running in the newly created thread. In other words, this is not a language lawyer question...)
From man futex(2):
The loading of the futex word's value, the comparison of that value with the expected value, and the actual blocking will happen atomically and will be totally ordered with respect to concurrent operations performed by other threads on the same futex word. Thus, the futex word is used to connect the synchronization in user space with the implementation of blocking by the kernel. Analogously to an atomic compare-and-exchange operation that potentially changes shared memory, blocking via a futex is an atomic compare-and-block operation.
That total ordering corresponds to C++ std::memory_order_seq_cst:
A load operation with this memory order performs an acquire operation, a store performs a release operation, and read-modify-write performs both an acquire operation and a release operation, plus a single total order exists in which all threads observe all modifications in the same order.
In other words, the futex syscall does in the kernel the equivalent of this C++11 call:
a.compare_exchange_strong(..., std::memory_order_seq_cst, std::memory_order_seq_cst);
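As a rough user-space model of that analogy (an assumption for illustration, not how the kernel actually implements it), the "compare" half of compare-and-block can be sketched with a seq_cst compare-exchange that writes the same value back. The helper name would_block is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, not a kernel or libc API. Returns true if `word`
// still holds `expected`, which is the case where futex_wait would go to
// sleep, and false if the value has already changed.
bool would_block(std::atomic<int>& word, int expected) {
    int observed = expected;
    // On success the same value is written back, so the word is unchanged.
    // On failure nothing is written and `observed` receives the current value.
    return word.compare_exchange_strong(observed, expected,
                                        std::memory_order_seq_cst,
                                        std::memory_order_seq_cst);
}

void demo() {
    std::atomic<int> word{1};
    assert(would_block(word, 1));   // value matches: a real wait would block
    assert(!would_block(word, 0));  // value differs: a real wait returns at once
    assert(word.load() == 1);       // the check never changes the word
}
```

The point of the sketch is only the ordering: a successful seq_cst read-modify-write is both an acquire and a release operation, which is the property the quoted text attributes to the futex check.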