c++language-lawyerc++20memory-barriersstdatomic

How does libcxx std::counting_semaphore implement "Strongly happens before" for release / acquire?


libc++ std::counting_semaphore uses atomic increment with memory_order_release in release method:

    void release(ptrdiff_t __update = 1)
    {
        if(0 < __a.fetch_add(__update, memory_order_release))
            ;
        else if(__update > 1)
            __a.notify_all();
        else
            __a.notify_one();
    }

And compare exchange with memory_order_acquire successful memory order in acquire method:

    void acquire()
    {
        auto const __test_fn = [=]() -> bool {
            auto __old = __a.load(memory_order_relaxed);
            return (__old != 0) && __a.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
        };
        __cxx_atomic_wait(&__a.__a_, __test_fn);
    }

Obvious choice to make acquire synchronized with release.


However, C++20 draft says:

void release(ptrdiff_t update = 1);

...

Synchronization: Strongly happens before invocations of try_­acquire that observe the result of the effects.

Strongly happens before is somewhat more than synchronizes with, C++20 draft says:

An evaluation A strongly happens before an evaluation D if, either

  • (12.1) A is sequenced before D, or
  • (12.2) A synchronizes with D, and both A and D are sequentially consistent atomic operations ([atomics.order]), or
  • (12.3) there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or
  • (12.4) there is an evaluation B such that A strongly happens before B, and B strongly happens before D.

I guess 12.2 would have been the best fit here, where A is fetch_add, D is compare_exchange_strong. But in addition to being synchronized with, they should have been seq_cst!

Trying 12.3 doesn't seem to help either. We call fetch_add B, and compare_exchange_strong C. Fine, but where are A and D then?

So how does it work (according to the C++20 Standard draft)?


The same applies to std::latch and std::barrier.

I picked one (std::semaphore) to refer easily specific paragraphs and lines.


Solution

  • As referenced in P0668R5:

    [...] Normally the strongly-happens-before guarantee will be used to order the users code before and after the calls, meaning that the requirement is implicitly satisfied by the addition of user code. [...]

    This appears to be aimed at (12.3):

    • there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or

    In other words, A and D represent user code, B corresponds to release(), and C signifies acquire(). Given that B simply happens before C (due to the release-acquire synchronization mentioned in your question), we establish that A strongly happens before D, which achieves the desired ordering. However, in line with this rationale, it appears that we should incorporate the simply-happens-before guarantee in the synchronization phrasing of release().

    Alternatively, we could integrate acquire/release operations into the strongly-happens-before relation:

    • A synchronizes with D, A is a release atomic operation, and D is an acquire atomic operation ([atomics.order]), or

    This approach seems more logical because the original purpose of strongly-happens-before is to resolve the ordering issue with mixed seq_cst and acquire/release operations. However, acquire/release operations alone are unaffected by this issue, as articulated in the paper:

    [...] Code that consistently accesses each location only with memory_order_seq_cst, or only with weaker ordering, is not affected and functions correctly.