c++c++11assemblycpu-architectureatomic

minimum required atomic instructions to support C++11 concurrency libraries


I'm implementing a multi core system consisting of several custom/specialty CPUs. Those CPUs need to be able to support the C++11 concurrency libraries (thread/mutex etc.).

I'm not sure what kind of atomic instructions the CPUs needs to implement at a minimum to support the C++11 concurrency features. I'm reading up on different cpus (x86/risc-v etc), but it's not clear to me what instructions are actually essential and which ones are just nice to have for performance gain. For example Risc-v does not implement a CAS (compare-and-swap). So it seems to me this is a non-essential instruction.

I was wondering if somebody with knowledge of low-level multi-core systems could answer my question. Thank you


Solution

  • CAS and LL/SC are both general building blocks; any atomic RMW operation can be synthesized from either them. You need one or the other for lock-free std::atomic<int>. Some ISAs have both (notably ARMv8.1), most have one or the other; most older RISCs like PowerPC having just LL/SC. x86-64 has only CAS. Direct support for specific operations like .fetch_add can avoid retry loops for those operations, but that's not really qualitatively different. See


    std::atomic_flag is required to be lock-free, and can be used to implement locking, which allows non-lock-free std::atomic<T>. So the true minimum required to implement C++11 at all is just the primitive things early ISAs had which were sufficient for locking but aren't building blocks for arbitrary lock-free atomic RMW operations the way LL/SC and CAS are.

    For it, you only need a minimum of TAS (test-and-set) or something which can be used as a TAS, like exchange. And atomic byte or word stores for .clear(), and control of memory ordering via ISA guarantees, fence instructions, or ordering semantics baked into atomic RMW / load / store instructions. (And in C++20, atomic byte or word loads for .test(). Before that, ISAs didn't need to support seq_cst pure loads.)

    See Why does C/C++ not have a atomic flag test_and_clear? for more discussion of what some old ISAs have and how they work as building blocks, like m68k actually having an instruction called tas. (cas only in 68020 and later.) And several ISAs having an atomic RMW swap/exchange instruction.

    For ordering support, it's always fine to be stronger than the standard requires, even if that means promoting all acquire and release ops to seq_cst. But it's not fine if an ISA is unable to recover sequential consistency. CPU architects know this, and provide fence instructions if their ISA allows runtime memory reordering in the first place.

    (Unless you care a lot about retro hardware, just use std::atomic<bool> like a normal person.)


    <thread>

    No special instruction, but a timer interrupt of some sort and an ISA with interrupt handling is pretty necessary for pre-emptive multi-tasking. Without that, a compiler would have to insert yield() calls at a lot of points to make loops non-infinite and ensure that std::atomic loads see stores in finite time, among other things.


    <mutex>

    Mutual exclusion without an atomic RMW is possible with just sequentially consistent load/store (Dekker's or Peterson's algorithms, or a few others), but those require an array of flags or counters with an entry for each thread. Not really compatible with being able to start new std::threads at any time according to program logic, after std::mutex objects already exist and are already in use.

    std::atomic_flag is sufficient for mutual exclusion with a simple spinlock. I'm not sure if that's sufficient for a proper implementation of something like Linux futex OS-assisted sleep-while-value-unchanged like C++20 std::atomic .wait() and .notify_one() / .notify_all(). (The OS needs to atomically check-and-sleep to avoid putting a core to sleep or context-switching after a notify has already arrived or after a value-change has become visible. Or at least not to lose notifications.)