c++ multithreading visual-c++ volatile memory-fences

Does MS-specific volatile prevent hardware instruction reordering?


From the documentation:

Microsoft Specific

When the /volatile:ms compiler option is used—by default when architectures other than ARM are targeted—the compiler generates extra code to maintain ordering among references to volatile objects in addition to maintaining ordering to references to other global objects. In particular:

  • A write to a volatile object (also known as volatile write) has Release semantics; that is, a reference to a global or static object
    that occurs before a write to a volatile object in the instruction
    sequence will occur before that volatile write in the compiled
    binary.
  • A read of a volatile object (also known as volatile read) has Acquire semantics; that is, a reference to a global or static object
    that occurs after a read of volatile memory in the instruction
    sequence will occur after that volatile read in the compiled binary.

This enables volatile objects to be used for memory locks and releases in multithreaded applications.

This clearly guarantees that volatile prevents the compiler from reordering instructions at compile time (it explicitly states that the order in the instruction sequence is preserved in the compiled binary).

But as we all know, there is also hardware reordering: the CPU can reorder instructions on its own at run time. Does volatile prevent that as well? I know that synchronization primitives (such as mutexes) do, but what about MS-specific volatile?


Solution

  • The MSDN docs on the MS-specific volatile behavior go all the way back to VS2003. So it's been around for a while—long before the existence of std::atomic in C++11.

    So MS-specific volatile seems to be how one achieved acquire/release semantics in the old days. It is now basically obsolete, and the docs include a footnote nudging you away from MS-specific volatile in favor of std::atomic and /volatile:iso for inter-thread communication.


    As for why they exclude ARM, Microsoft didn't pick up ARM until relatively recently. Other than ARM, they support x86, x64, and Itanium (which is dead).

    On x86 and x64, most loads and stores already have acquire/release semantics (with exceptions such as non-temporal stores). So as long as the compiler doesn't reorder anything, the processor won't either* and will therefore preserve the acquire/release semantics. The /volatile:ms flag tells the compiler not to reorder anything so that acquire/release semantics can be achieved on x86 and x64.

    Since Microsoft's ARM support is relatively new, and MS-specific volatile (/volatile:ms) is outdated in favor of std::atomic, they probably decided to abandon the classic volatile semantics rather than updating them to work on ARM as well (which would probably mean adding memory barriers everywhere, given the lack of hardware support).

    *The processor will still do whatever reordering it wants, but it will preserve the acquire/release semantics of the program, since that is required by x86/x64 (apart from exceptional cases such as non-temporal stores or clflush). How it does this without violating memory ordering is a different topic.
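
For comparison, here is a minimal sketch of the same acquire/release handoff written with std::atomic instead of MS-specific volatile. The names payload, ready, producer, consumer and run_handoff are hypothetical and only for illustration. On x86/x64 a release store and an acquire load normally compile to plain mov instructions, which is why the old volatile approach only had to stop the compiler from reordering. On ARM the compiler has to emit the required barriers or ordered load/store instructions itself.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // ordinary, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes the data

void producer() {
    payload = 42;                                  // plain write
    // Release store: the write to payload above cannot be moved after it.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire load: the read of payload below cannot be moved before it.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;   // sees the value written before the release store
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Calling run_handoff() should return 42 on any conforming C++11 implementation, regardless of the /volatile setting or the target architecture, because the ordering comes from the std::atomic operations and not from volatile.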