c++ visual-c++ x86 intrinsics visual-c++-2010

How do you use the pause assembly instruction in 64-bit C++ code?


Since inline assembly is not supported by VC++ 2010 in 64-bit code, how do I get the x86-64 pause instruction into my code? There does not appear to be an intrinsic for it, as there is for many other common instructions (e.g., __rdtsc(), __cpuid()).

As for why: I want the instruction for a busy-wait use case, so that the (hyperthreaded) CPU stays available to other threads running on it (see: Performance Insights at intel.com). The pause instruction is very helpful for this use case as well as for spin-lock implementations, so I can't understand why MS did not include it as an intrinsic.

Thanks


Solution

  • Wow, this was a very hard problem to track down, but in case anybody else needs the x86-64 pause instruction:

    The YieldProcessor() macro from windows.h expands to the _mm_pause intrinsic (declared in emmintrin.h, though I could not find it documented in MSDN), which ultimately compiles to the pause instruction in both 32-bit and 64-bit code.

    This is essentially undocumented, by the way: MSDN has only partial documentation for YieldProcessor(), and what is there is incorrect for VC++ 2010.

    Here is an example of what a block of YieldProcessor() macros compiles into:

        ::YieldProcessor();
    000000013FDB18A0 F3 90                pause
        ::YieldProcessor();
    000000013FDB18A2 F3 90                pause
        ::YieldProcessor();
    000000013FDB18A4 F3 90                pause
        ::YieldProcessor();
    000000013FDB18A6 F3 90                pause
        ::YieldProcessor();
    000000013FDB18A8 F3 90                pause
    

    By the way, each pause instruction seems to produce a delay of about 9 cycles on average on the Nehalem architecture (i.e., roughly 3 ns on a 3.3 GHz CPU).
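
    For anyone who wants to see the busy-wait use case in context, here is a minimal sketch of a spin lock that issues pause while it waits. It is an illustration under some assumptions, not the one true way to do it: it calls _mm_pause directly (on Windows you could call YieldProcessor() at the same spot), and it uses std::atomic and std::thread, which need a C++11 compiler and so are not available in VC++ 2010 itself. The names cpu_relax, SpinLock, and run_spin_demo are made up for this example.

```cpp
#include <atomic>
#include <thread>

#if defined(_M_X64) || defined(_M_IX86) || defined(__x86_64__) || defined(__i386__)
#include <emmintrin.h>  // declares _mm_pause
// Hypothetical helper: emits one pause instruction on x86/x86-64.
inline void cpu_relax() { _mm_pause(); }
#else
// Assumption: on non-x86 targets this sketch simply does nothing here.
inline void cpu_relax() {}
#endif

// Hypothetical minimal spin lock; not a replacement for a real mutex.
class SpinLock {
    std::atomic<bool> locked_;
public:
    SpinLock() : locked_(false) {}

    void lock() {
        // exchange() returns the previous value; true means another
        // thread holds the lock, so hint the CPU and try again.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            cpu_relax();
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Hypothetical demo: two threads increment a shared counter under the lock
// and the final count is returned.
inline long run_spin_demo(int iterations_per_thread) {
    SpinLock lock;
    long counter = 0;
    auto worker = [&]() {
        for (int i = 0; i < iterations_per_thread; ++i) {
            lock.lock();
            ++counter;
            lock.unlock();
        }
    };
    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();
    return counter;
}
```

    Calling run_spin_demo(n) should return 2 * n if the lock is doing its job. The pause in the wait loop does not change correctness; it only tells the core that this is a spin loop so the sibling hyperthread gets more of the execution resources.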