assembly x86 binary machine-code intel-tsx

Assembly x86 REP, REPZ, REPNZ, XACQUIRE and XRELEASE instructions


As I have noticed, the 0xF3 binary prefix is used as:
1) rep: repeat and decrement ecx until ecx equals 0, with the INS, OUTS, MOVS, LODS and STOS instructions
2) repz (or repe): repeat and decrement ecx until ecx equals 0 or ZF is cleared, with the CMPS and SCAS instructions

The 0xF2 binary prefix is used as:
1) repnz (or repne): repeat and decrement ecx until ecx equals 0 or ZF is set, with the CMPS and SCAS instructions
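
The termination rules can be sketched in Python for scasb: repe/repz (0xF3) keeps going while ZF is set, and repne/repnz (0xF2) keeps going while ZF is clear. This is only a model, not real machine behaviour: the function name rep_scasb, the use of a bytes object in place of memory at [edi], and the assumption DF=0 are all made up for illustration.

```python
def rep_scasb(memory, al, ecx, prefix):
    """Model `repe scasb` / `repne scasb` with DF=0 (hypothetical helper).

    memory: bytes object standing in for the data at [edi]
    al:     byte value to compare against
    prefix: "repe" (0xF3) or "repne" (0xF2)
    Returns (bytes_scanned, ecx, zf) once the instruction finishes.
    """
    index = 0
    zf = False  # simplification: real hardware leaves ZF untouched if ecx is 0 on entry
    while ecx != 0:
        zf = (memory[index] == al)  # scasb compares AL with the current byte
        index += 1                  # edi advances by one byte
        ecx -= 1                    # ecx is decremented on every iteration
        if prefix == "repe" and not zf:
            break                   # repe/repz stops once ZF is clear
        if prefix == "repne" and zf:
            break                   # repne/repnz stops once ZF is set
    return index, ecx, zf
```

For example, rep_scasb(b"abcX", ord("X"), 10, "repne") scans four bytes and stops with ZF set, while the same call with ecx=2 stops after two bytes because ecx reached 0.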

Recently I noticed that the XACQUIRE/XRELEASE prefixes also have the same binary values (0xF2, 0xF3).

So what do XACQUIRE/XRELEASE do? I read something about locking a memory address, but I believe they do not work like lock.

Also, what will 0xF3 mov byte ptr [ecx],0x0 do? (Will it stop when ZF is set/not set, or only when ecx equals 0?)
And what will 0xF2 mov byte ptr [ecx],0x0 do?


Solution

  • Quoting the Intel Software Developer's Manual, Volume 2, Section 2.1.1:

    Use these prefixes only with string and I/O instructions (MOVS, CMPS, SCAS, LODS, STOS, INS, and OUTS). Use of repeat prefixes and/or undefined opcodes with other Intel 64 or IA-32 instructions is reserved; such use may cause unpredictable behavior.
    Some instructions may use F2H,F3H as a mandatory prefix to express distinct functionality.

    Using repeat prefixes with non-string, non-I/O instructions is undefined behaviour precisely for the reason you just discovered: Intel reuses these prefix bytes to express different flavours of the same "instruction" or to implement new extensions.

    In the case of the HLE prefixes (like xacquire), they are only valid with a specific set of instructions.
    For example, xacquire can be used only with ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG (with a memory destination and, except for XCHG, a lock prefix). These instructions don't accept repeat prefixes, so no ambiguity arises.
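
    As a rough sketch of how one prefix byte can mean different things depending on the instruction that follows, the Python below classifies 0xF2/0xF3 by mnemonic. The function name classify_prefix and the has_lock flag are hypothetical; a memory destination is assumed for the HLE cases, and the lock requirement and the xrelease-only MOV forms are taken from the Intel manual rather than from the list above.

```python
STRING_OPS = {"MOVS", "CMPS", "SCAS", "LODS", "STOS", "INS", "OUTS"}
COMPARING_STRING_OPS = {"CMPS", "SCAS"}  # the only ones that look at ZF

# Instructions that accept xacquire/xrelease when they also carry a lock prefix
# (memory destination assumed).
HLE_WITH_LOCK = {
    "ADD", "ADC", "AND", "BTC", "BTR", "BTS", "CMPXCHG", "CMPXCHG8B",
    "DEC", "INC", "NEG", "NOT", "OR", "SBB", "SUB", "XOR", "XADD",
}


def classify_prefix(prefix, mnemonic, has_lock=False):
    """Return the meaning of prefix byte 0xF2/0xF3 in front of `mnemonic`."""
    if prefix not in (0xF2, 0xF3):
        raise ValueError("only 0xF2 and 0xF3 are modelled")
    name = mnemonic.upper()

    if name in COMPARING_STRING_OPS:
        return "repne" if prefix == 0xF2 else "repe"
    if name in STRING_OPS:
        # Only 0xF3 (rep) is defined for the non-comparing string instructions.
        return "rep" if prefix == 0xF3 else "reserved"

    # XCHG is implicitly locked, so it needs no explicit lock prefix.
    if name == "XCHG" or (name in HLE_WITH_LOCK and has_lock):
        return "xacquire" if prefix == 0xF2 else "xrelease"
    # xrelease (but not xacquire) is also defined for MOV to memory.
    if name == "MOV" and prefix == 0xF3:
        return "xrelease"

    return "reserved"
```

    With this model, classify_prefix(0xF2, "add", has_lock=True) yields "xacquire", whereas classify_prefix(0xF2, "scas") yields "repne": the two readings never collide because the instruction sets are disjoint.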

    In general, non-relevant prefixes are ignored, so while adding a prefix to an instruction may cause undefined behaviour in future processors, it is safely ignored in older processors.

    That's why support for HLE doesn't need to be checked explicitly:

    Hardware without HLE support will ignore the XACQUIRE and XRELEASE prefix hints and will not perform any elision since these prefixes correspond to the REPNE/REPE IA-32 prefixes which are ignored on the instructions where XACQUIRE and XRELEASE are valid.

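    An instruction like 0xF3 mov byte ptr [ecx],0x0 behaves like a plain mov byte ptr [ecx],0x0 on today's processors: the prefix does not turn it into a loop. On hardware without HLE the prefix is simply ignored; on HLE-capable hardware the 0xF3 form is decoded as xrelease mov, which still performs the same store.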

    To put it explicitly: the repeat-prefix bytes are used to select different semantics for an instruction.

    Sometimes the instruction has an explicit name and the alternative semantics are close together (e.g. cmps, repe cmps, repne cmps, or the fact that tzcnt is 0xf3 bsf), and sometimes the instruction doesn't have an explicit name and the alternatives are less obvious (e.g. mulsd is 0xf2 mulps, mulss is 0xf3 mulps, mulpd is 0x66 mulps).
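
    The examples above can be written down as a small lookup table keyed by opcode and mandatory prefix. This is an illustrative Python sketch covering only those two opcodes; the table and the helper name mnemonic_for are made up here and are not part of any real disassembler.

```python
# Opcode bytes (after any prefix) mapped to {mandatory prefix: mnemonic}.
# None means "no mandatory prefix".
MANDATORY_PREFIX_FORMS = {
    (0x0F, 0x59): {None: "mulps", 0x66: "mulpd", 0xF3: "mulss", 0xF2: "mulsd"},
    (0x0F, 0xBC): {None: "bsf", 0xF3: "tzcnt"},
}


def mnemonic_for(opcode, prefix=None):
    """Look up the mnemonic selected by `prefix` for a two-byte opcode.

    Returns None when the combination is not in this (tiny) table.
    """
    forms = MANDATORY_PREFIX_FORMS.get(opcode, {})
    return forms.get(prefix)
```

    For instance, mnemonic_for((0x0F, 0x59), 0xF2) gives "mulsd" and mnemonic_for((0x0F, 0xBC), 0xF3) gives "tzcnt"; a processor that predates tzcnt ignores the prefix and executes the bytes as bsf.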

    More information on the xacquire prefix can be found in the Intel Software Developer Manuals or in this post.