assembly x86 micro-optimization zero-initialization

Why does some Windows bootloader code zero registers with `sub` instead of `xor`?


Given considerations such as those detailed in https://stackoverflow.com/a/33668295, `xor reg, reg` seems to be the best way to zero a register. But when I examine real-world assembly code (such as Windows bootloader code, IIRC), I see both `xor reg, reg` and `sub reg, reg` used.

Why is `sub` used at all for this purpose? Are there any reasons to prefer `sub` in some special cases? For example, does it set flags differently from `xor`?


Solution

  • Differences:

    I'd guess it's just different authors of hand-written asm, some of them preferring `sub`, probably without realizing that some CPUs only special-case `xor` as a zeroing idiom. The exception would be code that wants to guarantee AF is cleared, where `sub` might be intentional: `sub reg, reg` sets every arithmetic flag to a defined value (including AF = 0), whereas `xor` leaves AF architecturally undefined. That could matter when initializing things and wanting a fully known state for EFLAGS before something that might use `pushf`.

    `xor` leaving AF undefined still means it will be either 0 or 1; you just don't know which. (This is not like C undefined behaviour.) The actual result could depend on the CPU model, the input values, or possibly even some stray bits somewhere.

    On modern CPUs that recognize `sub` as a zeroing idiom, AF after `xor` will be zero, so the CPU can handle xor-zeroing and sub-zeroing exactly identically, including the FLAGS result.
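
    As a minimal sketch of the "known EFLAGS state" point, the program below deliberately sets AF, zeroes EAX with `sub`, and then reads the flags back with `pushfq`. It assumes x86-64 Linux and a NASM + ld toolchain, neither of which comes from the question; the file name `af.asm` is hypothetical. The exit status is the AF bit, which is architecturally guaranteed to be 0 after `sub eax, eax`.

    ```nasm
    ; Assumed build: nasm -f elf64 af.asm && ld -o af af.o
    global _start
    section .text
    _start:
        mov   eax, 0x0F
        add   eax, 1          ; 0x0F + 1 carries out of bit 3, so AF = 1
        sub   eax, eax        ; EAX = 0; ZF=1, PF=1, CF=OF=SF=AF=0 (all defined)
        pushfq                ; push RFLAGS onto the stack
        pop   rdi             ; RDI = RFLAGS
        shr   edi, 4          ; move AF (bit 4) down to bit 0
        and   edi, 1          ; isolate AF
        mov   eax, 60         ; Linux x86-64 sys_exit
        syscall               ; exit status = AF after the sub
    ```

    Replacing `sub eax, eax` with `xor eax, eax` still zeroes EAX and sets ZF, PF, CF, OF, and SF the same way, but the manuals leave AF undefined for `xor`, so the exit status would then be whatever the particular CPU happens to produce.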