assembly x86 cpu-architecture cpu-registers eflags

Can the status register influence data storage in a CPU?


When you perform an arithmetic operation and it results in, say, the overflow flag being set (or any other flag), can the flag bits affect which register the result of the operation is stored in? Or can this only happen for branch instructions?

I tried to test this myself by creating an assembly file, converting it into an executable and analyzing it with GDB, but for some reason it always shows me the wrong architecture. So I decided to come here directly to see if I can get an answer.


Solution

  • There are no x86 instructions where bits from EFLAGS form part of the register number that encodes the destination.

    Register destinations are strictly from the machine code, so the out-of-order machinery always knows which register(s) are written by an instruction just by decoding it.

    (If such an instruction existed, it would have to serialize the pipeline so that register renaming had the outputs of previous instructions available, similar to how switching modes (16-bit vs. 32-bit vs. 64-bit) has to serialize so that the same machine code can decode differently.)
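    To make "destinations come strictly from the machine code" concrete, here is a minimal sketch of a toy decoder. It handles only the register-direct `op r/m32, r32` form of two opcodes (`01` for add, `89` for mov); the function name `decode_dest` and the tables are made up for illustration. The point is that the function takes the instruction bytes and nothing else: there is no FLAGS input that could change the answer.

```python
# Toy decoder for one x86 encoding form: "op r/m32, r32" with a
# register-direct ModRM byte (mod == 3). Not a full decoder.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
OPCODES = {0x01: "add", 0x89: "mov"}  # both are "op r/m32, r32" forms


def decode_dest(machine_code: bytes):
    """Return (mnemonic, destination, source) using only the instruction bytes."""
    opcode, modrm = machine_code[0], machine_code[1]
    mod = modrm >> 6             # bits 7-6: addressing mode
    reg = (modrm >> 3) & 0b111   # bits 5-3: the r32 operand (source here)
    rm = modrm & 0b111           # bits 2-0: the r/m32 operand (destination here)
    if opcode not in OPCODES or mod != 0b11:
        raise ValueError("only register-direct add/mov are modelled")
    return OPCODES[opcode], REG32[rm], REG32[reg]
```

    For example, `decode_dest(bytes([0x01, 0xD8]))` identifies `add eax, ebx` with EAX as the destination, whatever the flags happen to be when it executes.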


    Instructions like cmovcc eax, [rdi] unconditionally read [RDI], EAX, and EFLAGS, and unconditionally write EAX (with a value produced by an ALU, selected from one of the two integer inputs according to the FLAGS condition). So it will fault on a bad address regardless of the condition being true or false. And there's a data dependency of the output on all three inputs. (CPUs handle it like an adc instruction as far as tracking dependencies, not like a branch around a mov.)
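    The behaviour described above can be sketched as a small software model. This is only an illustration, not how hardware is built: `Fault`, `load32`, and `cmov_load` are hypothetical names, and memory is assumed to be a plain dict of valid addresses.

```python
class Fault(Exception):
    """Stands in for a page fault or #GP on a bad address."""


def load32(memory: dict, addr: int) -> int:
    # 'memory' is a hypothetical dict mapping valid addresses to values;
    # any address not in it is treated as unmapped.
    if addr not in memory:
        raise Fault(hex(addr))
    return memory[addr]


def cmov_load(cond: bool, eax: int, memory: dict, rdi: int) -> int:
    """Model of `cmovcc eax, [rdi]`: returns the new value of EAX."""
    loaded = load32(memory, rdi)    # always performed, even if cond is false
    return loaded if cond else eax  # select; EAX is written either way
```

    With `cond` false and a valid address, the result is the old EAX value; with `cond` false and a bad address, the model still raises `Fault`, mirroring the unconditional load. The return value depends on all three inputs (old EAX, the loaded value, and the condition).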

    The upcoming APX extension adds fault-suppressing CMOV load and store forms, which you can think of as being like ARM predicated loads or stores: they act like a NOP if the condition is false. But that's still just writing a register or not; there's no possibility of writing a different register depending on data.

    Similarly, AVX-512 masked instructions like vaddps xmm0{k1}, xmm1, xmm2 will leave xmm0 unmodified if the mask is all zeros, but that's not different from CMOV.
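    As a rough illustration of merge-masking, here is a sketch that models only the four 32-bit lanes of an XMM register as a Python list. The function name is made up, and Python floats are doubles rather than single-precision, so this shows the selection behaviour only, not exact arithmetic.

```python
def vaddps_merge_masked(dst, k, a, b):
    """Model of `vaddps xmm0{k1}, xmm1, xmm2` with merge-masking, 4 lanes.

    dst, a, b are 4-element lists; k is the mask as an integer.
    The same destination is always written; each lane gets either the
    sum (mask bit set) or its previous value (mask bit clear).
    """
    return [a[i] + b[i] if (k >> i) & 1 else dst[i] for i in range(4)]
```

    With `k = 0` the result equals the old `dst`, which is the "left unmodified" case; with other masks, only the lanes whose mask bit is set change. In every case the destination register is the one named in the instruction.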

    There's no way to index a different architectural register according to data in other registers. That would be very inconvenient for modern out-of-order exec CPUs with register renaming, and no legacy x86 instructions do that either. (Model-Specific Registers (MSRs) are read/written by rdmsr / wrmsr with an index in ECX, but those are serializing instructions already and trigger microcode to poke at the internals; unlike the integer and vector registers, the MSRs aren't renamed.)

    Some background on how modern CPUs work may help explain why data-dependent register indexing would be a disaster. Keep in mind that the Register Allocation Table (RAT) has to update with 4 to 6 renames every cycle, depending on pipeline width (when the front-end isn't stalled), and that having future instructions available to find independent work in is key to out-of-order exec finding instruction-level parallelism (ILP).
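    A very simplified sketch of what renaming does, to show why it needs register numbers at decode time. The `Renamer` class is hypothetical and leaves out everything a real design has (free lists, reclaiming physical registers, handling several instructions per cycle).

```python
class Renamer:
    """Toy rename table: architectural register name -> physical register id."""

    def __init__(self, arch_regs):
        self.rat = {r: i for i, r in enumerate(arch_regs)}
        self.next_phys = len(arch_regs)

    def rename(self, dest, sources):
        # Sources read the current mapping; the destination gets a fresh
        # physical register. Both steps need the register names up front,
        # before any earlier instruction has produced its result or flags.
        phys_sources = [self.rat[s] for s in sources]
        self.rat[dest] = self.next_phys
        self.next_phys += 1
        return self.rat[dest], phys_sources
```

    For example, after `r = Renamer(["eax", "ebx", "ecx"])`, renaming `add eax, ebx` as `r.rename("eax", ["eax", "ebx"])` gives `(3, [0, 1])`, and a following instruction that reads EAX sees physical register 3. If the destination depended on FLAGS from a not-yet-executed instruction, `rename` could not update the table until that instruction finished, and everything after it would have to wait.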


    "Or can this only happen for branch instructions?"

    Branch instructions only write RIP, never a different register, so no, they aren't an example of what you're talking about. (Indirect call reg/mem also writes RSP and memory, but again not dependent on FLAGS.)