assembly x86 cpu-registers carryflag eflags

x86 sbb with same register as first and second operand


I am analyzing a sequence of x86 instructions and am confused by the following code:

135328495: sbb edx, edx
135328497: neg edx
135328499: test edx, edx
135328503: jz 0x810f31c

I understand that sbb computes dst = dst - (src + CF); in other words, the first instruction somehow puts -CF into edx. The neg then turns -CF into CF, and the test checks whether CF equals zero?

But note that jz checks flag ZF, not CF! So basically what is the above code sequence trying to do? This is a legal x86 instruction sequence, produced by g++ version 4.6.3.

The C++ code is actually from the Botan project. You can find the overall assembly code (the Botan RSA decryption example) here. There are quite a lot of such instruction sequences in the disassembled code.


Solution

  • sbb edx, edx
    

    Your analysis of this instruction is correct. SBB means "subtract with borrow". It subtracts the source from the destination in a way that takes the carry flag (CF) into account.

    As such, it is equivalent to dst = dst - (src + CF), so this is edx = edx - (edx + CF), or simply edx = -CF.

    Don't let it fool you that the source and destination operands are both edx here! SBB same, same is a pretty common idiom in compiler-generated code to isolate the carry flag (CF), especially when the compiler is trying to generate branchless code. There are alternative ways of doing this, namely the SETC instruction, which is probably faster on most x86 architectures (see comments for a more thorough dissection), but not by a significant amount. Compilers from different vendors (and possibly even different versions of the same compiler) tend to prefer one or the other and use it everywhere unless you ask for architecture-specific tuning.
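As a rough illustration of why the old value of edx drops out, here is a small Python model of the arithmetic. It is only a sketch: the helper names sbb32 and to_signed32 are made up for this example, and the model tracks just the 32-bit result, not the flags that a real SBB also updates.

```python
MASK32 = 0xFFFFFFFF  # keep results in 32 bits, like a register such as edx


def sbb32(dst, src, cf):
    """Hypothetical model of 32-bit SBB: dst - (src + CF), wrapped to 32 bits."""
    return (dst - (src + cf)) & MASK32


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    # Whatever edx held before, "sbb edx, edx" leaves 0 (CF clear) or -1 (CF set).
    for edx in (0, 1, 0x12345678, 0xFFFFFFFF):
        for cf in (0, 1):
            result = sbb32(edx, edx, cf)
            print(f"edx={edx:#010x} CF={cf} -> {result:#010x} ({to_signed32(result)})")
```

The result depends only on CF, which is what makes the same-register form useful as a way of copying the flag into a general-purpose register.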

  • neg edx
    

    Again, your analysis of this instruction is correct. It's a pretty simple one. NEG performs a two's-complement negation on its operand. Therefore, this is just edx = -edx.

    In this case, we know that edx originally contained -CF, which means that its initial value was either 0 or -1 (because CF is always either 0 or 1, on or off). Negating it means that edx now contains either 0 or 1.

    That is, if CF was originally set, edx will now contain 1; otherwise, it will contain 0. This is really the completion of the idiom discussed above; you need the NEG to fully isolate the value of CF.
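To make the negation step concrete, here is a minimal Python sketch of it. The function name neg32 is invented for this example, and only the register value is modeled (a real NEG also updates flags).

```python
MASK32 = 0xFFFFFFFF  # 32-bit register width


def neg32(value):
    """Hypothetical model of 32-bit NEG: two's-complement negation, wrapped to 32 bits."""
    return (-value) & MASK32


if __name__ == "__main__":
    # After "sbb edx, edx", edx is either 0 or 0xFFFFFFFF (that is, -1).
    print(neg32(0x00000000))  # CF was clear: edx stays 0
    print(neg32(0xFFFFFFFF))  # CF was set: -1 becomes 1
```

So the pair sbb/neg ends with edx holding the original carry flag as an ordinary 0-or-1 integer.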

  • test edx, edx
    

    The TEST instruction is the same as the AND instruction, except that it does not affect the destination operand—it only sets flags.

    But this is another special case. TEST same, same is a standard idiom in optimized code to efficiently determine if the value in a register is 0. You could write CMP edx, 0, which is what a human programmer would naïvely do, but TEST has a shorter encoding (it needs no immediate operand) and is never slower. (Why does this work? Because of the truth table for bitwise AND: value & value is just value, so the result is 0 only when value is 0.)

    So this has the effect of setting flags. Specifically, it sets the zero flag (ZF) if edx is 0, and clears it if edx is non-zero.

    Therefore, if CF was originally set, ZF will now be clear; otherwise, it will be set. Perhaps a simpler way of looking at it is that these three instructions set ZF to the opposite of the original value of CF.

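    Here are the two possible data flows:

      CF = 0  →  sbb: edx = 0   →  neg: edx = 0  →  test: ZF = 1
      CF = 1  →  sbb: edx = -1  →  neg: edx = 1  →  test: ZF = 0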

  • jz 0x810f31c
    

    Finally, this is a conditional jump based on the value of ZF. If ZF is set, it jumps to 0x810f31c; otherwise, it falls through to the next instruction.

    Putting everything together, then, this code tests the complement of the carry flag (CF) via an indirect route that involves the zero flag (ZF). It branches if the carry flag was originally clear, and falls through if the carry flag was originally set.

    That's how it works. That said, I cannot explain why the compiler chose to generate the code this way. It appears to be sub-optimal on a number of levels. Most obviously, the compiler could have simply emitted a JNC instruction (jump if not carry). Although Peter Cordes and I have made various other observations and speculations in comments, I don't think it makes sense to incorporate all of this into an answer unless a bit more information can be provided about the origin of this code.
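For anyone who wants to double-check the net effect, the whole four-instruction sequence can be modeled in a few lines of Python and compared against a plain "jump if not carry". This is a sketch under stated assumptions: the function names are invented here, the starting value of edx is arbitrary, and only the register value and ZF are tracked.

```python
MASK32 = 0xFFFFFFFF  # 32-bit register width


def jz_taken_after_sequence(cf, edx=0xDEADBEEF):
    """Model sbb/neg/test/jz and report whether the jz branch is taken.

    cf is the carry flag on entry (0 or 1); edx is an arbitrary starting value.
    """
    edx = (edx - (edx + cf)) & MASK32  # sbb edx, edx  -> 0 or 0xFFFFFFFF
    edx = (-edx) & MASK32              # neg edx       -> 0 or 1
    zf = int((edx & edx) == 0)         # test edx, edx -> ZF = (edx == 0)
    return zf == 1                     # jz jumps when ZF is set


def jnc_taken(cf):
    """Model of a single JNC: the branch is taken when CF is clear."""
    return cf == 0


if __name__ == "__main__":
    for cf in (0, 1):
        print(f"CF={cf}: jz taken={jz_taken_after_sequence(cf)}, jnc taken={jnc_taken(cf)}")
```

In this model the two agree for both values of CF, which matches the observation that a single JNC would have made the same branch decision. The difference is that the longer sequence also leaves the 0-or-1 value in edx.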