Since −x = not(x)+1 which then implies a-b = a+not(b)+1, would then
sub rax, rcx
be equivalent to
mov temp, rcx
not temp
add rax, temp
add rax, 1
where temp is some register considered to be volatile?
In other words, does the latter affect EFLAGS in the exact same way? If not, how can it be forced to?
Yes, that gets the same integer result in RAX.
In other words, does the latter affect EFLAGS in the exact same way?
Of course not. ZF, SF, and PF depend only on the integer result, but CF and OF (and AF, see footnote 1) depend on how you get there. x86's CF carry flag is a borrow output from subtraction. (Unlike some ISAs such as ARM, where subtraction sets the carry flag if there was no borrow.)
Trivial counterexample you could check in your head: 0 - 1 with sub sets CF=1. But your way clears CF.
mov temp, rcx # no effect on FLAGS
not temp # no effect on FLAGS, unlike most other x86 ALU instructions
add rax, temp # temp = ~1 = 0xFF..FE. 0 + anything clears CF
add rax, 1 # 0xFF..FE + 1 = 0xFF..FF = -1. No carry-out, so this clears CF
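The trace above can be checked with a small C model. This is only a sketch under stated assumptions: it uses 8-bit values instead of 64-bit registers so the whole input space can be enumerated, it models CF only, and `sub_cf`, `add_cf`, `not_add_add_cf` and `check_not_add_add` are hypothetical helper names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* CF after `sub a, b`: x86 sets it when the subtraction borrows. */
int sub_cf(uint8_t a, uint8_t b) { return a < b; }

/* CF after `add a, b`: set when the unsigned sum needs a 9th bit. */
int add_cf(uint8_t a, uint8_t b) { return ((unsigned)a + b) > 0xFFu; }

/* CF left behind by: mov temp,b / not temp / add a,temp / add a,1.
   Only the final `add a, 1` decides the flag; the first add's CF is overwritten. */
int not_add_add_cf(uint8_t a, uint8_t b) {
    uint8_t temp = (uint8_t)~b;             /* NOT writes no flags */
    uint8_t partial = (uint8_t)(a + temp);  /* first ADD: its CF is lost */
    return add_cf(partial, 1);              /* second ADD: final CF */
}

/* Assumed self-check: call it from your own main(). */
void check_not_add_add(void) {
    assert(sub_cf(0, 1) == 1);          /* 0 - 1 borrows */
    assert(not_add_add_cf(0, 1) == 0);  /* the 4-instruction version clears CF */
    /* In this model the final CF is set exactly when a == b (result wraps to 0). */
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            assert(not_add_add_cf((uint8_t)a, (uint8_t)b) == (a == b));
}
```

In this model the final CF only says whether the result wrapped to zero, which is unrelated to whether the subtraction borrowed.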
(Fun fact: not doesn't affect FLAGS, unlike most other ALU instructions including neg. neg sets flags the same as sub from 0. A strange quirk of x86 history. https://www.felixcloutier.com/x86/not#flags-affected)
Footnote 1: so does AF, the half-carry (auxiliary) flag for the carry from the low to the high nibble of the low byte. You can't branch on it directly, and x86-64 removed the BCD instructions like aaa that read it, but it's still there in RFLAGS, where you can read it with pushf / pop rax, for example.
If not, how can it be forced to?
Use different instructions. The easiest and most efficient way to get the desired effect on EFLAGS would be to optimize it back to sub rax, rcx. That's why x86 has sub and sbb instructions. If that's what you want, use them.
If you want an alternative, you definitely need to avoid something like add rax,1 as the last step. That would set CF only if the final result is zero, wrapping from ULONG_MAX = -1.
Doing x -= y as x += -y works for OF in most cases. (But not for the most-negative number y=LONG_MIN (1UL<<63), where neg rcx would overflow.)
But CF tells you about the full 65-bit result of a 64-bit + 64-bit addition or subtraction. 64-bit negation isn't sufficient: x += -y doesn't always set CF opposite to what x -= y would.
Possibly something involving neg / sbb could be useful? But no, that treats carry-out from negation as -0 / -1, not -(1<<64).
# Broken attempt: CF is wrong when rcx=0 (negating 0 gives 0, so the add never carries, then cmc sets CF).
# Also fails for OF for rcx=0x8000000000000000 = LONG_MIN
mov temp, rcx # no effect on FLAGS
neg temp # or NOT + INC if you insist on avoiding sub-like operations
add rax, temp # x += -y
cmc # complement carry. CF = !CF
Notice that we combine x and y in a single step. Your add rax, 1 at the end steps on the earlier CF result, making it even less likely / possible for CF to be what you want.
Signed overflow (OF) has a corner case. It would be the same for most inputs, where the signed arithmetic operation is the same for x -= y or x += -y. But if -y overflows to still be negative (the most-negative 2's complement number has no positive counterpart), it's adding a negative instead of subtracting a negative. e.g. -LONG_MIN == LONG_MIN because of signed overflow. (C notation; signed overflow is UB in ISO C, but in asm it wraps.)
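The wrap-around for the most-negative value can be shown without invoking C undefined behaviour by doing the negation on an unsigned type. A minimal sketch, assuming 64-bit two's complement as in the asm; `wrapping_neg64` and `neg_is_unrepresentable64` are hypothetical names, not standard functions.

```c
#include <assert.h>
#include <stdint.h>

/* Two's complement negation with wrap-around, like `neg` on a 64-bit register.
   Unsigned arithmetic is used because signed overflow is UB in ISO C. */
uint64_t wrapping_neg64(uint64_t v) { return (uint64_t)0 - v; }

/* True for the one bit pattern whose signed negation does not fit: 1<<63. */
int neg_is_unrepresentable64(uint64_t v) { return v == (UINT64_C(1) << 63); }

/* Assumed self-check: call it from your own main(). */
void check_wrapping_neg64(void) {
    const uint64_t most_negative = UINT64_C(1) << 63;  /* LONG_MIN bit pattern */
    assert(wrapping_neg64(most_negative) == most_negative);  /* still "negative" */
    assert(wrapping_neg64(1) == UINT64_MAX);                 /* -1 = 0xFF..FF */
    assert(wrapping_neg64(0) == 0);
    assert(neg_is_unrepresentable64(most_negative));
    assert(!neg_is_unrepresentable64(1));
}
```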
Counterexample for this attempt for CF: -1 - 0 doesn't borrow, so CF=0. -1 + -0 = -1 + 0 doesn't carry either, and then CMC will flip CF to 1. But -1 (0xff...ff) plus any other (nonzero) number does carry out, while -1 minus any number doesn't borrow.
So it's not easy, and probably not very interesting, to emulate the borrow output of sub accurately.
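To see where the neg / add / cmc attempt goes wrong, it can be modelled in C and compared against sub for every input. This is a sketch under assumptions: 8-bit operands stand in for 64-bit registers (so 0x80 plays the role of LONG_MIN), only CF and OF are modelled, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flags of `sub x, y`: CF = borrow, OF = signed overflow. */
int sub_cf(uint8_t x, uint8_t y) { return x < y; }
int sub_of(uint8_t x, uint8_t y) {
    uint8_t r = (uint8_t)(x - y);
    return (((x ^ y) & (x ^ r)) >> 7) & 1;  /* signs differ, result sign != x */
}

/* CF after: neg temp / add x, temp / cmc. */
int neg_add_cmc_cf(uint8_t x, uint8_t y) {
    uint8_t neg = (uint8_t)(0u - y);
    int carry = ((unsigned)x + neg) > 0xFFu;
    return !carry;                           /* cmc complements CF */
}

/* OF after: neg temp / add x, temp (cmc does not touch OF). */
int neg_add_of(uint8_t x, uint8_t y) {
    uint8_t neg = (uint8_t)(0u - y);
    uint8_t r = (uint8_t)(x + neg);
    return ((~(x ^ neg) & (x ^ r)) >> 7) & 1;  /* same sign in, other sign out */
}

/* Number of x values (out of 256) for which the attempt gets the flag wrong. */
int cf_mismatches_for(uint8_t y) {
    int n = 0;
    for (unsigned x = 0; x < 256; x++)
        n += neg_add_cmc_cf((uint8_t)x, y) != sub_cf((uint8_t)x, y);
    return n;
}
int of_mismatches_for(uint8_t y) {
    int n = 0;
    for (unsigned x = 0; x < 256; x++)
        n += neg_add_of((uint8_t)x, y) != sub_of((uint8_t)x, y);
    return n;
}

/* Assumed self-check: call it from your own main(). */
void check_neg_add_cmc(void) {
    for (unsigned y = 0; y < 256; y++) {
        assert(cf_mismatches_for((uint8_t)y) == (y == 0 ? 256 : 0));
        assert(of_mismatches_for((uint8_t)y) == (y == 0x80 ? 256 : 0));
    }
}
```

In this 8-bit model, CF is wrong for every x when y is 0, and OF is wrong for every x when y is the most-negative value. All other y values match sub.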
Note that hardware ALUs often use something like a binary Adder–subtractor that muxes A or ~A as an input to full-adders in a carry/borrow aware way to implement A + B or A - B with a correct borrow output for subtraction.
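It should be possible to use stc / adc dst, inverted_src in asm to replicate what hardware like that actually does: addition of the inverse with a carry-in of 1, rather than separately adding 1. The carry-out of that single addition is the inverse of x86's borrow, so a final cmc would still be needed to match sub's CF.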
(TODO: rewrite more of this answer to show using not / stc / adc instead of multiple operations that potentially need to propagate carry all the way through the number).
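The not / stc / adc idea can be sanity-checked with a C model before writing any asm. This is a sketch, not a tested asm sequence: it assumes 8-bit operands in place of 64-bit registers, models only the result, CF and OF (AF is not modelled), and assumes a trailing cmc to turn the adder's carry-out into x86's borrow convention. The `flags8` type and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t result; int cf; int of; } flags8;

/* `sub x, y`: CF is the borrow, OF is signed overflow. */
flags8 sub8(uint8_t x, uint8_t y) {
    flags8 f;
    f.result = (uint8_t)(x - y);
    f.cf = x < y;
    f.of = (((x ^ y) & (x ^ f.result)) >> 7) & 1;
    return f;
}

/* `adc x, y` with carry-in cin: a single addition x + y + cin. */
flags8 adc8(uint8_t x, uint8_t y, int cin) {
    flags8 f;
    unsigned wide = (unsigned)x + y + (unsigned)cin;  /* 9-bit full result */
    f.result = (uint8_t)wide;
    f.cf = wide > 0xFFu;
    f.of = ((~(x ^ y) & (x ^ f.result)) >> 7) & 1;
    return f;
}

/* Models: not temp / stc / adc x, temp / cmc. */
flags8 sub_via_adc(uint8_t x, uint8_t y) {
    flags8 f = adc8(x, (uint8_t)~y, 1);  /* x + ~y + 1 in one step */
    f.cf = !f.cf;                        /* cmc: carry-out -> borrow */
    return f;
}

/* Returns how many of the 65536 input pairs disagree with sub8. */
int count_sub_via_adc_mismatches(void) {
    int n = 0;
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++) {
            flags8 a = sub8((uint8_t)x, (uint8_t)y);
            flags8 b = sub_via_adc((uint8_t)x, (uint8_t)y);
            n += (a.result != b.result) || (a.cf != b.cf) || (a.of != b.of);
        }
    return n;
}
```

In this model the result, CF and OF agree with sub for every pair of inputs, including the y = 0 and most-negative cases that break the neg-based attempt.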
Related: