I know that one has to be very careful when dividing in assembly, i.e. doing this:
mov ah, 10h
mov al, 00h ; dividend = 1000h
mov bl, 10h ; divisor = 10h
div bl ; divide error exception (#DE): quotient 100h cannot fit into al
I've written some probably non-gotcha-proof logic to create a more friendly environment for division:
mov ah, 10h
mov al, 00h
mov bl, 10h
TryDivide:
cmp bl,ah
jna CatchClause
div bl
clc
jmp TryEnd
CatchClause:
stc
TryEnd:
Does anyone know the technical reasons why something like this wasn't implemented in hardware, so that we have exceptions instead of flags being set or registers being truncated?
For a definite answer, you'd have to ask Stephen Morse, the designer of the 8086 instruction-set.
Other Intel engineers worked on the actual implementation, but apparently the ISA was designed on paper first, almost entirely by just one guy. He's also credited as principal architect of 8086. PC World interviewed him in 2008, for the 30th anniversary of 8086, and more importantly he wrote a book, The 8086/8088 Primer (1982). I haven't read it, but apparently he discusses some design decisions as well as how to program it. If you're lucky, maybe he wrote something about choosing to have div/idiv trap.
There's no reason it had to be this way; setting CF and/or OF and truncating would have been valid designs. But you still need to choose some value to put in the quotient/remainder output registers in the divide-by-zero case (see Note 1 below). (I think it's fairly common for ISAs with HW division to have an exception for at least divide by zero, but the Q&A "On which platforms does integer divide by zero trigger a floating point exception?" unfortunately only mentions x86 as an ISA with traps. If division does trap and a POSIX OS delivers a signal at all, it must be SIGFPE for an arithmetic exception.)
Note that other ISAs in practice do make different choices. For example ARM division never faults, and doesn't set flags either. (Although it doesn't provide a double-width dividend, so only the INT_MIN / -1 signed overflow and division by 0 cases are special for it.)
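As a rough illustration of that non-faulting design, here is a C model of the results AArch64's SDIV and UDIV produce for 32-bit operands, to the best of my knowledge: division by zero yields 0, and INT_MIN / -1 wraps back to INT_MIN. The function names model_sdiv32 and model_udiv32 are made up for this sketch; they are not a real API.

```c
#include <stdint.h>

/* Hypothetical helpers modelling AArch64 SDIV / UDIV results.
   The corner cases are handled explicitly because both of them
   are undefined behaviour for the C `/` operator. */
int32_t model_sdiv32(int32_t n, int32_t d)
{
    if (d == 0)
        return 0;              /* divide by zero: result 0, no trap, no flags */
    if (n == INT32_MIN && d == -1)
        return INT32_MIN;      /* exact quotient 2^31 does not fit, so it wraps */
    return n / d;              /* C rounds toward zero, as SDIV does */
}

uint32_t model_udiv32(uint32_t n, uint32_t d)
{
    return d == 0 ? 0 : n / d; /* the only special case is a zero divisor */
}
```

Nothing distinguishes a "real" 0 quotient from the divide-by-zero result here, so code that cares still has to test the divisor itself.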
IDK if building a hardware divide unit (or microcode) that could get the correctly-truncated quotient for overflow cases (when the exact quotient is wider than 16-bit) would be harder than simply detecting overflow and bailing out. If so, that would be a fairly good reason.
Leaving garbage in the output registers and setting FLAGS would be possible but not great; every division would need to check the result afterwards if it wanted to avoid the possibility of using garbage. Except ones where the result is known to fit, e.g. because it's unsigned and the high half of the dividend is zero, and the divisor is known to be non-zero.
Leaving some special value in the quotient on overflow (as well as setting FLAGS) is another possibility, like how out-of-range FP to integer conversion produces 0x80000000 for signed i32 (which Intel calls the "integer indefinite" value), or 0xFFFFFFFF for unsigned.
Note 1: div by 0 in some ways is a special case of this: high_half < divisor is false for divisor=0 for any dividend. But there's no well-defined mathematical result to truncate. IEEE FP division resolves this by treating it as the limit as the divisor approaches 0, i.e. +- infinity. But integer 0 should be assumed to be exactly 0, not some tiny number, and there's no in-band NaN or Inf value to use anyway, only a finite 0xFFFF...
Note that 8086 only included the one-operand forms of mul and imul, which do widening multiply: DX:AX = AX * src. (CF and OF are set if the high half is non-zero (for mul), or if the high half isn't the sign-extension of the low half (for imul).) Only later CPUs introduced truncating forms like imul r, r/m, imm (186) and imul r, r/m (386) that don't waste time writing the high half anywhere, although they still set FLAGS so you can detect signed wrapping if you want to. (Most uses don't, so later CPUs only provided imul, not versions of mul that would be the same except for FLAGS.)
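To make the flag rule concrete, here is a small C model of the 16-bit one-operand mul and imul. It is only a sketch of the architectural result (DX:AX plus the shared CF/OF value); the struct and function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: DX:AX plus the value that CF and OF both get. */
struct mul16_result {
    uint16_t lo;    /* AX: low half of the product */
    uint16_t hi;    /* DX: high half of the product */
    bool cf_of;     /* CF and OF are always set to the same value */
};

/* Models `mul src` with 16-bit operand-size: DX:AX = AX * src (unsigned). */
struct mul16_result model_mul16(uint16_t ax, uint16_t src)
{
    uint32_t full = (uint32_t)ax * src;
    struct mul16_result r;
    r.lo = (uint16_t)full;
    r.hi = (uint16_t)(full >> 16);
    r.cf_of = r.hi != 0;            /* set when the high half is non-zero */
    return r;
}

/* Models `imul src` with 16-bit operand-size: DX:AX = AX * src (signed). */
struct mul16_result model_imul16(int16_t ax, int16_t src)
{
    int32_t full = (int32_t)ax * src;
    struct mul16_result r;
    r.lo = (uint16_t)full;
    r.hi = (uint16_t)((uint32_t)full >> 16);
    /* "High half is not the sign-extension of the low half" is the same
       thing as "the product does not fit in a signed 16-bit value". */
    r.cf_of = full < INT16_MIN || full > INT16_MAX;
    return r;
}
```

Either way the full product is always available in DX:AX, so the flags are just a convenience, unlike the division case.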
add / sub can carry/borrow, but the full result of an add is available as CF : reg with the extra bit in the carry flag.
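A minimal C sketch of that point, for a 16-bit add: the carry flag is the 17th bit of the sum, so concatenating it with the register recovers the exact result. The names model_add16 and rejoin_cf_reg are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the destination register plus CF. */
struct add16_result {
    uint16_t sum;   /* what ends up in the destination register */
    bool cf;        /* carry out of bit 15 */
};

/* Models a 16-bit `add reg, src` (only CF, not the other flags). */
struct add16_result model_add16(uint16_t a, uint16_t b)
{
    uint32_t full = (uint32_t)a + b;    /* exact sum needs at most 17 bits */
    struct add16_result r;
    r.sum = (uint16_t)full;
    r.cf = (full >> 16) != 0;
    return r;
}

/* CF : reg, i.e. the exact 17-bit sum rebuilt from the two pieces. */
uint32_t rejoin_cf_reg(struct add16_result r)
{
    return ((uint32_t)r.cf << 16) | r.sum;
}
```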
If you consider sar / shr / shl reg, cl as a bitwise logical operation, not math, then it doesn't count even though it can shift out multiple bits without leaving them anywhere. (The last bit is left in CF, so shift-by-1 can be undone with a rotate-through-carry.)
That leaves DIV / IDIV as I think the only arithmetic instructions where there could be a wider result and nowhere to put it. That might have been part of the motivation for choosing to have them fault.
high_half < divisor is gotcha-proof for unsigned division

That's the exact condition for the quotient fitting in the operand-size. 1:0 (e.g. 0x0100 for 8-bit operand-size) is the smallest quotient that doesn't fit, so 0x0100 * divisor is the smallest dividend that would produce a quotient that doesn't fit in 8 bits.
That dividend is divisor:0 when split up into hi:lo halves, each the same width as the divisor.
Any number smaller than that would have to "borrow" from the high half, leaving it strictly smaller than divisor.
(Signed division also has the INT_MIN / -1 overflow corner case, and the high-half check might have to be on absolute values.)
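The unsigned claim is small enough to check by brute force. The C sketch below models the question's TryDivide logic for `div bl` (a 16-bit dividend in AX, an 8-bit divisor) and then compares the high_half < divisor pre-check against the exact "quotient fits in 8 bits" condition for every possible input. checked_div8 and count_mismatches are hypothetical names for this illustration, and the return value stands in for the carry flag the question uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag-returning version of `div bl`.
   Returns false (and leaves the outputs untouched) in exactly the
   cases where the real instruction would raise #DE. */
bool checked_div8(uint16_t ax, uint8_t divisor, uint8_t *quot, uint8_t *rem)
{
    uint8_t high_half = (uint8_t)(ax >> 8);     /* AH */
    if (high_half >= divisor)                   /* also catches divisor == 0 */
        return false;
    *quot = (uint8_t)(ax / divisor);            /* would land in AL */
    *rem  = (uint8_t)(ax % divisor);            /* would land in AH */
    return true;
}

/* Counts inputs where the pre-check disagrees with the exact condition
   "divisor is non-zero and the quotient fits in 8 bits". */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t ax = 0; ax <= 0xFFFF; ax++) {
        for (uint32_t d = 0; d <= 0xFF; d++) {
            bool fits  = d != 0 && ax / d <= 0xFF;
            bool check = (ax >> 8) < d;
            if (fits != check)
                mismatches++;
        }
    }
    return mismatches;
}
```

With the question's inputs (AX = 1000h, BL = 10h) the check fails, while AX = 0FFFh with the same divisor passes and gives quotient FFh, remainder 0Fh. The exhaustive loop covers only the 8-bit unsigned form; the signed case needs the extra care mentioned above.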