Tags: assembly, x86, virtualbox, x86-16, interrupt-handling

x86 division exception-return address


While writing a routine in x86 assembly for a boot loader, I ran into a bug: when a division error occurred, the program got stuck in an infinite loop. Investigating, I found that executing int 0 goes through the exception handler normally and then continues with the rest of the program. But when an actual division raises the exception, the return address my own handler receives is the address of the division instruction itself, so the division is executed over and over, looping forever. Is this normal behavior, or a bug in VirtualBox or my CPU specifically?

org 0x7c00      ;put all label addresses at offset 0x7c00

xor ax, ax      ;set up all segment registers
mov ds, ax
mov ax, 0x9000
mov ss, ax
mov sp, 0x1000
mov ax, 0xB800  ;video text memory starts at this address
mov es, ax

mov ah, 0x00
mov al, 0x02
int 0x10        ;go into 80x25 monochrome text

mov [0x0000], word DivideException
mov [0x0002], word 0x0000

xor di, di
xor bx, bx

;int 0   ;both this and the div cx below raise a division error exception

mov ax, 0
mov cx, 0 ;when the exception is handled, it prints
div cx    ;"a divide by zero error happened 0000:7C2D 0000:7C2F"
          ;the first address is the division instruction; the second is 2 bytes after it
          ;when int 0 is uncommented, both printed addresses are the same
jmp $

ToHex:
push bp
mov bp, sp
push bx

mov ax, word [bp+6]
mov bx, word [bp+4]
add bx, 3
mov cx, 16

.Loop:
xor dx, dx
div cx
add dx, 48
cmp dx, 58
jb .Skip
add dx, 7
.Skip:
mov byte [bx], dl
dec bx
cmp ax, 0
jne .Loop

.Ret:
pop bx
mov sp, bp
pop bp
ret

PrintStr:
push bp
mov bp, sp
push bx

mov bx, word [bp+6]
mov ah, byte [bx]
mov bx, word [bp+4]

.PrintLoop:
mov al, byte [bx]

mov word [es:di], ax
inc di
inc di
inc bx
cmp byte [bx], 0x00
jne .PrintLoop

pop bx
mov sp, bp
pop bp
ret

DivideException:
push bp
mov bp, sp
push bx

push word ColorAttributes1
push word String3
call PrintStr
add sp, 4

push word [bp+4]
push word String1
call ToHex
add sp, 4

push word [bp+2]
push word String2
call ToHex
add sp, 4

push word ColorAttributes1
push word String1
call PrintStr
add sp, 4               ;remove the two arguments, otherwise the final pop bx restores the wrong word

push ds
mov ds, word [bp+4]
mov bx, word [bp+2]

cmp byte [ds:bx], 0xF7  ;checks if there's a 0xF7 byte (first byte of div cx) at the return address
jne .DontAdd            ;for some reason, the return address when calling int 0
add word [bp+2], 2      ;directly is the address after the instruction, while
.DontAdd:               ;a divide error raised by an actual division puts the
pop ds                  ;return address at the division itself, leading to an
                        ;infinite loop
push word [bp+4]
push word String1
call ToHex
add sp, 4

push word [bp+2]
push word String2
call ToHex
add sp, 4

push word ColorAttributes1
push word String1
call PrintStr

add sp, 4

pop bx
mov sp, bp
pop bp
iret



String1: db "0000:"      ;0x00 terminator left out so that printing String1 runs on into String2
String2: db "0000 ", 0x00
String3: db "a divide by zero error happened ", 0x00
ColorAttributes1: db 0x0F ;first nibble is background color
                         ;second nibble is foreground


times 510-($-$$) db 0   ;fills the rest with 0's up to byte 510
dw 0xAA55               ;magic boot sector number

Solution

  • Original 8086/8088 does push the address of the following instruction for #DE exceptions.
    But all other x86 CPUs push the start address of the faulting div/idiv instruction. (At least starting from 386; but 286 is very likely the same as 386.)

    That's normal for x86 in general: faulting instructions push the address of the instruction that faulted. x86 machine code can't be reliably/unambiguously decoded backwards, so the design intent is that the exception handler can examine the situation and potentially repair it, and re-run the faulting instruction.
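    As a rough illustration of the "repair and re-run" idea, here is a minimal sketch of a #DE handler, assuming the faulting instruction is div cx (as in the question) and that substituting a divisor of 1 is acceptable to the interrupted code. The label DivideFaultRepair is hypothetical, and nothing in the handler decodes the instruction to confirm that assumption.

```nasm
bits 16

; Hypothetical #DE handler for CPUs with fault semantics (286 and later):
; the saved CS:IP points at the start of the faulting div.
; Assumption: the fault came from "div cx". This deliberately changes the
; interrupted code's CX and DX; that change is the "repair".
DivideFaultRepair:
    xor dx, dx      ; dividend becomes 0:AX, so AX / 1 always fits in AX
    mov cx, 1       ; replace the zero divisor so the retry cannot fault
    iret            ; return to the saved CS:IP, which re-runs the div
```

    Clearing DX also changes the dividend, so this is only reasonable if a wrong quotient is preferable to a hang. On an original 8086/8088 the same handler would simply return to the instruction after the div.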

    See Intel x86 - Interrupt Service Routine responsibility which breaks down the differences between Faults, Traps, and Aborts, and even specifically mentions the difference between int 0 and a faulting div.

    Re-running the faulting instruction is useful for #PF page faults, although not as realistic for things like FP and integer arithmetic exceptions. But if the handler can't repair the situation, it can at least report the actual instruction that faulted. For example, idiv dword [fs: rdi + 0xf1f7f1f7] would be ambiguous to disassemble backwards: the f7 f1 bytes in the disp32 are the encoding for div ecx. You also wouldn't know if a jump had jumped straight to the idiv opcode after the FS prefix. So it's definitely useful for debugging and possibly other purposes to have the actual address of the start of the faulting instruction, not its end.
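    To make the ambiguity concrete, the sketch below spells out the bytes of that idiv by hand (using db so the byte layout is explicit rather than left to the assembler). The label after_idiv is only there for illustration.

```nasm
bits 64

; idiv dword [fs: rdi + 0xf1f7f1f7], written out byte by byte:
    db 0x64                     ; FS segment-override prefix
    db 0xF7                     ; opcode F7 (group that includes div and idiv)
    db 0xBF                     ; ModRM: mod=10 (disp32), reg=/7 (idiv), rm=111 (rdi)
    db 0xF7, 0xF1, 0xF7, 0xF1   ; disp32 = 0xf1f7f1f7, stored little-endian
after_idiv:
; Looking backwards from after_idiv, the last two bytes (F7 F1) are by
; themselves a valid instruction, identical to what this assembles to:
    div ecx                     ; F7 F1
```

    A handler that only knew the address of after_idiv could not tell whether the instruction that just ran was the 7-byte idiv or a 2-byte div ecx.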

    int 0 (if the IDT allows it, when you're not in real mode) pushes the CS:[ER]IP of the following instruction, of course, since it's not something that could re-run without faulting after the situation is repaired. int in general is intended to work kind of like call in terms of returning to the instruction after.


    The 8086 behaviour appears to have been an intentional decision to simplify the hardware at the expense of worse behaviour. It has no limit on max instruction length, and avoids remembering the start of an instruction at all, anywhere inside the CPU (Ken Shirriff quotes an Intel patent in an answer on Interrupts, Instruction Pointer, and Instruction Queue in 8086).
    One of Ken's blog articles on reverse-engineering 8086 looks at how it decodes instructions, with microcode pulling bytes from the prefetch queue after being invoked with an opcode or opcode+modrm, nothing tracking the total length.

    If cs rep movsb is interrupted, 8086's interrupt-return address points before the final prefix, not at the actual instruction start as on later CPUs. So it would resume as rep movsb without the cs prefix, which is a disaster if you put the prefixes in that order. This is the nastiest "worse behaviour" I know of. You can maybe work around it by putting rep cs movsb inside a loop, especially if copying a few extra bytes or words at the end is ok, so you can just use test cx,cx / jnz repeat: the interrupt-resumed cs movsb will still advance SI and DI but not decrement CX. Recomputing CX from a pointer every loop iteration would usually be worse than just saving/restoring DS so you can use plain rep movsb/w.
    Since 8086 doesn't have any kind of page-faults or configurable segment-limits, it can't take a synchronous exception during rep cs movsb or other rep-string instructions, only async external interrupts. So you could potentially cli / sti around a copy, if you can assume no NMIs.
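    The three workarounds mentioned above could look roughly like the following independent fragments. This is a sketch for an original 8086/8088, assuming SI, DI, CX and ES are already set up for the copy. The labels are hypothetical, and the prefixes are emitted with db so their byte order is exactly as described.

```nasm
bits 16

; 1) Retry loop. Assumes a few extra copied bytes are harmless and the
;    destination has room for them.
copy_retry:
    db 0xF3, 0x2E   ; rep prefix first, then cs prefix
    movsb           ; together: rep cs movsb
    test cx, cx     ; CX still non-zero: an interrupt dropped the rep prefix
    jnz copy_retry  ; restart the repeated copy for the remaining count

; 2) Temporarily point DS at the code segment, so no override is needed.
copy_via_ds:
    push ds
    push cs
    pop ds          ; DS = CS
    rep movsb       ; only one prefix, so resuming after an interrupt is safe
    pop ds

; 3) Mask interrupts around the copy. Does not protect against NMIs, and
;    sti enables interrupts even if they were disabled beforehand.
copy_no_irq:
    cli
    db 0x2E, 0xF3   ; cs prefix, then rep prefix
    movsb           ; together: cs rep movsb
    sti
```

    In fragment 1, every interrupt that lands inside the copy causes one byte to be copied without CX being decremented, which is where the extra bytes come from.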

    See Why do call and jump instruction use a displacement relative to the next instruction, not current? for more guesswork about 8086 design decisions.