Tags: operating-system, interrupt, qemu

"sti" causing "Invalid Opcode" on real hardware; working perfectly in QEMU


I have an issue similar to the one in this post. In summary, I am working on an OS, following an old tutorial written by Carlos Fenollosa. I am trying to set up an IDT and handle IRQs, and so far I have had little trouble in QEMU. On real hardware, however, my code hangs the moment interrupts are re-enabled. Specifically, my isr_handler receives a single "Invalid opcode" interrupt before halting.

According to the linked post, the poster solved this issue by adding a missing handler and sending an end-of-interrupt (EOI) command. Following the tutorial, I include this:

// Sets first 32 IDT gates
void isr_install() {
    set_idt_gate(0,  (u32) isr0);
    set_idt_gate(1,  (u32) isr1);
    set_idt_gate(2,  (u32) isr2);
    set_idt_gate(3,  (u32) isr3);
    set_idt_gate(4,  (u32) isr4);
    set_idt_gate(5,  (u32) isr5);
    set_idt_gate(6,  (u32) isr6);
    set_idt_gate(7,  (u32) isr7);
    set_idt_gate(8,  (u32) isr8);
    set_idt_gate(9,  (u32) isr9);
    set_idt_gate(10, (u32) isr10);
    set_idt_gate(11, (u32) isr11);
    set_idt_gate(12, (u32) isr12);
    set_idt_gate(13, (u32) isr13);
    set_idt_gate(14, (u32) isr14);
    set_idt_gate(15, (u32) isr15);
    set_idt_gate(16, (u32) isr16);
    set_idt_gate(17, (u32) isr17);
    set_idt_gate(18, (u32) isr18);
    set_idt_gate(19, (u32) isr19);
    set_idt_gate(20, (u32) isr20);
    set_idt_gate(21, (u32) isr21);
    set_idt_gate(22, (u32) isr22);
    set_idt_gate(23, (u32) isr23);
    set_idt_gate(24, (u32) isr24);
    set_idt_gate(25, (u32) isr25);
    set_idt_gate(26, (u32) isr26);
    set_idt_gate(27, (u32) isr27);
    set_idt_gate(28, (u32) isr28);
    set_idt_gate(29, (u32) isr29);
    set_idt_gate(30, (u32) isr30);
    set_idt_gate(31, (u32) isr31);

    // Remap the PIC (note 0x2X is parent, 0xAX is child)
    set_port_byte(0x20, 0x11);
    set_port_byte(0xA0, 0x11);

    set_port_byte(0x21, 0x20);
    set_port_byte(0xA1, 0x28);

    set_port_byte(0x21, 0x04);
    set_port_byte(0xA1, 0x02);

    set_port_byte(0x21, 0x01);
    set_port_byte(0xA1, 0x01);

    set_port_byte(0x21, 0x0);
    set_port_byte(0xA1, 0x0); 

    // Install the IRQs
    set_idt_gate(32, (u32) irq0);
    set_idt_gate(33, (u32) irq1);
    set_idt_gate(34, (u32) irq2);
    set_idt_gate(35, (u32) irq3);
    set_idt_gate(36, (u32) irq4);
    set_idt_gate(37, (u32) irq5);
    set_idt_gate(38, (u32) irq6);
    set_idt_gate(39, (u32) irq7);
    set_idt_gate(40, (u32) irq8);
    set_idt_gate(41, (u32) irq9);
    set_idt_gate(42, (u32) irq10);
    set_idt_gate(43, (u32) irq11);
    set_idt_gate(44, (u32) irq12);
    set_idt_gate(45, (u32) irq13);
    set_idt_gate(46, (u32) irq14);
    set_idt_gate(47, (u32) irq15);

    set_idt();
}

/*
Runs when interrupt occurs
*/
void isr_handler(registers_t* reg) {
    // Debug messages; how I know it is an invalid opcode error
}

/*
Handles interrupt requests
*/
void irq_handler(registers_t* reg) {
    if (reg->int_no > 39) {
        set_port_byte(0xA0, 0x20);  // Send EOI command
    }
    set_port_byte(0x20, 0x20);      // Send EOI command

    if (interrupt_handlers[reg->int_no] != 0) {
        interrupt_handlers[reg->int_no](reg);    // Call handler
    }
}

However, it still fails on real hardware while giving the desired results in QEMU. Can anyone please explain this inconsistency and how to solve it? Apologies if I am overlooking a trivial fix.

Update: Based on Hammdist's suggestions, here are all of my debug prints for all values held in registers_t:

ds          = -1028128752 (0xC2B8A1A0)
edi         = 9248
esi         = 37035
ebp         = 589720
ebx         = 65535
edx         = 32
ecx         = 0
eax         = -1028128752 (0xC2B8A1A0)
int_no      = 6
err_code    = 0
eip         = -1028128752 (0xC2B8A1A0)
cs          = 8
eflags      = 65558
esp         = 12857
ss          = 589732

Additionally, here is the ISR routine I am using:

; Common ISR code
isr_common_stub:
    ; 1. Save CPU state
    pusha
    mov ax, ds
    push eax
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    push esp ; registers_t* r
    ; 2. Call C handler
    cld
    call isr_handler
    
    ; 3. Restore state
    pop eax 
    pop eax
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    popa
    add esp, 8 ; Cleans up the pushed error code and pushed ISR number
    iret

Solution

  • Unfortunately, the question as I asked it does not provide enough context to solve this issue!

    It turns out the error was not caused solely by my ISR handler, but also by my IRQ handler. Specifically, due to a bug in my bootloader code, the .bss section was not zeroed, so the interrupt_handlers table checked below held garbage instead of null entries:

    void irq_handler(registers_t *r) {
        if (r->int_no >= 40) set_port_byte(0xA0, 0x20);
        set_port_byte(0x20, 0x20);
    
        if (interrupt_handlers[r->int_no] != 0) { // With garbage in .bss, this check wrongly passes!
            isr_t handler = interrupt_handlers[r->int_no];
            handler(r);
        }
    }
    

    Hence, my code kept trying to call random addresses in memory, with obviously no luck, leading to an endless stream of weird exceptions. In QEMU, the .bss section is initially all 0s, hence there was no sign of failure before booting on real hardware.
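
For anyone hitting the same symptom, one way to rule it out is to zero the .bss range explicitly before any C code reads a global. The following is only a minimal hosted-C sketch of the idea, not my actual bootloader fix. The helper names zero_region, fill_with_garbage and count_callable are hypothetical, as are the table size of 256 and the isr_t signature. In a real kernel the start and end pointers would presumably come from symbols defined in the linker script rather than from an ordinary array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_HANDLERS 256              /* assumed table size */

typedef void (*isr_t)(void *regs);    /* assumed handler signature */

/* Zero every byte in [start, end). In a freestanding kernel, start and end
 * would be the .bss bounds exported by the linker script (an assumption;
 * the symbol names depend on your script). */
static void zero_region(uint8_t *start, uint8_t *end) {
    for (uint8_t *p = start; p < end; ++p) {
        *p = 0;
    }
}

/* Simulate RAM that the firmware left non-zero, as on real hardware. */
static void fill_with_garbage(isr_t *table, size_t n) {
    memset(table, 0xC2, n * sizeof table[0]);
}

/* Count the entries that an `!= 0` check would treat as installed handlers. */
static size_t count_callable(const isr_t *table, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (table[i] != 0) {
            ++count;
        }
    }
    return count;
}
```

After fill_with_garbage, every slot looks like an installed handler to the `!= 0` check, which is the situation my irq_handler was in on real hardware. After zero_region over the same bytes, no slot does. In a kernel, the equivalent call would have to run at the very start of the entry point, before isr_install or anything else that relies on zero-initialized globals.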