assembly x86 nasm binaryfiles qemu

Loading and running a raw binary file with 32-bit code in QEMU


I have an assembly (NASM) file:

bits 32
start:
    mov dword [0xb8000], 0x2f4b2f4f
    hlt

This produces a binary file containing:

C7 05 00 80 0B 00 4F 2F 4B 2F F4
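
As a quick sanity check on those bytes, the following minimal Python sketch splits them into the fields of the two instructions. The function name decode and the dictionary keys are illustrative choices, not part of any tool.

```python
import struct

# The 11 bytes NASM produced for the two instructions in the question.
code = bytes.fromhex("C7 05 00 80 0B 00 4F 2F 4B 2F F4")

def decode(blob: bytes) -> dict:
    """Split the blob into the fields of `mov dword [disp32], imm32` plus `hlt`."""
    opcode, modrm = blob[0], blob[1]
    # Both the address and the immediate are stored little-endian.
    disp32, imm32 = struct.unpack_from("<II", blob, 2)
    return {
        "opcode": opcode,      # 0xC7 = MOV r/m32, imm32
        "modrm": modrm,        # 0x05 = mod 00, reg 000, r/m 101 (disp32 only)
        "address": disp32,     # where the dword is written
        "immediate": imm32,    # the dword that is written
        "last": blob[10],      # 0xF4 = HLT
    }

fields = decode(code)
print({name: hex(value) for name, value in fields.items()})
```

The address and immediate come back as 0xb8000 and 0x2f4b2f4f, matching the source line.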

Is there a way of executing the code in this binary file in QEMU without changing the code and without adding headers (like Multiboot or similar)? I wish to keep the binary file as it is.


Solution

  • I don't know why you don't wish to use ELF executables for your code especially if you are writing a kernel.

    There is no way of getting QEMU to directly run an arbitrary binary file. In addition, the code you have needs to run in 32-bit protected mode to work properly (since it is assembled with bits 32). The only requirements you have given are that it run in QEMU and that the binary file containing the instructions and/or data not be modified.

    The Multiboot specification allows you to use simple ELF files with a proper Multiboot header. Using Multiboot is the best option here since a Multiboot loader already sets up 32-bit protected mode, enables the A20 line, and loads the code and data into memory. This is a lot of work you don't have to code yourself.

    What you can do is create a Multiboot wrapper in NASM that has a simple Multiboot header in it and includes the binary file with code in it. The following example will create a Multiboot compliant ELF executable that has the header at 0x100000 in memory and the kernel at 0x101000. This code uses NASM's incbin directive to include a binary file directly into the assembly file containing the Multiboot header and the entry point.

    mboot.asm

    bits 32
    global _start
    
    MB1_MAGIC    equ 0x1badb002
    MB1_FLAGS    equ 0x00000000
    MB1_CHECKSUM equ -(MB1_MAGIC+MB1_FLAGS)
    
    section .data
    align 4
        dd MB1_MAGIC
        dd MB1_FLAGS
        dd MB1_CHECKSUM
    
    section .text
    _start:
        incbin "kernel.bin"
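
    The checksum in the header above is defined so that the three header fields sum to zero when truncated to 32 bits. A minimal Python sketch of that arithmetic follows; the helper name mb1_checksum is made up for illustration.

```python
MB1_MAGIC = 0x1BADB002
MB1_FLAGS = 0x00000000

def mb1_checksum(magic: int, flags: int) -> int:
    """Return the 32-bit value that makes magic + flags + checksum wrap to 0."""
    return -(magic + flags) & 0xFFFFFFFF

checksum = mb1_checksum(MB1_MAGIC, MB1_FLAGS)
print(hex(checksum))

# The loader accepts the header only if the 32-bit sum is zero.
print((MB1_MAGIC + MB1_FLAGS + checksum) & 0xFFFFFFFF)
```

    NASM evaluates -(MB1_MAGIC+MB1_FLAGS) the same way once the result is stored with dd, which keeps only the low 32 bits.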
    

    kernel.asm

    bits 32
    start:
        mov dword [0xb8000], 0x2f4b2f4f
        hlt
    

    You first have to build your kernel.asm to a binary file called kernel.bin with:

    nasm -fbin kernel.asm -o kernel.bin
    

    Then you have to assemble the multiboot wrapper with:

    nasm -felf32 mboot.asm -o mboot.o
    

    Finally link it to an ELF executable called kernel.elf with:

    ld -Ttext=0x101000 -Tdata=0x100000 -melf_i386 mboot.o -o kernel.elf
    

    This Multiboot compliant ELF executable can be run in QEMU with the -kernel option like this:

    qemu-system-i386 -kernel kernel.elf
    

    When run, the letters OK should appear in white on a green background in the top-left corner of the display.
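
    The value written to 0xb8000 packs two VGA text-mode cells, each a character byte followed by an attribute byte. The sketch below unpacks it in Python; the function name text_cells is illustrative, and the blink bit (bit 7 of the attribute) is simply masked off.

```python
import struct

VALUE = 0x2F4B2F4F  # dword stored at 0xb8000 by the kernel

def text_cells(value: int) -> list:
    """Unpack a dword into two (char, foreground, background) text-mode cells."""
    cells = []
    # Little-endian: the low word is the first cell on screen.
    for word in struct.unpack("<2H", struct.pack("<I", value)):
        char = chr(word & 0xFF)          # low byte: character code
        attr = word >> 8                 # high byte: colour attribute
        foreground = attr & 0x0F         # 15 = white
        background = (attr >> 4) & 0x07  # 2 = green
        cells.append((char, foreground, background))
    return cells

print(text_cells(VALUE))
```

    Both cells use attribute 0x2f, which is why the two letters share the same colours.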


    Additional Note

    As your kernel binary grows and you need to make references to absolute memory locations via labels, you will need to tell NASM that the virtual memory address (VMA) where your kernel binary is loaded is at 0x101000. That can be done using the ORG directive like this:

    bits 32
    org 0x101000
    
    start:
        mov dword [0xb8000], 0x2f4b2f4f
        hlt
    

    This code doesn't require the ORG directive because it doesn't need the absolute address of a label anywhere in the code, but this may change as your code expands in the future. An example of code that won't work if you don't properly specify the ORG (origin point) is as follows:

    ; Example program that uses an absolute reference to a label
    ; that won't work unless a proper ORG is used. Removing the ORG
    ; or using the wrong value will cause the code to not work as
    ; expected
    
    org 0x101000
    bits 32
    start:
        mov eax, [okmsg]          ; Using an absolute reference to a label
        mov dword [0xb8000], eax  ; Write value to display
        hlt
    
    okmsg: dd  0x2f4b2f4f
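
    To see why the ORG value matters, note that in a flat binary the address encoded for a label is the origin plus the label's offset in the file. The Python sketch below illustrates this. It assumes okmsg sits 11 bytes into the file (two 5-byte moves plus a 1-byte hlt, which is what NASM's short EAX encodings should give), and the function name label_address is made up.

```python
def label_address(org: int, offset: int) -> int:
    """Absolute address encoded for a label in a flat binary: origin + offset."""
    return org + offset

# Assumption: okmsg follows two 5-byte mov instructions and a 1-byte hlt.
OKMSG_OFFSET = 5 + 5 + 1

with_org = label_address(0x101000, OKMSG_OFFSET)
without_org = label_address(0x000000, OKMSG_OFFSET)  # -f bin defaults to origin 0

print(hex(with_org), hex(without_org))
```

    Without the ORG, the load would read from the low address instead of from where the kernel was actually placed in memory.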
    

    As stated at the beginning of the answer, I don't really recommend this approach, but I am providing a solution that allows you to run the code you presented, placed in a binary file that you say can't be changed.


    Floppy bootloader that loads and runs a protected mode kernel

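    If you don't want to use Multiboot and wish to create a disk image, you can use this rather simplified bootloader. It reads the kernel from the sectors following the boot sector into memory at 0x0000:0x8000, enables the A20 line, loads a GDT, enters 32-bit protected mode, and jumps to the kernel at 0x8000.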

    This code is based on code from a few of my other answers. One is a bootloader that executes code in 16-bit real mode. I modified it with code I used in a question that enables A20 and enters protected mode.
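
    The A20 check in boot.asm relies on real-mode address wrap-around: 0xffff:0x0610 and 0x0000:0x0600 refer to the same byte only when address bit 20 is forced to zero. A minimal Python sketch of that arithmetic, with a hypothetical helper named physical:

```python
def physical(segment: int, offset: int, a20_enabled: bool) -> int:
    """Real-mode physical address; with A20 off, address bit 20 is forced to 0."""
    linear = (segment << 4) + offset
    return linear if a20_enabled else linear & ~(1 << 20)

low = physical(0x0000, 0x0600, a20_enabled=False)
high_wrapped = physical(0xFFFF, 0x0610, a20_enabled=False)
high_real = physical(0xFFFF, 0x0610, a20_enabled=True)

print(hex(low), hex(high_wrapped), hex(high_real))
```

    When the two addresses alias, a write through one is visible through the other, which is the condition a20_check tests for.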

    boot.asm

    STAGE2_ABS_ADDR  equ 0x08000
    STAGE2_RUN_SEG   equ 0x0000
    STAGE2_RUN_OFS   equ STAGE2_ABS_ADDR
                                    ; Run stage2 with segment of 0x0000 and offset of 0x8000
    
    STAGE2_LOAD_SEG  equ STAGE2_ABS_ADDR>>4
                                    ; Segment to start reading Stage2 into
                                    ;     right after bootloader
    
    STAGE2_LBA_START equ 1          ; Logical Block Address(LBA) Stage2 starts on
                                    ;     LBA 1 = sector after boot sector
    STAGE2_LBA_END   equ STAGE2_LBA_START + NUM_STAGE2_SECTORS
                                    ; Logical Block Address(LBA) Stage2 ends at
    DISK_RETRIES     equ 3          ; Number of times to retry on disk error
    
    bits 16
    ORG 0x7c00
    
    ; Include a BPB (1.44MB floppy with FAT12) to be more compatible with USB floppy media
    %ifdef WITH_BPB
    %include "bpb.inc"
    %endif
    
    boot_continue:
        xor ax, ax                  ; DS=SS=0 for stage2 loading
        mov ds, ax
        mov ss, ax                  ; Stack at 0x0000:0x7c00
        mov sp, 0x7c00
        cld                         ; Set string instructions to use forward movement
    
        ; Read Stage2 1 sector at a time until stage2 is completely loaded
    load_stage2:
        mov [bootDevice], dl        ; Save boot drive
        mov di, STAGE2_LOAD_SEG     ; DI = Current segment to read into
        mov si, STAGE2_LBA_START    ; SI = LBA that stage2 starts at
        jmp .chk_for_last_lba       ; Check to see if we are last sector in stage2
    
    .read_sector_loop:
        mov bp, DISK_RETRIES        ; Set disk retry count
    
        call lba_to_chs             ; Convert current LBA to CHS
        mov es, di                  ; Set ES to current segment number to read into
        xor bx, bx                  ; Offset zero in segment
    
    .retry:
        mov ax, 0x0201              ; Call function 0x02 of int 13h (read sectors)
                                    ;     AL = 1 = Sectors to read
        int 0x13                    ; BIOS Disk interrupt call
        jc .disk_error              ; If CF set then disk error
    
    .success:
        add di, 512>>4              ; Advance to next 512 byte segment (0x20*16=512)
        inc si                      ; Next LBA
    
    .chk_for_last_lba:
        cmp si, STAGE2_LBA_END      ; Have we reached the last stage2 sector?
        jl .read_sector_loop        ;     If we haven't then read next sector
    
    .stage2_loaded:
        mov si, noa20_err           ; Default error message to A20 enable error
        call a20_enable             ; Enable A20 line
        jz error_print              ; If the A20 line isn't enabled, print error and stop
    
        lgdt [gdtr]                 ; Load GDT for 32-bit protected mode
    
        cli                         ; Disable interrupts since we don't have an IDT setup
        mov eax, cr0                ; Read CR0 register
        or eax, 1                   ; Enable protected mode flag (bit 0)
        mov cr0, eax                ; Set CR0 register & enter quasi 16-bit protected mode
        jmp CODE32_SEL:start32pm    ; FAR JMP to use a 32-bit code selector
                                    ;     This enters 32-bit protected mode @ start32pm
    
    .disk_error:
        xor ah, ah                  ; Int13h/AH=0 is drive reset
        int 0x13
        dec bp                      ; Decrease retry count
        jge .retry                  ; If retry count not exceeded then try again
    
    disk_error_end:
        ; Unrecoverable error; print drive error; enter infinite loop
        mov si, diskErrorMsg        ; Display disk error message
    
    error_print:
        call print_string
        cli
    error_loop:
        hlt
        jmp error_loop
    
    ; Function: print_string
    ;           Display a string to the console on display page 0
    ;
    ; Inputs:   SI = Offset of address to print
    ; Clobbers: AX, BX, SI
    
    print_string:
        mov ah, 0x0e                ; BIOS tty Print
        xor bx, bx                  ; Set display page to 0 (BL)
        jmp .getch
    .repeat:
        int 0x10                    ; print character
    .getch:
        lodsb                       ; Get character from string
        test al,al                  ; Have we reached end of string?
        jnz .repeat                 ;     if not process next character
    .end:
        ret
    
    ;    Function: lba_to_chs
    ; Description: Translate Logical block address to CHS (Cylinder, Head, Sector).
    ;
    ;   Resources: http://www.ctyme.com/intr/rb-0607.htm
    ;              https://en.wikipedia.org/wiki/Logical_block_addressing#CHS_conversion
    ;              https://stackoverflow.com/q/45434899/3857942
    ;              Sector    = (LBA mod SPT) + 1
    ;              Head      = (LBA / SPT) mod HEADS
    ;              Cylinder  = (LBA / SPT) / HEADS
    ;
    ;      Inputs: SI = LBA
    ;     Outputs: DL = Boot Drive Number
    ;              DH = Head
    ;              CH = Cylinder (lower 8 bits of 10-bit cylinder)
    ;              CL = Sector/Cylinder
    ;                   Upper 2 bits of 10-bit Cylinders in upper 2 bits of CL
    ;                   Sector in lower 6 bits of CL
    ;
    ;       Notes: Output registers match expectation of Int 13h/AH=2 inputs
    ;
    lba_to_chs:
        push ax                    ; Preserve AX
        mov ax, si                 ; Copy LBA to AX
        xor dx, dx                 ; Upper 16-bit of 32-bit value set to 0 for DIV
        div word [sectorsPerTrack] ; 32-bit by 16-bit DIV : LBA / SPT
        mov cl, dl                 ; CL = S = LBA mod SPT
        inc cl                     ; CL = S = (LBA mod SPT) + 1
        xor dx, dx                 ; Upper 16-bit of 32-bit value set to 0 for DIV
        div word [numHeads]        ; 32-bit by 16-bit DIV : (LBA / SPT) / HEADS
        mov dh, dl                 ; DH = H = (LBA / SPT) mod HEADS
        mov dl, [bootDevice]       ; boot device, not necessary to set but convenient
        mov ch, al                 ; CH = C(lower 8 bits) = (LBA / SPT) / HEADS
        shl ah, 6                  ; Store upper 2 bits of 10-bit Cylinder into
        or  cl, ah                 ;     upper 2 bits of Sector (CL)
        pop ax                     ; Restore scratch registers
        ret
    
    ; Function: wait_8042_cmd
    ;           Wait until the Input Buffer Full bit in the keyboard controller's
    ;           status register becomes 0. After calls to this function it is
    ;           safe to send a command on Port 0x64
    ;
    ; Inputs:   None
    ; Clobbers: AX
    ; Returns:  None
    
    KBC_STATUS_IBF_BIT EQU 1
    wait_8042_cmd:
        in al, 0x64                ; Read keyboard controller status register
        test al, 1 << KBC_STATUS_IBF_BIT
                                   ; Is bit 1 (Input Buffer Full) set?
        jnz wait_8042_cmd          ;     If it is then controller is busy and we
                                   ;     can't send command byte, try again
        ret                        ; Otherwise buffer is clear and ready to send a command
    
    ; Function: wait_8042_data
    ;           Wait until the Output Buffer Full (OBF) bit in the keyboard controller's
    ;           status register becomes 1. After a call to this function there is
    ;           data available to be read on port 0x60.
    ;
    ; Inputs:   None
    ; Clobbers: AX
    ; Returns:  None
    
    KBC_STATUS_OBF_BIT EQU 0
    wait_8042_data:
        in al, 0x64                ; Read keyboard controller status register
        test al, 1 << KBC_STATUS_OBF_BIT
                                   ; Is bit 0 (Output Buffer Full) set?
        jz wait_8042_data          ;     If not then no data waiting to be read, try again
        ret                        ; Otherwise data is ready to be read
    
    ; Function: a20_kbd_enable
    ;           Enable the A20 line via the keyboard controller
    ;
    ; Inputs:   None
    ; Clobbers: AX, CX
    ; Returns:  None
    
    a20_kbd_enable:
        pushf
        cli                        ; Disable interrupts
    
        call wait_8042_cmd         ; When controller ready for command
        mov al, 0xad               ; Send command 0xad (disable keyboard).
        out 0x64, al
    
        call wait_8042_cmd         ; When controller ready for command
        mov al, 0xd0               ; Send command 0xd0 (read output port)
        out 0x64, al
    
        call wait_8042_data        ; Wait until controller has data
        in al, 0x60                ; Read data from keyboard
        mov cx, ax                 ;     CX = copy of byte read
    
        call wait_8042_cmd         ; Wait until controller is ready for a command
        mov al, 0xd1
        out 0x64, al               ; Send command 0xd1 (write output port)
    
        call wait_8042_cmd         ; Wait until controller is ready for a command
        mov ax, cx
        or al, 1 << 1              ; Write value back with bit 1 set
        out 0x60, al
    
        call wait_8042_cmd         ; Wait until controller is ready for a command
        mov al, 0xae
        out 0x64, al               ; Write command 0xae (enable keyboard)
    
        call wait_8042_cmd         ; Wait until controller is ready for command
        popf                       ; Restore flags including interrupt flag
        ret
    
    ; Function: a20_fast_enable
    ;           Enable the A20 line via System Control Port A
    ;
    ; Inputs:   None
    ; Clobbers: AX
    ; Returns:  None
    
    a20_fast_enable:
        in al, 0x92                ; Read System Control Port A
        test al, 1 << 1
        jnz .finished              ; If bit 1 is set then A20 already enabled
        or al, 1 << 1              ; Set bit 1
        and al, ~(1 << 0)          ; Clear bit 0 to avoid issuing a reset
        out 0x92, al               ; Send Enabled A20 and disabled Reset to control port
    .finished:
        ret
    
    ; Function: a20_bios_enable
    ;           Enable the A20 line via the BIOS function Int 15h/AX=2401h
    ;
    ; Inputs:   None
    ; Clobbers: AX
    ; Returns:  None
    
    a20_bios_enable:
        mov ax, 0x2401             ; Int 15h/AX=2401h enables A20 on BIOS with this feature
        int 0x15
        ret
    
    ; Function: a20_check
    ;           Determine if the A20 line is enabled or disabled
    ;
    ; Inputs:   None
    ; Clobbers: AX, CX, ES
    ; Returns:  ZF=0 if A20 enabled, ZF=1 if disabled
    
    a20_check:
        pushf                      ; Save flags so Interrupt Flag (IF) can be restored
        push ds                    ; Save volatile registers
        push si
        push di
    
        cli                        ; Disable interrupts
        xor ax, ax
        mov ds, ax
        mov si, 0x600              ; 0x0000:0x0600 (0x00600) address we will test
    
        mov ax, 0xffff
        mov es, ax
        mov di, 0x610              ; 0xffff:0x0610 (0x100600, wraps to 0x00600 without A20)
                                   ; The physical address pointed to depends on whether
                                   ; memory wraps or not. If it wraps then A20 is disabled
    
        mov cl, [si]               ; Save byte at 0x0000:0x0600
        mov ch, [es:di]            ; Save byte at 0xffff:0x0610
    
        mov byte [si], 0xaa        ; Write 0xaa to 0x0000:0x0600
        mov byte [es:di], 0x55     ; Write 0x55 to 0xffff:0x0610
    
        xor ax, ax                 ; Set return value 0
        cmp byte [si], 0x55        ; If 0x0000:0x0600 is 0x55 and not 0xaa
        je .disabled               ;     then memory wrapped because A20 is disabled
    
        dec ax                     ; A20 enabled, set AX to -1
    .disabled:
        ; Cleanup by restoring original bytes in memory. This must be in reverse
        ; order from the order they were originally saved
        mov [es:di], ch            ; Restore saved data to 0xffff:0x0610
        mov [si], cl               ; Restore saved data to 0x0000:0x0600
    
        pop di                     ; Restore non-volatile registers
        pop si
        pop ds
        popf                       ; Restore Flags (including IF)
        test al, al                ; Return ZF=0 if A20 enabled, ZF=1 if disabled
        ret
    
    ; Function: a20_enable
    ;           Enable the A20 line
    ;
    ; Inputs:   None
    ; Clobbers: AX, BX, CX, DX
    ; Returns:  ZF=1 if A20 not enabled, ZF=0 if A20 enabled
    
    a20_enable:
        call a20_check             ; Is A20 already enabled?
        jnz .a20_on                ;     If so then we're done (ZF=0)
    
        call a20_bios_enable       ; Try enabling A20 via BIOS
        call a20_check             ; Is A20 now enabled?
        jnz .a20_on                ;     If so then we're done (ZF=0)
    
        call a20_kbd_enable        ; Try enabling A20 via keyboard controller
        call a20_check             ; Is A20 now enabled?
        jnz .a20_on                ;     If so then we're done (ZF=0)
    
        call a20_fast_enable       ; Try enabling A20 via fast method
        call a20_check             ; Is A20 now enabled?
        jnz .a20_on                ;     If so then we're done (ZF=0)
    .a20_err:
        xor ax, ax                 ; If A20 is still disabled then return with ZF=1
    .a20_on:
        ret
    
    bits 32
    start32pm:
        mov ax, DATA32_SEL         ; Set up the 32-bit data selectors
        mov ds, ax
        mov es, ax
        mov fs, ax
        mov gs, ax
    
        ; Zero extend SP to ESP. SP is already at 0x7c00
        ; DL still contains the boot drive number
        movzx esp, sp
    
        ; Execute stage2 code
        jmp STAGE2_RUN_OFS
    
    ; 32-bit GDT for protected mode
    ; Macro to build a GDT descriptor entry
    %define MAKE_GDT_DESC(base, limit, access, flags)  \
        (((base & 0x00FFFFFF) << 16) |  \
        ((base & 0xFF000000) << 32) |  \
        (limit & 0x0000FFFF) |      \
        ((limit & 0x000F0000) << 32) |  \
        ((access & 0xFF) << 40) |  \
        ((flags & 0x0F) << 52))
    
    ; GDT structure
    gdt_start:
        dq MAKE_GDT_DESC(0, 0, 0, 0); null descriptor
    gdt32_code:
        dq MAKE_GDT_DESC(0, 0x000fffff, 10011010b, 1100b)
                                   ; 32-bit code, 4kb gran, limit 0xffffffff bytes, base=0
    gdt32_data:
        dq MAKE_GDT_DESC(0, 0x000fffff, 10010010b, 1100b)
                                   ; 32-bit data, 4kb gran, limit 0xffffffff bytes, base=0
    gdt_end:
    
    CODE32_SEL equ gdt32_code - gdt_start
    DATA32_SEL equ gdt32_data - gdt_start
    
    ; GDT record
    align 4
        dw 0                       ; Padding so the dd (GDT base) in gdtr is 4-byte aligned
    gdtr:
        dw gdt_end - gdt_start - 1
                                   ; limit (Size of GDT - 1)
        dd gdt_start               ; base of GDT
    
    ; If not using a BPB (via bpb.inc) provide default Heads and SPT values
    %ifndef WITH_BPB
    numHeads:        dw 2          ; 1.44MB Floppy has 2 heads & 18 sector per track
    sectorsPerTrack: dw 18
    %endif
    
    bootDevice:      db 0x00
    diskErrorMsg:    db "Unrecoverable disk error!", 0
    noa20_err:       db "A20 line couldn't be enabled", 10, 13, 0
    
    ; Pad boot sector to 510 bytes and add 2 byte boot signature for 512 total bytes
    TIMES 510-($-$$) db  0
    dw 0xaa55
    
    ; Beginning of stage2. This is at 0x8000 and will allow your stage2 to be 32.5KiB
    ; before running into problems. DL will be set to the drive number originally
    ; passed to us by the BIOS.
    
    NUM_STAGE2_SECTORS equ (stage2_end-stage2_start+511) / 512
                                   ; Number of 512 byte sectors stage2 uses.
    
    stage2_start:
        ; Insert stage2 binary here. It is done this way since we
        ; can determine the size(and number of sectors) to load since
        ;     Size = stage2_end-stage2_start
        incbin "kernel.bin"
    
    ; End of stage2. Make sure this label is LAST in this file!
    stage2_end:
    
    ; Fill out this file to produce a 1.44MB floppy image
    TIMES 1024*1440-($-$$) db 0x00
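
    The MAKE_GDT_DESC macro in boot.asm scatters the base, limit, access byte, and flags across a 64-bit descriptor. The following Python sketch mirrors the macro's shifts and masks so the resulting values can be inspected; the function name make_gdt_desc is only a transliteration of the macro name.

```python
def make_gdt_desc(base: int, limit: int, access: int, flags: int) -> int:
    """Python mirror of the MAKE_GDT_DESC macro: returns the 64-bit descriptor."""
    return (
        ((base & 0x00FFFFFF) << 16)       # base bits 0-23  -> descriptor bits 16-39
        | ((base & 0xFF000000) << 32)     # base bits 24-31 -> descriptor bits 56-63
        | (limit & 0x0000FFFF)            # limit bits 0-15 -> descriptor bits 0-15
        | ((limit & 0x000F0000) << 32)    # limit bits 16-19 -> descriptor bits 48-51
        | ((access & 0xFF) << 40)         # access byte     -> descriptor bits 40-47
        | ((flags & 0x0F) << 52)          # flags nibble    -> descriptor bits 52-55
    )

null_desc = make_gdt_desc(0, 0, 0, 0)
code_desc = make_gdt_desc(0, 0x000FFFFF, 0b10011010, 0b1100)
data_desc = make_gdt_desc(0, 0x000FFFFF, 0b10010010, 0b1100)

# Each descriptor is 8 bytes, so selectors are byte offsets 0, 8 and 16.
code32_sel, data32_sel = 1 * 8, 2 * 8

print(hex(code_desc), hex(data_desc), code32_sel, data32_sel)
```

    The selector values correspond to CODE32_SEL and DATA32_SEL, which boot.asm computes as label differences from gdt_start.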
    

    First build kernel.bin from your code in kernel.asm:

    nasm -f bin kernel.asm -o kernel.bin
    

    Create a 1.44MiB floppy (disk.img) that contains the code and data in kernel.bin and a Volume Boot Record (VBR) that reads the kernel into memory:

    nasm -f bin boot.asm -o disk.img
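
    To make the loading step concrete, here is a hedged Python sketch of what the boot sector does with the image: it rounds the kernel size up to whole sectors, converts each LBA to CHS for Int 13h, and advances the load segment by 0x20 per sector. The constants are copied from boot.asm; the function names num_sectors, lba_to_chs, and load_plan are illustrative.

```python
SECTOR_SIZE = 512
SECTORS_PER_TRACK = 18          # 1.44MB floppy geometry used by boot.asm
NUM_HEADS = 2
STAGE2_LBA_START = 1            # sector right after the boot sector
STAGE2_LOAD_SEG = 0x8000 >> 4   # segment 0x0800

def num_sectors(size_bytes: int) -> int:
    """Round the stage2 size up to whole sectors, like NUM_STAGE2_SECTORS."""
    return (size_bytes + SECTOR_SIZE - 1) // SECTOR_SIZE

def lba_to_chs(lba: int) -> tuple:
    """Return (cylinder, head, sector) for a logical block address."""
    track, sector = divmod(lba, SECTORS_PER_TRACK)
    cylinder, head = divmod(track, NUM_HEADS)
    return cylinder, head, sector + 1   # sectors are numbered from 1

def load_plan(size_bytes: int) -> list:
    """List (lba, chs, segment) for every sector the boot sector would read."""
    plan = []
    for i in range(num_sectors(size_bytes)):
        lba = STAGE2_LBA_START + i
        segment = STAGE2_LOAD_SEG + i * (SECTOR_SIZE >> 4)
        plan.append((lba, lba_to_chs(lba), segment))
    return plan

# The 11-byte kernel.bin from the question fits in a single sector.
print(load_plan(11))
```

    For the 11-byte kernel this yields a single read of LBA 1 (cylinder 0, head 0, sector 2) into segment 0x0800.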
    

    It can be run in QEMU from the floppy disk image using:

    qemu-system-i386 -fda disk.img
    

    As you continue to develop your kernel, this version of the code may eventually require an ORG 0x8000 in your kernel, whereas the Multiboot version presented earlier may have required ORG 0x101000. An example of code that won't work if you don't properly specify the ORG (origin point) is as follows:

    ; Example program that uses an absolute reference to a label
    ; that won't work unless a proper ORG is used. Removing the ORG
    ; or using the wrong value will cause the code to not work as
    ; expected
    
    org 0x8000
    bits 32
    start:
        mov eax, [okmsg]          ; Using an absolute reference to a label
        mov dword [0xb8000], eax  ; Write value to display
        hlt
    
    okmsg: dd  0x2f4b2f4f