Tags: macos, assembly, nasm, osdev, multiboot

NASM and clang/LLVM generating different object files


I'm trying to make a simple kernel with Multiboot. I got the Multiboot header working in NASM, but now I'm trying to rewrite it in GNU AS syntax. I think the problem is that clang (which is what as is on macOS) is placing the Multiboot header at a different address (beyond the first 8K), but I can't figure out how to get it to behave the same as with NASM. I'm using the same linker script for both.

Below is my NASM code, GAS code, linker script, and the output of nm kernel-nasm.bin kernel-gas.bin (sorry for the verbosity).

Here's the working NASM code:

MBALIGN  equ  1 << 0
MEMINFO  equ  1 << 1
FLAGS    equ  MBALIGN | MEMINFO
MAGIC    equ  0x1BADB002
CHECKSUM equ -(MAGIC + FLAGS)
 
section .multiboot_header
header_start:
align 4
    dd MAGIC
    dd FLAGS
    dd CHECKSUM
header_end:

section .text
global start
start:
    mov dword [0xb8000], 0x2f4b2f4f
    hlt

And here's the non-working GNU AS code:

.set MBALIGN,  1 << 0
.set MEMINFO, 1 << 1
.set FLAGS, MBALIGN | MEMINFO
.set MAGIC, 0x1BADB002
.set CHECKSUM, -(MAGIC + FLAGS)
 
.section .multiboot_header
header_start:
.align 4
    .long MAGIC
    .long FLAGS
    .long CHECKSUM
header_end:

.section .text
.global start
start:
    movl $0x2f4b2f4f, (0xb8000)
    hlt

Linker Script:

ENTRY(start)

SECTIONS {
    . = 1M;

    .boot : ALIGN(4K)
    {
        /* ensure that the multiboot header is at the beginning */
        *(.multiboot_header)
    }

    .text : ALIGN (4K)
    {
        *(.text)
    }
}

Output of nm kernel-nasm.bin kernel-gas.bin:

kernel-nasm.bin:
e4524ffb a CHECKSUM
00000003 a FLAGS
1badb002 a MAGIC
00000001 a MBALIGN
00000002 a MEMINFO
0010000c r header_end
00100000 r header_start
00101000 T start

kernel-gas.bin:
e4524ffb a CHECKSUM
00000003 a FLAGS
1badb002 a MAGIC
00000001 a MBALIGN
00000002 a MEMINFO
0000000c n header_end
00000000 n header_start
00100000 T start

Here are the commands I'm using to assemble and link the code. I'm using Homebrew's LLVM 14.0.6 on macOS:

# For kernel-nasm.bin
nasm -felf32 kernel-nasm.asm -o kernel-nasm.o
ld.lld -n -o kernel-nasm.bin -T linker.ld kernel-nasm.o

# For kernel-gas.bin
as --target=i386-pc-none-elf kernel-gas.S -o kernel-gas.o
ld.lld -n -o kernel-gas.bin -T linker.ld kernel-gas.o

As you can see from the --target= option, as on this machine is clang, not the assembler from GNU Binutils. Likewise, the ld.lld linker is LLVM's, not the one from Binutils.

The output of objdump -x kernel-nasm.bin is:

kernel-nasm.bin:     file format elf32-i386
kernel-nasm.bin
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00101000

Program Header:
    LOAD off    0x00001000 vaddr 0x00100000 paddr 0x00100000 align 2**12
         filesz 0x0000000c memsz 0x0000000c flags r--
    LOAD off    0x00002000 vaddr 0x00101000 paddr 0x00101000 align 2**12
         filesz 0x0000000b memsz 0x0000000b flags r-x
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**0
         filesz 0x00000000 memsz 0x00000000 flags rw-

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .boot         0000000c  00100000  00100000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000000b  00101000  00101000  00002000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .comment      0000001c  00000000  00000000  0000200b  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
00000000 l    df *ABS*  00000000 hdr.asm
00000001 l       *ABS*  00000000 MBALIGN
00000002 l       *ABS*  00000000 MEMINFO
00000003 l       *ABS*  00000000 FLAGS
1badb002 l       *ABS*  00000000 MAGIC
e4524ffb l       *ABS*  00000000 CHECKSUM
00100000 l       .boot  00000000 header_start
0010000c l       .boot  00000000 header_end
00101000 g       .text  00000000 start

The output of objdump -x kernel-gas.bin is:

kernel-gas.bin:     file format elf32-i386
kernel-gas.bin
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00100000

Program Header:
    LOAD off    0x00001000 vaddr 0x00100000 paddr 0x00100000 align 2**12
         filesz 0x0000000b memsz 0x0000000b flags r-x
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**0
         filesz 0x00000000 memsz 0x00000000 flags rw-

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .boot         0000000c  00000000  00000000  00002000  2**12
                  CONTENTS, READONLY
  1 .comment      0000001c  00000000  00000000  0000200c  2**0
                  CONTENTS, READONLY
  2 .text         0000000b  00100000  00100000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
SYMBOL TABLE:
e4524ffb l       *ABS*  00000000 CHECKSUM
00000003 l       *ABS*  00000000 FLAGS
1badb002 l       *ABS*  00000000 MAGIC
00000001 l       *ABS*  00000000 MBALIGN
00000002 l       *ABS*  00000000 MEMINFO
0000000c l       .boot  00000000 header_end
00000000 l       .boot  00000000 header_start
00100000 g       .text  00000000 start

Solution

  • According to the GNU AS documentation, "If the section name is not recognized, the default will be for the section to have none of the above flags: it will not be allocated in memory, nor writable, nor executable. The section will contain data."

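    That matches your objdump output: in the NASM build, .boot is marked CONTENTS, ALLOC, LOAD and sits at VMA 0x00100000 (file offset 0x1000), while in the GAS build it is only CONTENTS, READONLY, has a VMA of 0, is not part of any LOAD segment, and ends up at file offset 0x2000, which is past the first 8K of the file.

    To make sure the .multiboot_header input section (and with it the .boot output section) is loaded into memory and can be found by the bootloader, the section must have the "a" (allocatable) flag added to it (see the .section documentation quoted above). Like this: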

    // ... code ...
     
    .section .multiboot_header, "a"
    header_start:
    .align 4
        .long MAGIC
        .long FLAGS
        .long CHECKSUM
    header_end:
    
    // ... code ...
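
To check the result without booting, a small script can scan the linked file the way a Multiboot 1 loader would. The following is only a rough sketch, not part of the original answer: it relies on the Multiboot 1 rules that the header must be 32-bit aligned, lie entirely within the first 8192 bytes of the image, and satisfy magic + flags + checksum == 0 (mod 2^32). The names find_multiboot_header and make_image are hypothetical helpers, and the two images are synthetic stand-ins that only reuse the .boot file offsets from the objdump listings in the question (0x1000 for the NASM build, 0x2000 for the GAS build).

```python
import struct

MULTIBOOT_MAGIC = 0x1BADB002
SEARCH_LIMIT = 8192  # Multiboot 1: the header must lie within the first 8192 bytes


def find_multiboot_header(image: bytes):
    """Return (file_offset, flags) of the first valid Multiboot 1 header, or None."""
    limit = min(len(image), SEARCH_LIMIT)
    # The header is 32-bit aligned, so only offsets divisible by 4 are candidates,
    # and all 12 bytes (magic, flags, checksum) must fit below the limit.
    for offset in range(0, limit - 11, 4):
        magic, flags, checksum = struct.unpack_from("<III", image, offset)
        # The three fields must sum to zero modulo 2**32.
        if magic == MULTIBOOT_MAGIC and (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return offset, flags
    return None


def make_image(header_offset: int) -> bytes:
    """Hypothetical helper: zero padding followed by the header from the question."""
    flags = 0x3  # MBALIGN | MEMINFO
    checksum = -(MULTIBOOT_MAGIC + flags) & 0xFFFFFFFF  # 0xE4524FFB, as in the nm output
    return bytes(header_offset) + struct.pack("<III", MULTIBOOT_MAGIC, flags, checksum)


# Synthetic images using the .boot file offsets from the two objdump listings.
print(find_multiboot_header(make_image(0x1000)))  # NASM layout -> (4096, 3)
print(find_multiboot_header(make_image(0x2000)))  # GAS layout without "a" -> None
```

To run it against a real build, pass the file contents instead, for example find_multiboot_header(open("kernel-gas.bin", "rb").read()). After adding the "a" flag, the expectation is that it reports an offset below 8192 rather than None.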