Tags: assembly, linker, arm, qemu, arm-none-eabi-gcc

Why does changing the order of object files using the arm-none-eabi-ld linker change the executable behavior?


I am getting different behavior when using

arm-none-eabi-ld -T t.ld -o t.elf t.o ts.o

to link my object files, vs

arm-none-eabi-ld -T t.ld -o t.elf ts.o t.o

where the object files 't.o' and 'ts.o' are transposed in the command. The latter version behaves correctly while the former does not. The difference appears to be that the stack pointer in my program is set incorrectly with the first version, and I would like to know why this is the case.

Here are the source files, the linker script, and the build script I am using.

t.ld

ENTRY(start) /* define start as the entry address */
SECTIONS
{
    . = 0x10000; /* loading address, required by QEMU */
    .text : { *(.text) }
    .data : { *(.data) }
    .bss : { *(.bss) }
    . = ALIGN(8);
    . = . + 0x1000;
    stack_top = .;
}

t.c

int g = 100; // initialized global

extern int sum(int a, int b, int c, int d, int e, int f);

int main() {
    int a, b, c, d, e, f; // local variables
    a = b = c = d = e = f = 1; // values do not matter
    g = sum(a, b, c, d, e, f); // call sum()
}

ts.s

/*
    Assembly file to define sum()
 */
    .global start, sum
start:
    ldr sp, =stack_top // set sp to stack top
    bl main // call main()

stop: b stop // loop

sum:
    // establish stack frame
    stmfd sp!, {fp, lr} // push lr and fp
    add fp, sp, #4 // fp -> saved lr on stack
    // compute sum of all 6 parameters
    add r0, r0, r1 // r0 = a + b
    add r0, r0, r2 // r0 = a + b + c
    add r0, r0, r3 // r0 = a + b + c + d
    ldr r3, [fp, #4] // r3 = e (first stacked argument)
    add r0, r0, r3 // r0 = a + b + c + d + e
    ldr r3, [fp, #8] // r3 = f (second stacked argument)
    add r0, r0, r3 // r0 = a + b + c + d + e + f
    // return
    sub sp, fp, #4 // point stack pointer to saved fp
    ldmfd sp!, {fp, pc} // return to caller

mk.sh (with linker command that yields expected results)

arm-none-eabi-as -o ts.o ts.s # assemble ts.s
arm-none-eabi-gcc -c t.c # cross-compile t.c into t.o
arm-none-eabi-ld -T t.ld -o t.elf ts.o t.o # link object files into t.elf
arm-none-eabi-objcopy -O binary t.elf t.bin # convert t.elf to t.bin

After running the binary with

qemu-system-arm -M versatilepb -kernel t.bin -nographic -serial /dev/null

I get the following, where the stack pointer (R13) is correct:

(qemu) info registers
R00=00000000 R01=00000001 R02=000100c0 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=000110c8 R14=00010008 R15=00010008
PSR=400001d3 -Z-- A svc32
FPSCR: 00000000

Compare that with the results when linking with the object files transposed:

(qemu) info registers
R00=00000000 R01=00000183 R02=00000100 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=f3575ee4
R12=00000000 R13=f3575ec0 R14=00010060 R15=00010000
PSR=400001d3 -Z-- A svc32
FPSCR: 00000000

Here the stack pointer (R13) is clearly outside the memory range of the program.


Solution

  • Even simpler:

    flash.s

    .global _start
    _start:
        ldr sp,=0x11000
        bl main
        b .
    

    flash.ld

    ENTRY(_start)
    
    MEMORY
    {
        ram : ORIGIN = 0x10000, LENGTH = 0x1000
    }
    SECTIONS
    {
        .text   : { *(.text*)   } > ram
        .rodata : { *(.rodata*) } > ram
        .bss    : { *(.bss*)    } > ram
        .data   : { *(.data*)   } > ram
    }
    

    so.c

    int  main ( void )
    {   
        return 5;
    }
    

    build

    arm-none-eabi-as --warn --fatal-warnings  flash.s -o flash.o
    arm-none-eabi-gcc -c -Wall -O2 -ffreestanding  so.c -o so.o
    arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld flash.o so.o -o one.elf
    arm-none-eabi-objdump -D one.elf > one.list
    arm-none-eabi-objcopy -O binary one.elf one.bin
    arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld so.o flash.o -o two.elf
    arm-none-eabi-objdump -D two.elf > two.list
    arm-none-eabi-objcopy -O binary two.elf two.bin
    

    Examine:

    one.elf:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00010000 <_start>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000000    bl  1000c <main>
       10008:   eafffffe    b   10008 <_start+0x8>
    
    0001000c <main>:
       1000c:   e3a00005    mov r0, #5
       10010:   e12fff1e    bx  lr
    
    
    two.elf:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00010000 <main>:
       10000:   e3a00005    mov r0, #5
       10004:   e12fff1e    bx  lr
    
    00010008 <_start>:
       10008:   e3a0da11    mov sp, #69632  ; 0x11000
       1000c:   ebfffffb    bl  10000 <main>
       10010:   eafffffe    b   10010 <_start+0x8>
    

    If you run it as a raw .bin file, the bootstrap code has to sit at the load address, 0x10000. Unless you name sections or object files in the linker script, or in some other way tell the linker to put something specific there, the tool goes by what you provide on the command line and processes the files in that order. So if the bootstrap object is first on the command line, the entry point lands at 0x10000 and the image works. If you put something else first, that code runs instead, which is not expected to work at all and ideally crashes in some way. In the question, linking t.o first puts main at 0x10000, so main runs before start has had a chance to set sp, and the stack pointer holds garbage.
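    A quick way to see what the CPU will execute first in a raw image is to dump its first word and compare it with the disassembly. This is only a sketch: the helper name first_word is made up for illustration, and it assumes a little-endian host so that the printed word matches the little-endian ARM encoding.

    ```shell
    # Hypothetical helper: print the first 32-bit word of a raw binary image as hex.
    # Assumes a little-endian host, so the word reads the same as in the objdump listing.
    first_word() {
        od -An -tx4 -N4 "$1" | tr -d ' \n'
    }

    # Usage, with the files built above:
    #   first_word one.bin   # e3a0da11 would be "mov sp, #0x11000", the first instruction of _start
    #   first_word two.bin   # e3a00005 would be "mov r0, #5", meaning main ended up first
    ```

    If the first word is not the first instruction of the bootstrap, the .bin is not going to start correctly no matter what the ELF entry point says.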

    Now, qemu also accepts elf files. It may or may not honor the entry point recorded in the elf file, so a badly ordered image might happen to work if you specify the entry point in the linker script. But when you then take the raw binary version of that image (objcopy -O binary to a .bin), it will fail on hardware. Unless the code is being loaded by an elf loader or something similar (an operating system, or a sim environment like this one that supports all of that), just build the file correctly. (Understand that for cortex-m sims qemu does, or did, look at the lsbit of the entry address to properly start a cortex-m, so there you NEED it.)

    arm-none-eabi-nm -a one.elf | grep start
    00010000 T _start
    arm-none-eabi-nm -a two.elf | grep start
    00010008 T _start
    

    You should be able to remove the ENTRY in the above example and have one.bin just work, but two.bin will not. Maybe with the ENTRY() two.elf will work, but that is not really something you should rely on.

    When building something bare-metal, always check where the entry point of the code ends up, based on what the hardware (or sim) expects, to confirm that you have built the binary correctly before trying to execute it. For any new project, or any change in the build infrastructure, examine the toolchain output.
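    That check can be scripted so it runs on every build. The following is a minimal sketch, assuming the image is loaded at 0x10000 as in this example; the helper names symbol_addr and check_entry are made up for illustration and are not part of any toolchain. It looks up the entry symbol in the nm output and compares its address with the load address.

    ```shell
    # Hypothetical helper: print the address of symbol $1, reading nm output on stdin.
    # nm lines for defined symbols look like "00010000 T _start" (address, type, name).
    symbol_addr() {
        awk -v sym="$1" '$3 == sym { print $1; exit }'
    }

    # Hypothetical helper: succeed only if symbol $2 in ELF file $1 sits at address $3.
    # The address is compared as the 8-digit hex string that nm prints.
    check_entry() {
        addr=$(arm-none-eabi-nm "$1" | symbol_addr "$2")
        if [ "$addr" = "$3" ]; then
            echo "OK: $2 is at 0x$addr"
        else
            echo "BAD: $2 is at 0x$addr, expected 0x$3" >&2
            return 1
        fi
    }

    # Usage, with the files built above:
    #   check_entry one.elf _start 00010000   # expected to pass
    #   check_entry two.elf _start 00010000   # expected to fail, _start is at 00010008
    ```

    Calling check_entry right after the link step in the build script makes a wrong object file order fail the build instead of failing at run time.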

    Note that if you are controlling the linker script then you do not need _start. Even if you are not (something-ld -Ttext=0x1000 -Tdata=0x2000) you don't need it; the linker may give a warning in the latter case, but who cares. _start is the entry point named in the stock linker scripts. Once you write your own and stop using the stock ones, you pick the name of the entry point, and other things, as desired.

    I find it wasteful, because it is trivial to just get the command line right, but you will see folks do this:

    flash.s

    .section .init
    
        ldr sp,=0x11000
        bl main
        b .
    
    .section .text
    
    hello:
        b hello
    

    flash.ld

    MEMORY
    {
        ram : ORIGIN = 0x10000, LENGTH = 0x1000
    }
    SECTIONS
    {
        .init   : { *(.init*)   } > ram
        .text   : { *(.text*)   } > ram
        .rodata : { *(.rodata*) } > ram
        .bss    : { *(.bss*)    } > ram
        .data   : { *(.data*)   } > ram
    }
    

    Build is the same:

    one.elf:     file format elf32-littlearm
    
    
    Disassembly of section .init:
    
    00010000 <.init>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000001    bl  10010 <main>
       10008:   eafffffe    b   10008 <hello-0x4>
    
    Disassembly of section .text:
    
    0001000c <hello>:
       1000c:   eafffffe    b   1000c <hello>
    
    00010010 <main>:
       10010:   e3a00005    mov r0, #5
       10014:   e12fff1e    bx  lr
    
    two.elf:     file format elf32-littlearm
    
    
    Disassembly of section .init:
    
    00010000 <.init>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000000    bl  1000c <main>
       10008:   eafffffe    b   10008 <main-0x4>
    
    Disassembly of section .text:
    
    0001000c <main>:
       1000c:   e3a00005    mov r0, #5
       10010:   e12fff1e    bx  lr
    
    00010014 <hello>:
       10014:   eafffffe    b   10014 <hello>
    

    You can see that hello and main swap places within .text depending on the command line order, but .init always comes first because the linker script calls it out specifically before .text.

    I find this an ugly hack, YMMV. An even uglier hack is this:

    flash.s

    ldr sp,=0x11000
    bl main
    b .
    

    flash.ld

    MEMORY
    {
        ram : ORIGIN = 0x10000, LENGTH = 0x1000
    }
    SECTIONS
    {
        .hello  : { flash.o (.text*)  } > ram
        .text   : { *(.text*)   } > ram
        .rodata : { *(.rodata*) } > ram
        .bss    : { *(.bss*)    } > ram
        .data   : { *(.data*)   } > ram
    }
    

    gives:

    one.elf:     file format elf32-littlearm
    
    
    Disassembly of section .hello:
    
    00010000 <.hello>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000000    bl  1000c <main>
       10008:   eafffffe    b   10008 <main-0x4>
    
    Disassembly of section .text:
    
    0001000c <main>:
       1000c:   e3a00005    mov r0, #5
       10010:   e12fff1e    bx  lr
    
    two.elf:     file format elf32-littlearm
    
    
    Disassembly of section .hello:
    
    00010000 <.hello>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000000    bl  1000c <main>
       10008:   eafffffe    b   10008 <main-0x4>
    
    Disassembly of section .text:
    
    0001000c <main>:
       1000c:   e3a00005    mov r0, #5
       10010:   e12fff1e    bx  lr
    

    As mentioned from the start: if you specifically call something out in the linker script, that takes precedence; otherwise the linker uses the command line order (I have seen exceptions to that). At the end of the day, always examine the disassembly when creating a new project or changing the build, to see that it is making a binary that will run (entry point at the right place if it is a fixed address, interworking done right for the hand assembly parts, etc.).

    Note that:

    .text   : { *(.text*)   } > ram
    

    The .text name on the left is the output section name, and it can be whatever you want. Most folks keep the conventional name because it means something, but the choice is yours. The names on the right are the input section names: the compiler uses .text, .bss, .data and others, so those you have to get correct.

    MEMORY
    {
        ram : ORIGIN = 0x10000, LENGTH = 0x1000
    }
    SECTIONS
    {
        .hello  : { flash.o (.text*)  } > ram
        .world  : { *(.text*)   } > ram
    }
    
    Disassembly of section .hello:
    
    00010000 <.hello>:
       10000:   e3a0da11    mov sp, #69632  ; 0x11000
       10004:   eb000000    bl  1000c <main>
       10008:   eafffffe    b   10008 <main-0x4>
    
    Disassembly of section .world:
    
    0001000c <main>:
       1000c:   e3a00005    mov r0, #5
       10010:   e12fff1e    bx  lr
    

    nm, readelf and others are just fine with this. Loader tools, like an operating system or maybe qemu with an elf file, may or may not want to see .bss, .data, etc. You have to deal with that on a case by case basis. Most folks just use the conventional names.

    Note that the ram name in the MEMORY block is whatever you want to make it as well. You could call it banana instead of ram or rom or flash or ... that you see other folks use.