rust embedded riscv objcopy

How do I get the entry point address of a binary image file generated by objcopy?


I am building an emulator of the RISC-V CPU for my own educational purposes. I have a small POC working and want to build example programs and test them on the emulator.

I'm trying to build an example program in Rust and I seem to have made decent progress, but I got stuck at the point where I have to load the compiled program into the memory of my emulator and transfer CPU execution to that program.

Test program:

#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[no_mangle]
pub extern "C" fn _start() -> ! {
    loop {
        for i in 0..1000 {
            unsafe {
                let r = i as *mut u32;
                // This can panic because (500 - i) can be 0
                *r = 20000 % (500 - i);
            }
        }
    }
}

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

build:

$ cargo build --target riscv32i-unknown-none-elf --release

generating binary image from elf target:

riscv32-unknown-linux-gnu-objcopy -g -O binary \
  target/riscv32i-unknown-none-elf/release/sample1 \
  target/riscv32i-unknown-none-elf/release/sample1.bin

This works fine so far and generates a binary file of 5156 bytes.

I inspected the .bin file and it looks like a legitimate binary to me. I found some readable strings at the beginning of the file (like attempt to calculate the remainder with a divisor of zero), which look related to the code that handles the panic that can happen when I compute % 0. At the end of the file I found something that looks like RV32I instructions (easy to spot, since the two least significant bits are 11). The rest of the file is filled with zeros.

Where I am stuck is that I cannot figure out:

  1. At which address am I supposed to load this bin image file into the memory of my virtual CPU? I don't think it's OK to load it at address 0x0, because there is useful data at the beginning of the image and I don't think the program should have to read it from address 0x0.
  2. After the program is loaded, I need to transfer CPU execution to the entry point of my program (_start). How can I find out which address is the entry point, so I can put this address into the pc register before starting CPU cycles? It's obviously not at the beginning of the image (there are human-readable strings there).
  3. Is there a way to make this entry point address stable, so that all programs I write have the same entry point address and I don't have to tweak things for each program I compile?

I may have gone the wrong way by using objcopy. If that's the case, please let me know what the appropriate way is to load an ELF file into a homemade CPU emulator.

Update: Linker arguments (as provided by RUSTFLAGS="-Z print-link-args" cargo build --target riscv32i-unknown-none-elf --release --verbose):

rust-lld \
    -flavor \
    gnu \
    -L \
    /home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib \
    /mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819.sample1.251h7tq6-cgu.0.rcgu.o \
    /mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819.sample1.251h7tq6-cgu.1.rcgu.o -o \
    /mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819 \
    --gc-sections \
    -L \
    /mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps \
    -L \
    /mnt/c/src/ws/cpu/sample1/target/release/deps \
    -L \
    /home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib \
    -Bstatic \
    /home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/librustc_std_workspace_core-6d1cf467df9db3bb.rlib \
    /home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/libcore-a1a0b4993598bfe4.rlib \
    /home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/libcompiler_builtins-a229bbbccd019775.rlib \
    -Bdynamic

I know that some important things are missing from the program, like initializing the stack pointer register. I'm planning to take care of that after I figure out the loading logic.


Solution

  • Disclaimer: I am not familiar with Rust, but your question is more related to the ELF file format and tools that can understand it - my two cents.

    1. The address at which you should load your binary file is determined by the linker settings rust-lld is using.

    For example, this documentation describes a file memory.x defining the memory map used by the linker:

    MEMORY
    {
      RAM : ORIGIN = 0x80000000, LENGTH = 16K
      FLASH : ORIGIN = 0x20000000, LENGTH = 16M
    }
    
    REGION_ALIAS("REGION_TEXT", FLASH);
    REGION_ALIAS("REGION_RODATA", FLASH);
    REGION_ALIAS("REGION_DATA", RAM);
    REGION_ALIAS("REGION_BSS", RAM);
    REGION_ALIAS("REGION_HEAP", RAM);
    REGION_ALIAS("REGION_STACK", RAM);
    

    In this example, the code and read-only data are placed in FLASH, so the resulting binary should probably be loaded at address 0x20000000.

    There should be an equivalent with the toolchain you are using.
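On the emulator side, loading the flat image at its link address could look roughly like the sketch below. It is only an illustration: the function name `load_image`, the idea that emulated RAM is a single byte slice starting at guest address `ram_base`, and the 64-byte RAM size are all assumptions, not something your toolchain dictates. The base address 0x20000000 is borrowed from the memory.x example above.

```rust
/// Copies a flat binary image (as produced by `objcopy -O binary`) into
/// emulated RAM. `ram_base` is the guest address that maps to `ram[0]`;
/// `load_addr` is the guest address the image was linked to start at.
/// Both the name and the memory model are hypothetical.
fn load_image(ram: &mut [u8], ram_base: u32, load_addr: u32, image: &[u8]) -> Result<(), String> {
    // Translate the guest address into an index into the RAM slice.
    let offset = load_addr
        .checked_sub(ram_base)
        .ok_or_else(|| format!("load address {:#x} is below RAM base {:#x}", load_addr, ram_base))?;
    let offset = offset as usize;

    // Reject images that would run past the end of emulated RAM.
    let end = offset
        .checked_add(image.len())
        .filter(|&end| end <= ram.len())
        .ok_or_else(|| format!("image of {} bytes does not fit in RAM", image.len()))?;

    ram[offset..end].copy_from_slice(image);
    Ok(())
}

fn main() {
    // Tiny 64-byte RAM mapped at 0x2000_0000 (illustrative values only).
    let mut ram = vec![0u8; 64];
    // 0x00000013 is the RV32I encoding of `addi x0, x0, 0` (a nop), little-endian.
    let image = [0x13u8, 0x00, 0x00, 0x00];
    match load_image(&mut ram, 0x2000_0000, 0x2000_0010, &image) {
        Ok(()) => println!("loaded {} bytes at offset 0x10", image.len()),
        Err(e) => eprintln!("load failed: {}", e),
    }
}
```

Whatever base address you choose, the emulator and the linker script have to agree on it, because absolute addresses inside the image are fixed at link time.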

    2. You can find the address of _start using a tool that understands the ELF file format.

    For example, aarch64-none-elf-nm on one of my executables compiled for Aarch64 will display:

    aarch64-none-elf-nm h5-example.elf
    0000000042000078 t $d
    0000000042000000 t $x
    0000000042000080 t $x
    00000000420001dc t $x
    00000000420001f4 t $x
    0000000042000230 B __bss_end__
    0000000042000230 B __bss_start__
    0000000042000080 T c_entry
    000000004200022c D __copy_table_end__
    0000000042000220 D __copy_table_start__
    0000000042000230 D __data_end__
    0000000042000230 D __data_start__
    0000000042000230 ? __end__
    0000000042000230 B __etext
    0000000042000218 T __exidx_end
    0000000042000218 T __exidx_start
    0000000042000230 d __fini_array_end
    0000000042000230 d __fini_array_start
    0000000046000230 ? __HeapLimit
    0000000004000000 A __HEAP_SIZE
    0000000042000230 d __init_array_end
    0000000042000230 d __init_array_start
    00000000420001f4 T main
    0000000042000000 A __RAM_BASE
    000000000e000000 A __RAM_SIZE
    0000000042000000 T Reset_Handler
    0000000000000000 A __ROM_BASE
    0000000000000000 A __ROM_SIZE
    000000004c000000 ? __StackLimit
    0000000004000000 A __STACK_SIZE
    0000000050000000 ? __StackTop
    00000000420001dc t system_read_CurrentEL
    0000000042000230 B __zero_table_end__
    0000000042000230 B __zero_table_start__
    

    In my case, the first instruction to be executed would be at Reset_Handler. I could retrieve the line referencing it using the following command:

    aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$'
    0000000042000000 T Reset_Handler
    

    and its exact address in hexadecimal using:

    aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$' | cut -d ' ' -f1
    0000000042000000
    
    RESET_HANDLER=$(aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$' | cut -d ' ' -f1)
    echo ${RESET_HANDLER}
    

    would of course display:

    0000000042000000
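If you would rather not shell out to nm, the same information is usually available from the ELF header itself: the `e_entry` field holds the entry point address, and linkers typically set it to `_start` when no other entry symbol is specified. The following is a minimal sketch, assuming a 32-bit little-endian ELF file such as one built for riscv32i; the function name `elf32_entry` and the fake header in `main` are made up for illustration.

```rust
/// Reads the `e_entry` field from a 32-bit little-endian ELF header.
/// Returns `None` if the buffer is too short or is not ELF32 / little-endian.
/// (Hypothetical helper, not part of any library.)
fn elf32_entry(elf: &[u8]) -> Option<u32> {
    // Layout: e_ident (16 bytes), e_type (2), e_machine (2), e_version (4),
    // then e_entry (4 bytes in ELF32) at offset 24.
    const E_ENTRY_OFFSET: usize = 24;

    if elf.len() < E_ENTRY_OFFSET + 4 {
        return None;
    }
    // Magic number: 0x7f 'E' 'L' 'F'.
    if !elf.starts_with(b"\x7fELF") {
        return None;
    }
    // e_ident[4] = EI_CLASS (1 = 32-bit), e_ident[5] = EI_DATA (1 = little-endian).
    if elf[4] != 1 || elf[5] != 1 {
        return None;
    }
    Some(u32::from_le_bytes([
        elf[E_ENTRY_OFFSET],
        elf[E_ENTRY_OFFSET + 1],
        elf[E_ENTRY_OFFSET + 2],
        elf[E_ENTRY_OFFSET + 3],
    ]))
}

fn main() {
    // Build a fake 52-byte ELF32 header with an arbitrary entry address,
    // just to exercise the parser; a real emulator would read the file instead.
    let mut header = vec![0u8; 52];
    header[..4].copy_from_slice(b"\x7fELF");
    header[4] = 1; // ELFCLASS32
    header[5] = 1; // ELFDATA2LSB
    header[24..28].copy_from_slice(&0x0001_1000u32.to_le_bytes());

    match elf32_entry(&header) {
        Some(entry) => println!("entry point: {:#010x}", entry),
        None => println!("not a 32-bit little-endian ELF file"),
    }
}
```

This only recovers the entry address. To load the ELF file directly instead of going through objcopy, the emulator would also need to walk the program headers and copy each loadable segment to its address.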
    

    Now that the start address is known, there are several options for using it in your DIY emulator. The two that come to mind are:

    a) pass the address as an argument to your emulator, i.e.:

    my-emulator 0000000042000000 or my-emulator -s 0000000042000000

    b) since you control the format of the image your emulator loads, you could adopt the convention of always prepending the start address to the binary file produced by objcopy. The emulator would then read the first 4 or 8 bytes of the file to get the start address, and treat the remaining bytes as the program image.

    An easy way to do so would be, for example, to use xxd and cat:

    echo 0000000042000000 | xxd -r -p > final-image.bin
    cat sample1.bin >> final-image.bin
    

    Using an example file containing 'ABCD', we would get:

    printf "ABCD" > sample1.bin
    hexdump -C sample1.bin
    00000000  41 42 43 44                                       |ABCD|
    00000004
    
    echo 0000000042000000 | xxd -r -p > final-image.bin
    hexdump -C final-image.bin
    
    00000000  00 00 00 00 42 00 00 00                           |....B...|
    00000008
    
    cat sample1.bin >> final-image.bin
    hexdump -C final-image.bin
    00000000  00 00 00 00 42 00 00 00  41 42 43 44              |....B...ABCD|
    0000000c
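On the emulator side, reading such an image back could be sketched as follows. The layout assumed here is exactly what the xxd command above produces: an 8-byte start address in big-endian byte order, followed by the raw objcopy output. The function name `split_image` is hypothetical.

```rust
/// Splits an image laid out as [8-byte big-endian start address][payload]
/// into the start address and the payload bytes.
/// Returns `None` if the image is too short to contain the header.
fn split_image(image: &[u8]) -> Option<(u64, &[u8])> {
    const HEADER_LEN: usize = 8;
    if image.len() < HEADER_LEN {
        return None;
    }
    let (header, payload) = image.split_at(HEADER_LEN);
    let mut addr = [0u8; HEADER_LEN];
    addr.copy_from_slice(header);
    // xxd -r -p writes the hex digits in the order given, i.e. big-endian.
    Some((u64::from_be_bytes(addr), payload))
}

fn main() {
    // Same bytes as final-image.bin in the hexdump above.
    let image = [
        0x00, 0x00, 0x00, 0x00, 0x42, 0x00, 0x00, 0x00, // start address
        0x41, 0x42, 0x43, 0x44, // payload "ABCD"
    ];
    if let Some((start, payload)) = split_image(&image) {
        println!("start = {:#x}, payload = {} bytes", start, payload.len());
    }
}
```

For a 32-bit target a 4-byte header would be enough; the only requirement is that the tool writing the header and the emulator reading it agree on its width and byte order.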
    

    You could of course define a more complicated header, maybe containing some other important symbols, or add more command-line options to your emulator. The basic principle would remain the same.

    3. Yes, you could probably force your compiler to put the _start() function into a specific linker section, as described here, using the link_section attribute:

    Program:

    #[no_mangle]
    pub unsafe extern "C" fn Reset() -> ! {
        let _x = 42;
    
        // can't return so we go into an infinite loop here
        loop {}
    }
    
    // The reset vector, a pointer into the reset handler
    #[link_section = ".vector_table.reset_vector"]
    #[no_mangle]
    pub static RESET_VECTOR: unsafe extern "C" fn() -> ! = Reset;
    

    Linker script:

    /* Memory layout of the LM3S6965 microcontroller */
    /* 1K = 1 KiB = 1024 bytes */
    MEMORY
    {
      FLASH : ORIGIN = 0x00000000, LENGTH = 256K
      RAM : ORIGIN = 0x20000000, LENGTH = 64K
    }
    
    /* The entry point is the reset handler */
    ENTRY(Reset);
    
    EXTERN(RESET_VECTOR);
    
    SECTIONS
    {
      .vector_table ORIGIN(FLASH) :
      {
        /* First entry: initial Stack Pointer value */
        LONG(ORIGIN(RAM) + LENGTH(RAM));
    
        /* Second entry: reset vector */
        KEEP(*(.vector_table.reset_vector));
      } > FLASH
    
      .text :
      {
        *(.text .text.*);
      } > FLASH
    
      /DISCARD/ :
      {
        *(.ARM.exidx .ARM.exidx.*);
      }
    }
    

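    Note that in this example it is a pointer to Reset, not its code, that ends up in the .vector_table section, which is defined as being the first in the FLASH region (a Cortex-M core reads its initial stack pointer and reset address from there). To pin the code of your _start() function itself, you would apply link_section to the function and list that section first in the linker script.

    The address of _start() would then always be 0x00000000, or whatever address you decide the reset address will be in your CPU: you would just have to modify the address where the FLASH region starts.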

    The example is related to an Arm Cortex-M MCU, and you could replace the .vector_table section by, say, your own .startup section.

    I hope I was not off-track on that one...