I am building an emulator of the RISC-V CPU for my own educational purposes. I have a small POC working and want to build example programs and test them on the emulator.
I'm trying to build an example program in Rust and it seems I have made some decent progress, but I got stuck at the point where I have to load the compiled program into the memory of my emulator and transfer CPU execution to that program.
Test program:
#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[no_mangle]
pub extern "C" fn _start() -> ! {
    loop {
        for i in 0..1000 {
            unsafe {
                let r = i as *mut u32;
                // This can panic because (500 - i) can be 0
                *r = 20000 % (500 - i);
            }
        }
    }
}

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}
Build:
$ cargo build --target riscv32i-unknown-none-elf --release
Generating a binary image from the ELF target:
riscv32-unknown-linux-gnu-objcopy -g -O binary \
target/riscv32i-unknown-none-elf/release/sample1 \
target/riscv32i-unknown-none-elf/release/sample1.bin
This works fine so far and generates a binary file of 5156 bytes.
I inspected the .bin file and it looks like a legitimate binary to me.
I found some readable strings at the beginning of the file (like "attempt to calculate the remainder with a divisor of zero"); they appear to be related to the code that handles the panic which can happen if I do % 0.
At the end of the file I found something that looks like riscv32i instructions (they are easy to notice since the two least significant bits are 11).
The rest of the file is filled with zeros.
Where I am stuck is that I cannot figure out where the entry point (_start) is. How can I find out which address is the entry point so I can put this address into the pc register before starting CPU cycles? It's obviously not at the beginning of the image (there are human-readable strings there).
I may have gone the wrong way when I used objcopy. If that's the case, please let me know the appropriate way to load an ELF file into a homemade CPU emulator.
Update: Linker arguments (as provided by RUSTFLAGS="-Z print-link-args" cargo build --target riscv32i-unknown-none-elf --release --verbose):
rust-lld \
-flavor \
gnu \
-L \
/home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib \
/mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819.sample1.251h7tq6-cgu.0.rcgu.o \
/mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819.sample1.251h7tq6-cgu.1.rcgu.o -o \
/mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps/sample1-4813691a581d1819 \
--gc-sections \
-L \
/mnt/c/src/ws/cpu/sample1/target/riscv32i-unknown-none-elf/release/deps \
-L \
/mnt/c/src/ws/cpu/sample1/target/release/deps \
-L \
/home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib \
-Bstatic \
/home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/librustc_std_workspace_core-6d1cf467df9db3bb.rlib \
/home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/libcore-a1a0b4993598bfe4.rlib \
/home/kris/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/riscv32i-unknown-none-elf/lib/libcompiler_builtins-a229bbbccd019775.rlib \
-Bdynamic
I know that some important things are missing from the program, like initializing the stack pointer register. I'm planning to take care of that after I figure out the loading logic.
Disclaimer: I am not familiar with Rust, but your question is more related to the ELF file format and tools that can understand it - my two cents.
For example, this documentation describes a file memory.x defining the memory map used by the linker:
MEMORY
{
  RAM : ORIGIN = 0x80000000, LENGTH = 16K
  FLASH : ORIGIN = 0x20000000, LENGTH = 16M
}
REGION_ALIAS("REGION_TEXT", FLASH);
REGION_ALIAS("REGION_RODATA", FLASH);
REGION_ALIAS("REGION_DATA", RAM);
REGION_ALIAS("REGION_BSS", RAM);
REGION_ALIAS("REGION_HEAP", RAM);
REGION_ALIAS("REGION_STACK", RAM);
In this example, the resulting binary should probably be loaded at address 0x20000000, the origin of the FLASH region where the text and read-only data are placed.
There should be an equivalent with the toolchain you are using.
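On the emulator side, honouring that load address mostly means translating guest addresses into offsets within the buffer that holds the image. A minimal sketch in Rust, assuming a single contiguous region; the `Memory` type and its methods are hypothetical names, and the base address used in the usage note is simply the FLASH origin from the example above, not something taken from your toolchain:

```rust
/// Hypothetical flat guest memory that starts at `base` instead of 0.
pub struct Memory {
    base: u32,
    bytes: Vec<u8>,
}

impl Memory {
    /// Creates `size` zeroed bytes mapped at guest address `base`.
    pub fn new(base: u32, size: usize) -> Self {
        Memory { base, bytes: vec![0; size] }
    }

    /// Copies a raw image (e.g. the objcopy output) to the start of the region.
    pub fn load_image(&mut self, image: &[u8]) -> Result<(), String> {
        if image.len() > self.bytes.len() {
            return Err(format!("image of {} bytes does not fit", image.len()));
        }
        self.bytes[..image.len()].copy_from_slice(image);
        Ok(())
    }

    /// Reads a little-endian 32-bit word at a guest address,
    /// or returns None if the address is outside the region.
    pub fn read_u32(&self, addr: u32) -> Option<u32> {
        let offset = addr.checked_sub(self.base)? as usize;
        let end = offset.checked_add(4)?;
        let word = self.bytes.get(offset..end)?;
        Some(u32::from_le_bytes([word[0], word[1], word[2], word[3]]))
    }
}
```

With `Memory::new(0x2000_0000, 16 * 1024 * 1024)`, the CPU loop would fetch instructions with `read_u32(pc)` once `pc` has been set to an address inside the region.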
You can find the address of _start using a tool that understands the ELF file format. For example, running aarch64-none-elf-nm on one of my executables compiled for AArch64 will display:
aarch64-none-elf-nm h5-example.elf
0000000042000078 t $d
0000000042000000 t $x
0000000042000080 t $x
00000000420001dc t $x
00000000420001f4 t $x
0000000042000230 B __bss_end__
0000000042000230 B __bss_start__
0000000042000080 T c_entry
000000004200022c D __copy_table_end__
0000000042000220 D __copy_table_start__
0000000042000230 D __data_end__
0000000042000230 D __data_start__
0000000042000230 ? __end__
0000000042000230 B __etext
0000000042000218 T __exidx_end
0000000042000218 T __exidx_start
0000000042000230 d __fini_array_end
0000000042000230 d __fini_array_start
0000000046000230 ? __HeapLimit
0000000004000000 A __HEAP_SIZE
0000000042000230 d __init_array_end
0000000042000230 d __init_array_start
00000000420001f4 T main
0000000042000000 A __RAM_BASE
000000000e000000 A __RAM_SIZE
0000000042000000 T Reset_Handler
0000000000000000 A __ROM_BASE
0000000000000000 A __ROM_SIZE
000000004c000000 ? __StackLimit
0000000004000000 A __STACK_SIZE
0000000050000000 ? __StackTop
00000000420001dc t system_read_CurrentEL
0000000042000230 B __zero_table_end__
0000000042000230 B __zero_table_start__
In my case, the first instruction to be executed would be at Reset_Handler.
I could retrieve the line referencing it using the following command:
aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$'
0000000042000000 T Reset_Handler
and its exact address in hexadecimal using:
aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$' | cut -d ' ' -f1
0000000042000000
RESET_HANDLER=$(aarch64-none-elf-nm h5-example-02.elf | grep ' Reset_Handler$' | cut -d ' ' -f1)
echo ${RESET_HANDLER}
would of course display:
0000000042000000
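If shelling out to nm is inconvenient, the entry point can also be read from the ELF header itself: the linker records it in the e_entry field, which is typically the address of _start unless a linker script says otherwise. A hedged sketch in Rust, assuming a 32-bit little-endian ELF file (which is what a riscv32i target should produce); the function name `elf32_entry` is hypothetical:

```rust
/// Returns the entry point (e_entry) of a 32-bit little-endian ELF file,
/// or None if the bytes do not look like one.
pub fn elf32_entry(elf: &[u8]) -> Option<u32> {
    const ELF32_HEADER_SIZE: usize = 52;
    const ELFCLASS32: u8 = 1; // value of e_ident[EI_CLASS] for 32-bit objects
    const ELFDATA2LSB: u8 = 1; // value of e_ident[EI_DATA] for little-endian

    // The file must hold a full header and start with the magic 0x7f 'E' 'L' 'F'.
    if elf.len() < ELF32_HEADER_SIZE || !elf.starts_with(b"\x7fELF") {
        return None;
    }
    if elf[4] != ELFCLASS32 || elf[5] != ELFDATA2LSB {
        return None;
    }
    // In an ELF32 header, e_entry is the 4-byte field at offset 0x18.
    Some(u32::from_le_bytes([elf[24], elf[25], elf[26], elf[27]]))
}
```

A fuller loader would also walk the program headers and copy each PT_LOAD segment to its address, which would avoid the objcopy step altogether; that part is left out of this sketch.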
Now that the start address is known, there are several options for using it in your DIY emulator. The two that come to mind are:
a) pass the address as an argument to your emulator, i.e.:
my-emulator 0000000042000000
or my-emulator -s 0000000042000000
b) since you control the format of the image your emulator will load, you could adopt the convention of always prepending the start address to the binary file produced by objcopy: the emulator would first read the first 4 or 8 bytes of the file to get the start address, then read the remaining bytes.
An easy way to do so would be, for example, to use xxd and cat:
echo 0000000042000000 | xxd -r -p > final-image.bin
cat sample1.bin >> final-image.bin
Using an example file containing 'ABCD', we would get:
printf "ABCD" > sample1.bin
hexdump -C sample1.bin
00000000 41 42 43 44 |ABCD|
00000004
echo 0000000042000000 | xxd -r -p > final-image.bin
hexdump -C final-image.bin
00000000 00 00 00 00 42 00 00 00 |....B...|
00000008
cat sample1.bin >> final-image.bin
hexdump -C final-image.bin
00000000 00 00 00 00 42 00 00 00 41 42 43 44 |....B...ABCD|
0000000c
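On the emulator side, reading such an image back could look like the sketch below. It assumes the 8-byte header produced by the xxd command above, which stores the address most significant byte first (big-endian); `split_image` is a hypothetical helper name.

```rust
/// Splits an image into (start address, payload), assuming the first
/// 8 bytes hold the start address in big-endian order.
pub fn split_image(image: &[u8]) -> Option<(u64, &[u8])> {
    if image.len() < 8 {
        return None; // too short to contain the header
    }
    let (header, payload) = image.split_at(8);
    let mut raw = [0u8; 8];
    raw.copy_from_slice(header);
    Some((u64::from_be_bytes(raw), payload))
}
```

For the 12-byte final-image.bin shown above, this would yield the start address 0x42000000 and the four payload bytes of "ABCD".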
You could of course define a more complicated header, perhaps containing some other important symbols, or add more command-line options to your emulator - the basic principle would remain the same.
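For option a), the address arrives as hexadecimal text, so the emulator has to convert it itself. A small sketch, assuming the -s flag from above; `parse_start_address` and `start_address_from_args` are hypothetical helper names:

```rust
/// Parses a start address given as hexadecimal text, with or without a
/// leading "0x", e.g. "0000000042000000".
pub fn parse_start_address(text: &str) -> Option<u64> {
    let digits = text.strip_prefix("0x").unwrap_or(text);
    u64::from_str_radix(digits, 16).ok()
}

/// Looks for "-s <address>" in the argument list (program name excluded).
pub fn start_address_from_args(args: &[String]) -> Option<u64> {
    let pos = args.iter().position(|a| a == "-s")?;
    parse_start_address(args.get(pos + 1)?)
}
```

In `main`, the argument list could come from `std::env::args().skip(1).collect::<Vec<String>>()`.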
You could also put the _start() function into a specific linker section, as described here, using the link_section attribute:
Program:
#[no_mangle]
pub unsafe extern "C" fn Reset() -> ! {
    let _x = 42;
    // can't return so we go into an infinite loop here
    loop {}
}
// The reset vector, a pointer into the reset handler
#[link_section = ".vector_table.reset_vector"]
#[no_mangle]
pub static RESET_VECTOR: unsafe extern "C" fn() -> ! = Reset;
Linker script:
/* Memory layout of the LM3S6965 microcontroller */
/* 1K = 1 KiBi = 1024 bytes */
MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 256K
  RAM : ORIGIN = 0x20000000, LENGTH = 64K
}
/* The entry point is the reset handler */
ENTRY(Reset);
EXTERN(RESET_VECTOR);
SECTIONS
{
  .vector_table ORIGIN(FLASH) :
  {
    /* First entry: initial Stack Pointer value */
    LONG(ORIGIN(RAM) + LENGTH(RAM));
    /* Second entry: reset vector */
    KEEP(*(.vector_table.reset_vector));
  } > FLASH
  .text :
  {
    *(.text .text.*);
  } > FLASH
  /DISCARD/ :
  {
    *(.ARM.exidx .ARM.exidx.*);
  }
}
This way, the code for the _start() function would always be put at the beginning of the .vector_table section, which is defined as being the first in the FLASH region.
The address of _start() would therefore always be 0x00000000, or whatever address you decide the reset address will be in your CPU: you would just have to modify the address where the FLASH region starts.
The example targets an Arm Cortex-M MCU; you could replace the .vector_table section with, say, your own .startup section.
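If you kept a layout like the one in the linker script above, the image itself would tell the emulator where to start: its first word is the initial stack pointer value and its second word is the address of the reset handler. A sketch of reading both, assuming little-endian 32-bit words at the very start of the image; `read_vector_table` is a hypothetical name, and for RISC-V this would be a convention of your own emulator rather than something the RISC-V specification mandates:

```rust
/// Reads (initial stack pointer, reset address) from the first two
/// little-endian 32-bit words of an image laid out as described above.
pub fn read_vector_table(image: &[u8]) -> Option<(u32, u32)> {
    // Helper that reads one little-endian word, or None if out of bounds.
    let word = |offset: usize| -> Option<u32> {
        let b = image.get(offset..offset + 4)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    };
    Some((word(0)?, word(4)?))
}
```

The emulator would then set sp and pc from the two values before the first fetch, which would also take care of the stack pointer initialization mentioned in the question. With the memory map above, the first word would be ORIGIN(RAM) + LENGTH(RAM), i.e. 0x20010000.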
I hope I was not off-track on that one...