I am experimenting with RISC-V assembly language on an emulator (qemu64, Ubuntu for RISC-V).
Here is a simple program that converts the string instr to uppercase; outstr receives the resulting string.
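.global _start
_start:
la x5, outstr          # x5/t0 = write pointer into outstr
la x6, instr           # x6/t1 = read pointer into instr
loop:
lb x7, 0(x6)           # load the next input character
addi x6, x6, 1
li x28, 'z'
bgt x7, x28, cont      # above 'z': not a lowercase letter
li x28, 'a'
blt x7, x28, cont      # below 'a': not a lowercase letter
addi x7, x7, ('A'-'a') # convert lowercase to uppercase
cont:
sb x7, 0(x5)           # store the (possibly converted) character
addi x5, x5, 1
li x28, 0
bne x7, x28, loop      # repeat until the NUL terminator has been copied
li a0, 1               # fd 1 = stdout
la a1, outstr          # buffer to write
sub a2, x5, a1         # byte count (includes the terminating NUL)
li a7, 64              # Linux syscall 64 = write
ecall
li a0, 0               # exit status 0
li a7, 93              # Linux syscall 93 = exit
ecall
.data
instr: .asciz "String to conVErt xYz.\n"
outstr: .fill 255, 1, 0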
For now I am looking at the first two instructions, which load the address of outstr into x5/t0 and the address of instr into x6/t1.
The disassembly for these two instructions, given by GDB, is the following:
0x00000000000100e8 <+0>: addi t0,gp,-2024
0x00000000000100ec <+4>: auipc t1,0x1
0x00000000000100f0 <+8>: addi t1,t1,84 # 0x11140
So according to the first instruction, we expect t0 = (gp-2024).
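For context, la is a pseudo-instruction. The sketch below shows, as far as I understand it, the pair the assembler emits for non-PIC code before the linker touches it; the local label 1 and the minimal exit sequence are only there to make the file self-contained. When relaxation is enabled and the target lies within the signed 12-bit range of the address the linker assumes for gp, ld may replace the pair with a single gp-relative addi, which is what the disassembly shows for outstr.

```asm
        .globl _start
        .text
_start:
        # Hand-expanded equivalent of `la x5, outstr` (non-PIC code).
1:      auipc t0, %pcrel_hi(outstr)   # t0 = pc + upper 20 bits of (outstr - pc)
        addi  t0, t0, %pcrel_lo(1b)   # add the low 12 bits of the same offset

        li    a0, 0                   # exit status 0
        li    a7, 93                  # Linux syscall 93 = exit
        ecall

        .data
outstr: .fill 255, 1, 0
```

The second la in the disassembly (auipc t1,0x1 followed by addi t1,t1,84) is this unrelaxed form: 0x100ec + 0x1000 + 84 = 0x11140, the address of instr.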
Let's get the address of the outstr variable:
(gdb) info variables
All defined variables:
Non-debugging symbols:
0x0000000000011140 __DATA_BEGIN__
0x0000000000011140 instr
0x0000000000011158 outstr
0x0000000000011257 __SDATA_BEGIN__
0x0000000000011257 __bss_start
0x0000000000011257 _edata
0x0000000000011258 __BSS_END__
0x0000000000011258 _end
outstr is stored at address 0x11158.
Let's get the value of t0, which is supposed to be the address of outstr:
(gdb) info registers x5
x5 0x55555567c3ac 93824993444780
Something is wrong. What happened? Let's get the value of gp:
(gdb) info register gp
gp 0x55555567cb94 0x55555567cb94
This value is weird.
As expected, t0 = (gp-2024): 0x55555567cb94 - 2024 = 0x55555567c3ac, so the addi instruction itself computes the correct result. For t0 to equal the address of outstr (0x11158), gp would have to be 0x11158 + 2024 = 0x11940, which is __DATA_BEGIN__ + 0x800.
But t0 is not the address of outstr! Accessing outstr through the address stored in t0 therefore causes a segmentation fault (which makes sense). The issue arises because the gp register holds an unexpected value, but I don't understand why. Does anyone have a clue?
Thanks.
EDIT: adding the Makefile
OBJS = chapter5_ToUppercase.o
DEBUGFLAGS = -g
%.o : %.S
	as $(DEBUGFLAGS) $< -o $@
chapter5_ToUppercase: $(OBJS)
	ld -o chapter5_ToUppercase $(OBJS)
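SOLUTION: the issue was that the global pointer gp was never initialized. The linker relaxed la x5, outstr into a gp-relative addi, but in a program that starts at a bare _start, no startup code runs to set gp.
To solve that, I first had to edit the linker script and define the value that gp will be initialized with: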
.data :
{
__DATA_BEGIN__ = .;
PROVIDE_HIDDEN (__my_gp = . + 0x800);
*(.data .data.* .gnu.linkonce.d.*)
SORT(CONSTRUCTORS)
}
Because RISC-V I-type immediates are signed 12-bit values (range -0x800 to +0x7FF), we set the gp value to (.data + 0x800), which lets gp-relative instructions reach the first 4 KiB of .data.
At this stage we have only defined the value gp should take; gp itself is still uninitialized. To initialize it, the program has to load gp with the symbol that we defined in the linker script:
_start:
.option norelax
la gp, __my_gp
.option relax
la x5, outstr
la x6, instr
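Note that linker relaxation must be disabled (.option norelax) around the instruction that writes gp, and re-enabled afterwards (.option relax). With relaxation left on, the linker can rewrite la gp, __my_gp into a gp-relative addi, which computes the new gp from the old, still uninitialized one. It took me an hour to figure out why the la gp, __my_gp instruction was not working...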
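For comparison, here is a minimal sketch of the idiom commonly seen in RISC-V startup code. It assumes that the default GNU ld script already defines the symbol __global_pointer$ (I believe it does, but check your toolchain's script with ld --verbose); if so, no linker script edit should be needed. The .option push / .option pop pair restores whatever relaxation setting was active before, rather than unconditionally turning relaxation back on.

```asm
        .globl _start
        .text
_start:
        .option push
        .option norelax                 # keep the linker from rewriting the next la
        la      gp, __global_pointer$   # assumption: provided by the default ld script
        .option pop                     # restore the previous relaxation setting

        la      t0, outstr              # may now safely be relaxed to gp-relative form

        li      a0, 0                   # exit status 0
        li      a7, 93                  # Linux syscall 93 = exit
        ecall

        .data
outstr: .fill 255, 1, 0
```

An alternative I have not verified on this setup is to link with ld --no-relax, which should keep every la as an auipc/addi pair and make the program independent of gp altogether.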
Thank you everyone for your help.