[bits 32]
global _start
section .data
str_hello db "HelloWorld", 0xa
str_hello_length db $-str_hello
section .text
_start:
mov ebx, 1 ; stdout file descriptor
mov ecx, str_hello ; pointer to string of characters that will be displayed
mov edx, [str_hello_length] ; count outputs Relative addressing
mov eax, 4 ; sys_write
int 0x80 ; linux kernel system call
mov ebx, 0 ; exit status zero
mov eax, 1 ; sys_exit
int 0x80 ; linux kernel system call
The fundamental thing here is that I need to have the length of the hello string to pass to linux's sys_write system call. Now, I'm well aware that I can just use EQU and it'll work fine, but I'm really trying to understand what's going on here.
So, basically when I use EQU it loads the value and that's fine.
str_hello_length equ $-str_hello
...
...
mov edx, str_hello_length
However, if I use this line with DB
str_hello_length db $-str_hello
...
...
mov edx, [str_hello_length] ; of course, without the brackets it'll load the address, which I don't want. I want the value stored at that address
instead of loading the value at that address like I expect it to, the assembler outputs RIP-Relative Addressing, as shown in the gdb debugger and I'm simply just wondering why.
mov 0x6000e5(%rip),%edx # 0xa001a5
Now, I've tried using the eax register instead(and then moving eax to edx), but then I get a different problem. I end up getting a segmentation fault as noted in gdb:
movabs 0x4b8c289006000e5,%eax
so apparently, different registers produce different code. I guess I need to truncate the upper 32-bits somehow , but I don't know how to do that.
Though did kind of found a 'solution' and it goes like this: load eax with str_hello_length's address and then load the contents of address that eax points to and everything is hunky dory.
mov eax, str_hello_length
mov edx, [eax] ; count
; gdb disassembly
mov $0x6000e5,%eax
mov (%rax),%edx
apparently trying to indirectly load a value from a mem address produces different code? I don't really know.
I just need help in understanding the syntax and operations of these instructions, so I can better understand why how to load effective addresses. Yeah, I guess I could've just switched to EQU and be on my merry way, but I really feel I can't go on until I understand what's going on with the DB declaration and loading from it's address.
The answer is it isn't. x86-64 doesn't have RIP-relative addressing in 32-bit emulation mode (this should be obvious because RIP doesn't exist in 32-bit). What's happening is that nasm is compiling you some lovely 32-bit opcodes that you're trying to run as 64-bit. GDB is disassembling your 32-bit opcodes as 64-bit, and telling you that in 64-bit, those bytes mean a RIP-relative mov. 64-bit and 32-bit opcodes on the x86-64 overlap a lot to make use of common decoding logic in the silicon, and you're getting confused because the code that GDB is disassembling looks similar to the 32-bit code you wrote, but in reality you're just throwing garbage bytes at the processor.
This isn't anything to do with nasm. You're using the wrong architecture for the process you're in. Either use 32-bit nasm in a 32-bit process or compile your assembly code for [BITS 64].