assembly x86 gnu-assembler intel-syntax

Why is offset flat:label needed sometimes and not other times?


I'm learning assembly with the GNU Assembler in Intel syntax, and there is one thing that I don't really understand. Take this code, for example:

.intel_syntax noprefix
.data
string: .asciz "hello world"

.text
.global entry
.type entry, @function

entry:
    mov byte ptr[string + 4], 'a'
    mov eax, offset flat:string
    ret

I get the idea of using offset flat: since we are writing relocatable code. But why don't we also write offset flat:string on this line: mov byte ptr[string + 4], 'a', as we do here: mov eax, offset flat:string?

I'm really confused. If mov byte ptr[string + 4], 'a' works with the address of the string label + 4, then why isn't mov eax, string the same?

Edit :

To clarify, after calling entry, I use printf to print what's in EAX, as follows:

#include <stdio.h>

extern char *entry(void);

int main(int argc, char*argv[])
{
    printf("%s", entry());

}

Solution

  • You always need OFFSET when you want a symbol address as an immediate, like AT&T syntax $string instead of string. You never need it any other time.


    It comes down to the fact that in GAS Intel syntax a bare symbol such as string is a memory operand even without [] (just as in AT&T syntax, where this store is written movb $'a', string+4), so extra syntax is needed to ask for the address instead of the memory at that address.

    When using string as part of [string + 4], you're not asking for the offset; you're accessing memory at that label/symbol address, using the symbol as part of an addressing mode.
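
    As a rough analogy only, the same distinction can be sketched in C. This is not what the assembler produces; the function names entry_c and first_byte are hypothetical and exist purely for illustration, and the comments map each statement to the instruction it loosely corresponds to.

```c
/* Same data as the .data section in the question. */
static char string[] = "hello world";

/* Hypothetical C counterpart of the assembly `entry` routine. */
char *entry_c(void)
{
    /* Memory operand: store one byte at address string + 4.
       Loosely: mov byte ptr [string + 4], 'a' */
    string[4] = 'a';

    /* Address as a value: nothing is read from memory here.
       Loosely: mov eax, offset flat:string */
    char *address = string;

    return address;
}

/* Loosely what `mov eax, string` (no OFFSET) asks for: a load from the
   memory at `string`. The assembly would load 4 bytes into EAX; this
   sketch reads a single byte to keep the C simple. */
char first_byte(void)
{
    return string[0];
}
```

    Passing the result of entry_c() to printf("%s", ...) as in the question's main would print the modified string, whereas the value from first_byte() is the string's contents, not a pointer to it.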

    If you'd rather use a better-designed syntax where mov eax, string+4 does give you the address (without dereferencing it), use NASM.

    Otherwise see Confusing brackets in MASM32 (GAS's Intel syntax is like MASM in most ways, except that mov eax, [12] is a load from that absolute address, not MASM's insanity of having that be equivalent to mov eax, 12).

    Somewhat related: Distinguishing memory from constant in GNU as .intel_syntax, which covers how GAS parses constants. That one is more about whether .equ foo, 4 / foo = 4 appears before or after the instruction referencing it, when you write mov eax, foo instead of something unambiguous like mov eax, [foo] or mov eax, OFFSET foo.

    Also: