Tags: c, gcc, x86, gnu-assembler, tdm-mingw

GCC's assembly output of an empty program on x86, win32


I write empty programs to annoy the hell out of Stack Overflow coders. NOT. I am just exploring the GNU toolchain.

Now the following might be too deep for me, but to continue the empty program saga I have started to examine the output of the C compiler, the stuff GNU as consumes.

gcc version 4.4.0 (TDM-1 mingw32)

test.c:

int main()
{
    return 0;
}

gcc -S test.c

    .file   "test.c"
    .def    ___main;    .scl    2;  .type   32; .endef
    .text
.globl _main
    .def    _main;  .scl    2;  .type   32; .endef
_main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    call    ___main
    movl    $0, %eax
    leave
    ret 

Can you explain what happens here? Here is my effort to understand it. I have used the as manual and my minimal x86 ASM knowledge:

Thank you for your help!


Solution

  • .file "test.c"

    Commands starting with . are directives to the assembler. This one just records that the source file is "test.c"; that information can be exported to the debugging information of the exe.

    .def ___main; .scl 2; .type 32; .endef

    The .def directive defines a debugging symbol, and .endef closes the definition. .scl 2 means storage class 2 (the external storage class), and .type 32 says this symbol is a function. These numbers are defined by the PE/COFF executable format.
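To make the two numbers less magic, here is a minimal sketch of how they decode. The macro names follow the ones traditionally used in COFF headers; the two helper functions are hypothetical names made up for this illustration, not part of any toolchain API.

```c
#include <stdint.h>

/* Values from the COFF symbol table layout; names as in traditional COFF headers. */
#define C_EXT     2      /* storage class: external symbol (.scl 2) */
#define N_BTMASK  0x0F   /* low 4 bits: base type */
#define N_TMASK   0x30   /* next 2 bits: first derived type */
#define N_BTSHFT  4      /* derived type starts above the base type bits */
#define DT_FCN    2      /* derived type: function */

/* Hypothetical helper: returns 1 if a .type value describes a function. */
int coff_type_is_function(uint16_t type)
{
    return (type & N_TMASK) == (DT_FCN << N_BTSHFT);
}

/* Hypothetical helper: returns 1 if a .scl value marks an external symbol. */
int coff_class_is_external(uint8_t scl)
{
    return scl == C_EXT;
}
```

So 32 is simply DT_FCN (2) shifted left by four bits, with a base type of 0.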

    ___main is a function that takes care of the bootstrapping gcc needs (it does things like run C++ static initializers and other housekeeping).

    .text
    

    Begins a text section - code lives here.

    .globl _main

    Defines the _main symbol as global, which makes it visible to the linker and to other modules that are linked in.

    .def        _main;  .scl    2;      .type   32;     .endef
    

    Same thing as for ___main above: it creates a debugging symbol stating that _main is a function. This can be used by debuggers.

    _main:

    Starts a new label (it'll end up as an address). The .globl directive above makes this address visible to other entities.

    pushl       %ebp
    

    Saves the old frame pointer (the ebp register) on the stack, so it can be put back in place when this function ends.

    movl        %esp, %ebp
    

    Copies the stack pointer into the ebp register. ebp is often called the frame pointer: it marks the base of the stack values belonging to the current "frame" (usually a function), and local variables are addressed relative to it. Referring to variables on the stack via ebp can also help debuggers.

    andl $-16, %esp

    ANDs the stack pointer with 0xfffffff0, which effectively rounds it down to a 16-byte boundary. Accesses to aligned values on the stack are generally faster than unaligned ones, and some SSE instructions require 16-byte-aligned operands. Together with the two preceding instructions, this is pretty much a standard function prologue.
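The same masking trick can be written in C. This is just a sketch; align_down_16 is a name made up for the illustration and the address in the note below is an arbitrary example value.

```c
#include <stdint.h>

/* Round an address down to a multiple of 16, as `andl $-16, %esp` does.
   (uint32_t)-16 is 0xFFFFFFF0, so the AND clears the low four bits. */
uint32_t align_down_16(uint32_t addr)
{
    return addr & (uint32_t)-16;
}
```

For example, an address such as 0x0022FF4C would become 0x0022FF40, while an address that is already a multiple of 16 is left unchanged.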

    call        ___main
    

    Calls the ___main function, which does the initialization that gcc needs. call pushes the return address (the address of the instruction following the call) on the stack and jumps to the address of ___main.
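One way to see this kind of pre-main initialization from plain C is GCC's constructor attribute, a GCC extension. A minimal sketch follows; early_init and was_initialized are hypothetical names. That ___main is the hook which runs such constructors is an assumption about this particular MinGW setup; other targets (for example ELF-based ones) use different mechanisms to get the same effect.

```c
static int initialized = 0;

/* GCC extension: run this function before the body of main() executes. */
__attribute__((constructor))
static void early_init(void)
{
    initialized = 1;
}

/* Reports whether early_init() has already run. */
int was_initialized(void)
{
    return initialized;
}
```

Calling was_initialized() from main should report that early_init has already run, even though main never calls it.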

    movl        $0, %eax
    

    Moves 0 into the eax register (the 0 in return 0;). The eax register holds integer function return values in the cdecl calling convention, which is what main uses here (stdcall returns values in eax as well).

    leave

    The leave instruction is pretty much shorthand for

    movl     %ebp, %esp
    popl     %ebp
    

    i.e. it undoes the work done at the start of the function, restoring the frame pointer and the stack pointer to their former state.
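To check that leave really mirrors the prologue, here is a toy model in C. It is only a sketch: the struct, the function names and the tiny 64-word stack are all invented for this illustration and have nothing to do with how the hardware or gcc represent things.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model: two registers and a small stack of 32-bit words.
   esp and ebp are byte offsets into the stack, which grows downward. */
struct toy_cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t stack[STACK_WORDS];
};

/* pushl: decrement esp by 4, then store the value there. */
static void push(struct toy_cpu *c, uint32_t value)
{
    c->esp -= 4;
    c->stack[c->esp / 4] = value;
}

/* popl: load the value at esp, then increment esp by 4. */
static uint32_t pop(struct toy_cpu *c)
{
    uint32_t value = c->stack[c->esp / 4];
    c->esp += 4;
    return value;
}

/* pushl %ebp; movl %esp, %ebp; andl $-16, %esp */
void prologue(struct toy_cpu *c)
{
    push(c, c->ebp);
    c->ebp = c->esp;
    c->esp &= (uint32_t)-16;
}

/* leave, i.e. movl %ebp, %esp; popl %ebp */
void leave(struct toy_cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}
```

Starting from esp = 200 and ebp = 240, prologue leaves ebp = 196 and esp = 192, and leave brings both back to 200 and 240. Note that leave restores esp from ebp, so it does not matter how far the andl moved the stack pointer.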

    ret

    Returns to whoever called this function. It'll pop the instruction pointer from the stack (which a corresponding call instruction will have placed there) and jump there.