
How is this string in an array represented in assembly when a C program is compiled using the gcc -S option?


This is a C program that has been compiled to assembly using gcc -S. How is the string "Hello, world" represented in the assembly output?

This is the C code:

    #include <stdio.h>

    int main(void) {

        char st[] = "Hello, world";
        printf("%s\n", st);

        return 0;
    }

Here's the assembly code:

            .intel_syntax noprefix
            .text
            .globl  main

    main:
            push    rbp
            mov     rbp, rsp
            sub     rsp, 32
            mov     rax, QWORD PTR fs:40
            mov     QWORD PTR [rbp-8], rax
            xor     eax, eax
            movabs  rax, 8583909746840200520
            mov     QWORD PTR [rbp-32], rax
            mov     DWORD PTR [rbp-24], 1684828783
            mov     BYTE PTR [rbp-20], 0
            lea     rax, [rbp-32]
            mov     rdi, rax
            call    puts
            mov     eax, 0
            mov     rdx, QWORD PTR [rbp-8]
            xor     rdx, QWORD PTR fs:40
            je      .L3
            call    __stack_chk_fail
    .L3:
            leave
            ret

Solution

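  • You are using a local buffer in function main, initialized from a string literal. The compiler implements this initialization by storing the 13 bytes of the array (12 characters plus the terminating null byte) starting at [rbp-32] with 3 mov instructions. The first 8 bytes go through rax, because x86-64 has no mov that stores a 64-bit immediate directly to memory. The next 4 bytes are stored as an immediate, since the value fits in 32 bits. The third mov stores a single byte.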

    8583909746840200520 in decimal is 0x77202c6f6c6c6548 in hex, corresponding to the bytes "Hello, w" in little-endian order; 1684828783 is 0x646c726f, the bytes "orld". The third mov sets the final '\0' byte. Hence the buffer contains "Hello, world".

    This string is then passed to puts for output to stdout.

    Note that gcc optimized the call printf("%s\n", st); to puts(st). By the way, clang performs the same optimization.