
What does the null character do in a `db` directive?


I'm new to assembly programming and have a question about the null character in the .data section.

I tested a few variants:

Code 1:

section .data
        out: db "%s",10,0
        mes1: db "a",0
        mes2: db "b",0
section .text
        extern printf
        global main
main:
        push rbp
        mov rdi,out
        mov rsi,mes1
        mov rax,0
        call printf
        mov rdi,out
        mov rsi,mes2
        mov rax,0
        call printf
        pop rbp
        mov rax,0
        ret

Output is:

a
b

Code 2: changed the .data section to:

section .data
        out: db "%s",10 ; no 0
        mes1: db "a",0
        mes2: db "b",0

Output is:

a
ab
a

Code 3: changed the .data section to:

section .data
        out: db "%s",10,0
        mes1: db "a"
        mes2: db "b"

Output is:

ab
b

So what does the null character do?

I tried to debug it in pwndbg but I didn't get anything interesting.


Solution

  • So what does the null character do?

    It tells the consumer of the string (e.g. printf) where the string ends. That alone does not explain the different results you got, though. The second thing to consider is how the strings out, mes1, and mes2 are laid out in memory. They are stored contiguously, in the order you declared them, with nothing in between. In your runs the memory right after the last item evidently contained at least one null byte (section padding or neighboring zeroed data), but nothing guarantees that, so do not rely on it.
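
    To make "where the string ends" concrete, here is a minimal C sketch of how a string consumer finds the end: it walks forward from the start address until it hits a zero byte. The array data_code1 and the helper bytes_before_null are names made up for this illustration; the array mirrors the bytes of Code 1, and its last zero is an assumption standing in for whatever followed mes2 in memory.

    ```c
    #include <stddef.h>

    /* Code 1's bytes written out as one array. The final 0 stands in for
       whatever followed mes2 in memory (an assumption for this sketch). */
    static const unsigned char data_code1[] = {
        '%', 's', 10, 0,   /* out  */
        'a', 0,            /* mes1 */
        'b', 0,            /* mes2 */
        0
    };

    /* Hypothetical helper: counts the bytes before the first zero byte,
       which is how a C-string consumer finds the end of a string. */
    size_t bytes_before_null(const unsigned char *p)
    {
        size_t n = 0;
        while (p[n] != 0) {
            n++;
        }
        return n;
    }
    ```

    Called with data_code1 (the out label) it returns 3; called with data_code1 + 4 or data_code1 + 6 (mes1 and mes2) it returns 1. The label only marks where the scan starts, the zero byte decides where it stops.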

    Code 1:

    out:  db "%s",10,0
    mes1: db "a",0
    mes2: db "b",0
                                     null that happens to follow the data (padding, not guaranteed)
                                     v
    "%", "s", 10, 0, "a", 0, "b", 0, 0, ...
    <---- out ---->
                     <mes1>
                             <mes2>
    

    Every string carries its own terminator, so each call prints exactly one letter followed by a newline.
    Output:
    
    a
    b
    

    Code 2:

    out:  db "%s",10
    mes1: db "a",0
    mes2: db "b",0
                                  null that happens to follow the data (padding, not guaranteed)
                                  v
    "%", "s", 10, "a", 0, "b", 0, 0, ...
    <------- out ------>
                  <mes1>
                          <mes2>
    

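    Without a terminator of its own, the format string runs on through mes1 and ends at mes1's null byte, so printf effectively receives "%s", 10, "a". Every call now prints its argument, a newline, and then a literal 'a', with no newline after it.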
    Output:

    a
    ab
    a
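    

    If you want to check this without an assembler, the same byte layout can be replayed in C. This is only a sketch: data_code2 and replay_code2 are made-up names, the trailing zero in the array is an assumption standing in for whatever followed mes2 in memory, and snprintf is used in place of printf so the result lands in a buffer.

    ```c
    #include <stdio.h>
    #include <stddef.h>

    /* Code 2's bytes as one array. The final 0 stands in for whatever
       followed mes2 in memory (an assumption for this sketch). */
    static const char data_code2[] = {
        '%', 's', 10,      /* out, no terminator */
        'a', 0,            /* mes1 */
        'b', 0,            /* mes2 */
        0
    };

    /* Hypothetical helper: replays the two printf calls into buf.
       Returns the number of characters written, or -1 if buf is too small. */
    int replay_code2(char *buf, size_t size)
    {
        const char *out  = data_code2;      /* label out  */
        const char *mes1 = data_code2 + 3;  /* label mes1 */
        const char *mes2 = data_code2 + 5;  /* label mes2 */

        int first = snprintf(buf, size, out, mes1);
        if (first < 0 || (size_t)first >= size) {
            return -1;
        }
        int second = snprintf(buf + first, size - (size_t)first, out, mes2);
        if (second < 0 || (size_t)second >= size - (size_t)first) {
            return -1;
        }
        return first + second;
    }
    ```

    With a 32-byte buffer the helper returns 6 and the buffer holds "a\nab\na", which is the three-line output from the question.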
    

    Code 3:

    out:  db "%s",10,0
    mes1: db "a"
    mes2: db "b"
                               null that happens to follow the data (padding, not guaranteed)
                               v
    "%", "s", 10, 0, "a", "b", 0, ...
    <---- out ---->
                     <-- mes1 ->
                          <mes2>
    

    mes1 has no terminator of its own, so it runs on through mes2 and becomes one character longer ("ab"). Both messages end at the same null byte, the one that happened to follow the data in memory; nothing guarantees that byte is there.
    Output:

    ab
    b
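
    The overlap of the two messages can also be checked in C. Again just a sketch with made-up names (data_code3, code3_labels_share_terminator), and the single zero at the end of the array is an assumption standing in for the byte that followed "b" in memory.

    ```c
    #include <string.h>

    /* Code 3's bytes as one array. The final 0 stands in for whatever
       followed mes2 in memory (an assumption for this sketch). */
    static const char data_code3[] = {
        '%', 's', 10, 0,   /* out */
        'a',               /* mes1, no terminator */
        'b',               /* mes2, no terminator */
        0
    };

    /* Hypothetical helper: returns 1 when the strings starting at mes1 and
       mes2 end at the same zero byte, 0 otherwise. */
    int code3_labels_share_terminator(void)
    {
        const char *mes1 = data_code3 + 4;  /* label mes1 */
        const char *mes2 = data_code3 + 5;  /* label mes2 */
        return mes1 + strlen(mes1) == mes2 + strlen(mes2);
    }
    ```

    Here strlen reports 2 for the string at mes1 and 1 for the one at mes2, and the helper returns 1: mes2 is simply the tail of mes1.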