How and why does my program change its input buffer? Using GDB to find out where. (Converting string to int in NASM x86 32bit)

%macro mov_dd 2
    push eax
    push ebx
    mov dword eax, [%1]
    mov ebx, [eax]
    mov dword [%2], ebx
    pop ebx
    pop eax
%endmacro

section .data
    text db "Enter first Number: "
    len equ $-text

section .bss
    input resb 4
    ten_exp_num resb 4
    input_ptr resb 4



section .text
    global _start

_start:
    mov eax, 4
    mov ebx, 1
    mov ecx, text
    mov edx, len
    int 0x80            ; print(text)

    mov eax, 3
    xor ebx, ebx
    mov ecx, input
    mov edx, 16
    int 0x80            ; input = input()

    xor edi, edi
    xor eax, eax

    mov dword [input_ptr], input

    push ebp
    mov ebp, esp

    mov byte [ten_exp_num], 1

    call convert
    pop ebp

    mov [input], eax
    mov ecx, input
    mov eax, 4
    mov ebx, 1
    mov edx, 1
    int 0x80

    jmp end

convert:
    mov_dd input_ptr, input
    mov bl, [input]

    cmp bl, 10          ; 10 ≙ '\n'
    je return_func

    call to_int

    inc dword [input_ptr]

    jmp convert

to_int:
    sub bl, "0"
    
    movzx edi, bl

    imul edi, [ten_exp_num]

    add eax, edi
    call ten_expo
    ret

ten_expo:
    push eax
    
    mov eax, [ten_exp_num]
    imul eax, 10
    mov [ten_exp_num], eax

    pop eax
    ret

return_func:
    ret

end:
    mov eax, 1
    xor ebx, ebx
    int 0x80            ; return 0

I'm really new to assembly programming and am currently trying to write a calculator to get a basic understanding of it.
When I'm debugging with gdb and use e.g. 12\n as input, everything works fine until my convert loop reaches the third char (after the macro), which should be \n but is actually just 0x00. I have no clue why that happens. I already tried some things, like changing

inc dword [input_ptr]

to
inc byte [input_ptr]

but it didn't seem to help.

Can someone tell me why that happens and how to fix it?
(I know my code would convert the digits in the wrong order, but I don't want to fix that before this works.)

Solution

I ran your code under GDB and set a watchpoint (watch (char[4])input) to detect which instructions change your input: resb 4 buffer, using the same input you did: 12 followed by Enter.

First the read system call writes it, as expected. But then mov DWORD PTR ds:0x804a014,ebx changes it from "12\n\0" to "2\n\0\n". That's the mov dword [%2], ebx in the macro expansion of mov_dd input_ptr, input.

IDK what the point of mov_dd input_ptr, input was supposed to be, but there's the culprit. I think you're loading the 4 bytes pointed to by the current input pointer and storing them back into the input buffer. (So once the pointer has advanced, you read past the end of your input buffer, into the bytes of ten_exp_num that follow it in the BSS, and copy those bytes into input.)

Update: you're always reading from the start of input with mov bl, [input], and were trying to shift the whole input over instead of just reading the byte pointed-to by input_ptr.

If you keep stepping, you'll see that instead of shifting by 1 each time, you're shifting by an increasing amount as input_ptr gets farther into the buffer, farther away from the read position at [input + 0]. The second shift starts with

2 \n \0 \n   [bytes of ten_exp_num resb 4  where the 0xa = 1*10 came from]
     ^
     |
   input_ptr (after having been incremented twice)

Loading 4 bytes from there, the first byte is 0, which you then store over input. One way to fix this would be to only load a byte and copy it to the first byte of input, but then you might as well just have used that byte you loaded.

Or to keep this extremely clunky byte-shifting logic, keep input_ptr = input+1 the whole time so that, after loading the first byte, you do 4-byte loads and stores that overlap by 3 bytes.
The sane way to shift 4 bytes in memory would be shr dword [input], 8 or ror dword [input], 8, which avoids loading from past the end of the buffer. But neither of these scales easily to input buffers longer than 4 bytes, and a full 32-bit number needs more than that: UINT32_MAX is 4294967295, which is 10 digits, so 11 bytes including a newline. That assumes the user submits input by pressing Enter instead of Ctrl-D on the terminal, or redirecting from a file or a pipe. (This is just a toy program, so it's fine to assume the input is only digits followed by a newline, and to ignore the return value of the read system call.)
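To illustrate the shr variant in isolation, here is a minimal sketch. It assumes a 32-bit Linux target and a hard-coded 4-byte buffer standing in for what read stored; the build command in the comment is the usual nasm/ld invocation, not something taken from the question.

```nasm
; Assumed build: nasm -felf32 shift.asm && ld -m elf_i386 -o shift shift.o
section .data
    input db "12", 10, 0        ; stand-in for the buffer after read: '1' '2' '\n' 0

section .text
    global _start

_start:
    ; x86 is little-endian, so a logical right shift of the dword by 8 bits
    ; moves every byte one address down and shifts a 0 into the last byte.
    ; Only the 4 bytes of input are touched; nothing past the buffer is read.
    shr   dword [input], 8      ; "12\n\0" becomes "2\n\0\0"

    movzx ebx, byte [input]     ; first byte is now '2' (ASCII 50)
    mov   eax, 1                ; sys_exit, exit status = that byte
    int   0x80
```

Running it and then echo $? should print 50, the ASCII code of '2'.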

You're already incrementing a pointer, so just use movzx edx, byte [esi] or something similar to load the byte pointed to by that pointer.
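A minimal sketch of that idea, assuming a 32-bit Linux target. The buffer is hard-coded to stand in for what read stored, and the choice of esi and ecx and the label names are just for illustration. The point is that the loop only ever loads through the pointer and never writes to input, so the third byte it sees really is the newline.

```nasm
; Assumed build: nasm -felf32 walk.asm && ld -m elf_i386 -o walk walk.o
section .data
    input db "12", 10, 0        ; stand-in for the buffer after read: "12\n"

section .text
    global _start

_start:
    mov   esi, input            ; esi = read pointer, kept in a register
    xor   ecx, ecx              ; ecx = number of bytes seen before the newline
.next:
    movzx edx, byte [esi]       ; load only the byte esi points at
    cmp   edx, 10               ; '\n' ends the input
    je    .done
    test  edx, edx              ; also stop on a 0 byte so we never run off the end
    jz    .done
    inc   ecx
    inc   esi                   ; advance the pointer; input itself is never modified
    jmp   .next
.done:
    mov   ebx, ecx              ; exit status = count
    mov   eax, 1                ; sys_exit
    int   0x80
```

Running it and then echo $? should print 2: the loop sees '1', then '2', then stops at '\n'.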

Normally you'd want to keep pointers and integers in registers, only using memory for arbitrary-length stuff like strings. For example, "Convert string to int. x86 32 bit Assembler using Nasm" uses the usual total = total*10 + digit-'0' algorithm, stopping on the first byte that isn't in the '0' .. '9' range. See also "NASM Assembly convert input to integer?" for more explanation of that algorithm and a more efficient check for being a digit. (Since you already want to sub digit, '0' for later use, do that first and check whether the result is in the unsigned range 0..9, which only takes one cmp/jna.)
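Here is a rough sketch of that total = total*10 + digit-'0' loop, with the pointer and the total both kept in registers. It is not the code from either linked answer. It assumes a 32-bit Linux target, a 16-byte buffer (my choice, big enough for 10 digits plus a newline), and it returns the result as the exit status only so it is easy to check from the shell. It does not check for overflow or for errors from read.

```nasm
; Assumed build: nasm -felf32 atoi.asm && ld -m elf_i386 -o atoi atoi.o
section .bss
    buf resb 16                 ; room for 10 digits + newline; BSS starts zeroed

section .text
    global _start

_start:
    mov   eax, 3                ; sys_read
    xor   ebx, ebx              ; fd 0 = stdin
    mov   ecx, buf
    mov   edx, 15               ; at most 15 bytes, so the last byte of buf stays 0
    int   0x80

    mov   esi, buf              ; esi = read pointer
    xor   eax, eax              ; eax = total
.next_digit:
    movzx edx, byte [esi]       ; load one byte, zero-extended
    sub   edx, '0'              ; '0'..'9' becomes 0..9; anything else is > 9 as unsigned
    cmp   edx, 9
    ja    .done                 ; stop on newline, 0, or any other non-digit
    lea   eax, [eax + eax*4]    ; total *= 5
    lea   eax, [edx + eax*2]    ; total = total*10 + digit
    inc   esi
    jmp   .next_digit
.done:
    mov   ebx, eax              ; exit status = low 8 bits of the result
    mov   eax, 1                ; sys_exit
    int   0x80
```

For example, printf '12\n' | ./atoi followed by echo $? should print 12. Because the most-significant digit is processed first, there is no separate power-of-ten variable to maintain and the digits come out in the right order.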

Pointers are 32-bit, so you definitely want dword operand-size for the increment, whether that's on a pointer in memory or, much more simply, inc esi on a pointer in a register. byte operand-size would only operate on the low 8 bits, so it would wrap around within the low byte without propagating carry into the high bits of the pointer. You'd loop over a 256-byte-aligned chunk of memory, e.g. going from 0x4000ff to 0x400000 instead of 0x400100 for a normal pointer increment. For this program it's equivalent if input: resb 4 is aligned by 4, which it should be at the start of the BSS, but it's wrong in general. (And slower: a store-forwarding stall when you reload a dword after a byte store.)
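The wrap-around is easy to see on two plain dwords that merely hold the example value 0x4000ff. This is a small sketch assuming a 32-bit Linux target; ptr_b and ptr_d are made-up names and nothing is ever dereferenced.

```nasm
; Assumed build: nasm -felf32 incptr.asm && ld -m elf_i386 -o incptr incptr.o
section .data
    ptr_b dd 0x4000ff           ; will be incremented with byte operand-size
    ptr_d dd 0x4000ff           ; will be incremented with dword operand-size

section .text
    global _start

_start:
    inc   byte [ptr_b]          ; low byte 0xff -> 0x00, carry is lost: 0x400000
    inc   dword [ptr_d]         ; carry propagates into the next byte: 0x400100

    mov   eax, [ptr_d]
    sub   eax, [ptr_b]          ; 0x400100 - 0x400000 = 0x100
    shr   eax, 8                ; scale down to fit an exit status: 1
    mov   ebx, eax
    mov   eax, 1                ; sys_exit
    int   0x80
```

Running it and then echo $? should print 1, showing that the two "pointers" ended up 0x100 apart after one increment each.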