[SOLVED] Writing a putchar in Assembly for x86

Writing a putchar in Assembly for x86_64 with 64 bit Linux?

I am trying to use the write syscall to reproduce the behavior of putchar, which prints a single character. My code is as follows:

asm_putchar:
  push    rbp
  mov     rbp, rsp

  mov     r8, rdi

call:
  mov     rax, 1
  mov     rdi, 1
  mov     rsi, r8
  mov     rdx, 1
  syscall

return:
  mov     rsp, rbp
  pop     rbp
  ret

Solution

From man 2 write, you can see that the signature of write is:

ssize_t write(int fd, const void *buf, size_t count);

It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
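
For comparison, the same idea at the C level might look like the sketch below: the character is stored into a one-byte object in memory, and write gets its address. The name my_putchar is hypothetical, and its return convention (the character on success, -1 on failure) only mimics putchar's.

```c
#include <unistd.h>   /* write, ssize_t, STDOUT_FILENO */

/* Hypothetical helper mirroring putchar(3): write(2) needs an address,
 * so the character is first stored in a one-byte object in memory. */
int my_putchar(int c)
{
    unsigned char byte = (unsigned char)c;        /* 1-byte buffer on the stack */
    ssize_t n = write(STDOUT_FILENO, &byte, 1);   /* pass a pointer, not the value */
    return (n == 1) ? (int)byte : -1;             /* -1 plays the role of EOF */
}
```

Calling my_putchar('A') should print A, assuming stdout is open and writable. The assembly version has to do the same two steps by hand: a store to memory, then a pointer in RSI.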

(Don't write one char at a time unless you only have one to print. That is really inefficient, which is why C stdio normally buffers I/O. Construct a buffer in memory and print that instead; for an x86-64 Linux NASM function that does this, see "How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)".)
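
A minimal sketch of that buffering idea in C, assuming a small fixed-size buffer and hypothetical helper names (buf_putc, buf_flush); real stdio is more elaborate, and this version does not retry short writes:

```c
#include <stddef.h>   /* size_t */
#include <unistd.h>   /* write, ssize_t, STDOUT_FILENO */

/* Hypothetical fixed-size output buffer: characters accumulate in memory
 * and reach the kernel in a single write(2) call. */
static char   out_buf[256];
static size_t out_len = 0;

/* Flush pending bytes to stdout. Returns the number of bytes written,
 * 0 if the buffer was empty, or -1 on error. */
ssize_t buf_flush(void)
{
    if (out_len == 0)
        return 0;
    ssize_t n = write(STDOUT_FILENO, out_buf, out_len);
    out_len = 0;   /* sketch: pending bytes are dropped even on a short write */
    return n;
}

/* Append one character, flushing first if the buffer is already full. */
void buf_putc(char c)
{
    if (out_len == sizeof out_buf)
        buf_flush();
    out_buf[out_len++] = c;
}
```

Three calls to buf_putc followed by one buf_flush cost one system call instead of three, which is the whole point of building the string in memory first.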

Here is a NASM version of the code from "GCC: putchar(char) in inline assembly", tweaked a bit for code size and efficiency:

; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX,  RCX, R11 (last 2 by syscall itself)
; returns:  RAX = write return value: 1 for success, -1..-4095 for error
writechar:
    lea     rsi, [rsp-4]          ; RSI = buf in the red zone (below RSP)
    mov     [rsi], edi            ; 4-byte store of EDI: the char is the low byte at [rsi]

    mov     eax, 1                ; __NR_write syscall number from unistd_64.h
    mov     edi, 1                ; EDI = fd=1 = stdout
    ; RSI = buf set earlier, before overwriting the char in EDI
    mov     edx, eax              ; RDX = len = 1  happens to be the same as fd and call #
    syscall                       ; rax = write(1, buf, 1)
    ret
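

The return convention in the header comment above (1 for success, -1..-4095 for error) can be checked with a single unsigned compare. A minimal sketch in C, treating the raw RAX value as a long; both helper names are hypothetical:

```c
/* Raw Linux syscall return values in [-4095, -1] encode -errno;
 * any other value is a successful result. */
int raw_is_error(long rax)
{
    /* As unsigned, the error range is the top 4095 values. */
    return (unsigned long)rax >= (unsigned long)-4095L;
}

/* Positive errno value for a raw error return, or 0 if the call succeeded. */
int raw_errno(long rax)
{
    return raw_is_error(rax) ? (int)-rax : 0;
}
```

In assembly the equivalent test is a cmp rax, -4095 followed by an unsigned branch such as jae.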

We actually only need a 1-byte store, like mov [rsp-1], dil, but a 4-byte store saves a byte of code size. And int putchar(int) means the caller should have written the full register, so we aren't going to get a partial-register stall even on old CPUs.

If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns an error code when a system call is given a bad pointer, instead of delivering a SIGSEGV as it would if you dereferenced the pointer in user space.)
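
This is easy to observe from C, where the libc wrapper turns the raw -EFAULT into a return value of -1 with errno set to EFAULT. The sketch below uses a hypothetical helper, errno_of_write, and writes into a pipe rather than stdout. That choice is an assumption made for robustness: the target has to actually copy the buffer, and a device such as /dev/null may accept the call without ever reading the pointer.

```c
#include <errno.h>    /* errno, EFAULT */
#include <unistd.h>   /* pipe, write, close */

/* Hypothetical helper: write one byte from `buf` into a fresh pipe.
 * Returns 0 on success, or the errno value if a call failed. */
int errno_of_write(const void *buf)
{
    int fds[2];
    if (pipe(fds) != 0)
        return errno;

    int err = 0;
    if (write(fds[1], buf, 1) < 0)   /* a bad pointer fails the call; no signal is raised */
        err = errno;                 /* save errno before close() can change it */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Passing a valid buffer such as "x" should return 0, while passing (const void *)50L, the same mistake as putting '2' in RSI, should return EFAULT on Linux.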

Instead of writing code to check return values, in toy programs and experiments you can just run them under strace, as in strace ./a.out. If you're writing your own _start without libc, there won't be any system calls during startup other than the ones you make yourself, so the output is very easy to read. Otherwise, libc makes a bunch of startup system calls before your code runs. (See also "How should strace be used?")