c  clang  x86-64  memory-alignment  abi

clang misaligns stack and then tries a vmovaps in _start written as a C function


i have a simple c function in start.c

$ cat start.c
int main(int,char**);
void _start(){
 char*v[2]={"k",0};
 main(1,v);
}

when i compile to assembly with clang -O -march=cannonlake -S start.c, i get start.s, which contains this code:

_start:                                 # @_start
    .cfi_startproc
# %bb.0:
    subq    $24, %rsp
    .cfi_def_cfa_offset 32
    vmovaps .L__const._start.v(%rip), %xmm0
    vmovaps %xmm0, (%rsp)

when i run this code (in the bochs emulator with a bespoke os), i get an exception because vmovaps %xmm0, (%rsp) executes with an rsp that is not 16-byte aligned. the stack is 16-byte aligned when _start is entered, so subq $24, %rsp leaves rsp at 8 modulo 16.
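to spell out the arithmetic, here is a small sketch. the helper name and the sample rsp values are made up for illustration; only the low four bits of rsp matter.

```c
#include <stdint.h>

/* Hypothetical helper: low 4 bits of rsp after `subq $frame, %rsp`. */
static unsigned mod16_after_sub(uint64_t rsp, uint64_t frame)
{
    return (unsigned)((rsp - frame) & 15u);
}

/* Entered as the process entry point: rsp % 16 == 0.
 *   mod16_after_sub(0x7fff0000, 24) == 8   -> vmovaps to (%rsp) faults
 * Entered through a call instruction: rsp % 16 == 8.
 *   mod16_after_sub(0x7fff0008, 24) == 0   -> vmovaps to (%rsp) is fine
 */
```

so the 24-byte frame is only correct if a return address has already been pushed.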

i've tried clang 13 and 17 with very similar results. is clang in error, or am i seeing it wrong?

when i add an asm volatile ("subq $8,%%rsp"::); to the beginning of _start, it fixes the problem.

this looks similar to the question "main and stack alignment".


Solution

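  • You defined _start() as a C function, so Clang compiles it to work when called with RSP % 16 == 8 on entry, as the x86-64 System V ABI guarantees for a function reached through a call (the stack is 16-byte aligned before the call pushes the 8-byte return address).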

    But then you linked it into an executable where it's actually the process entry point, not a function. It has no return address. There is no call that pushes a return address; as you say, RSP % 16 == 0 on entry to _start. This is essentially undefined behaviour, in the sense that the compiler can assume a normal function won't be called (or otherwise jumped to) with a misaligned stack.

    At least with GCC, you can use __attribute__((force_align_arg_pointer)) to tell it the incoming alignment is less than normal.

    Or use command-line options that change the ABI, like gcc -mpreferred-stack-boundary=3 (2^3 = 8-byte alignment, instead of the default of 4, which means 2^4 = 16-byte alignment).

    See "How to get arguments value in _start and call main() using inline assembly in C, without Glibc or CRT start files?" for a very hacky but working _start for the x86-64 System V ABI, written in GNU C without any inline asm. It still gets argc and argv from the stack (as the return address and first stack arg) and gets RSP correctly aligned for future calls. I haven't tested it with clang, and I don't know if it supports the same attributes.
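A minimal sketch of the force_align_arg_pointer approach mentioned above, for GCC. The names entry and app_main are hypothetical stand-ins for _start and main, chosen so the snippet can also be built next to a normal C runtime. Whether Clang generates the same realigning prologue is an assumption that should be checked in the generated asm.

```c
/* Sketch only: `entry` and `app_main` are hypothetical stand-ins for
 * _start and main. */
static int calls_seen;

static int app_main(int argc, char **argv)
{
    (void)argv;
    calls_seen += argc;   /* record that we were reached */
    return 0;
}

/* Ask the compiler to realign the stack in the prologue instead of
 * assuming the caller provided the normal ABI alignment.
 * The attribute is x86-specific, so it is guarded here. */
#if defined(__x86_64__) || defined(__i386__)
__attribute__((force_align_arg_pointer))
#endif
void entry(void)
{
    char *v[2] = {"k", 0};
    app_main(1, v);
}
```

In a real freestanding build the function would be named _start, and it would have to end the process with an exit system call rather than return, since there is no return address on the stack.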