i have a simple c function in start.c
$ cat start.c
int main(int,char**);
void _start(){
char*v[2]={"k",0};
main(1,v);
}
when i compile to assembler with clang -O -march=cannonlake -S start.c, i get start.s, which has this code:
_start: # @_start
.cfi_startproc
# %bb.0:
subq $24, %rsp
.cfi_def_cfa_offset 32
vmovaps .L__const._start.v(%rip), %xmm0
vmovaps %xmm0, (%rsp)
when i run this code (in the bochs emulator with a bespoke os), i get an exception because vmovaps %xmm0, (%rsp) executes with an rsp that is not 16-byte aligned. the stack is 16-byte aligned when _start is entered, and subq $24, %rsp changes that alignment.
i've tried clang 13 and 17 with very similar results. is clang in error, or am i seeing it wrong?
when i add asm volatile ("subq $8,%%rsp"::); to the beginning of _start, it fixes the problem.
this has similarities to the question "main and stack alignment".
You defined _start() as a C function, so Clang compiles it to work when entered with RSP % 16 == 8, as the ABI guarantees for a function reached via call.
But then you linked it into an executable where it's actually the process entry point, not a function. It has no return address: there is no call that pushes one, so, as you say, RSP % 16 == 0 on entry to _start. This is essentially undefined behaviour, in the sense that the compiler can assume a normal function won't be called (or otherwise jumped to) with a misaligned stack.
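The arithmetic behind the fault can be sketched in a few lines of C. This is only an illustration: rsp_mod16_after_sub is a hypothetical helper name, and 24 is the frame size taken from the subq $24, %rsp in the question's assembly.

```c
/* Residue of RSP modulo 16 after the prologue's `subq $frame, %rsp`,
 * given the residue of RSP modulo 16 on entry.
 * (Hypothetical helper, for illustration only.) */
static unsigned rsp_mod16_after_sub(unsigned entry_mod16, unsigned frame)
{
    /* Unsigned subtraction wraps modulo a power of two, so masking
     * with 15 still yields the correct residue modulo 16. */
    return (entry_mod16 - frame) & 15u;
}
```

rsp_mod16_after_sub(8, 24) is 0, which is the case the compiler planned for (entered via call). rsp_mod16_after_sub(0, 24) is 8, which is the process-entry case, so the aligned store vmovaps %xmm0, (%rsp) faults.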
At least with GCC, you can use __attribute__((force_align_arg_pointer)) to tell it the incoming stack alignment may be less than normal, so the function realigns RSP itself in its prologue.
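A minimal sketch of how the attribute might be applied, assuming an x86 target. The names entry_point, app_main, seen_argc and ENTRY_ATTR are hypothetical stand-ins, chosen only so the snippet also links as part of an ordinary hosted program; in a freestanding build the function would be named _start and would call the real main. Clang's attribute reference also lists force_align_arg_pointer for x86, but it is worth checking the generated assembly for your compiler version rather than taking that on trust.

```c
/* Only x86 compilers understand this attribute; elsewhere the
 * hypothetical ENTRY_ATTR macro expands to nothing. */
#if defined(__x86_64__) || defined(__i386__)
#define ENTRY_ATTR __attribute__((force_align_arg_pointer))
#else
#define ENTRY_ATTR
#endif

/* Hypothetical stand-in for main(): records argc so a caller can
 * observe that the call happened. */
static int seen_argc;

int app_main(int argc, char **argv)
{
    (void)argv;
    seen_argc = argc;
    return 0;
}

/* Stand-in for _start. With the attribute, the compiler does not
 * assume any particular incoming stack alignment and realigns the
 * stack pointer in the prologue before using aligned stores. */
ENTRY_ATTR
void entry_point(void)
{
    char *v[2] = {"k", 0};
    app_main(1, v);
}
```

Compiling the _start version with -S and looking for an andq $-16, %rsp (or similar) in the prologue is a quick way to confirm the realignment is actually emitted.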
Or use command-line options that change the ABI, like gcc -mpreferred-stack-boundary=3 (the value is a power-of-two exponent: 1<<3 == 8 bytes, instead of the default 4 for 16-byte alignment).
See "How to get arguments value in _start and call main() using inline assembly in C, without Glibc or CRT start files?" for a very hacky but working _start for the x86-64 System V ABI, written in GNU C without any inline asm, that still gets argc and argv from the stack (they sit where a normal function would find its return address and first stack arg) and gets RSP correctly aligned for future calls. I haven't tested it with clang; I don't know if it supports the same attributes.