I am writing some x64 assembly for the GNU assembler. I've been trying to read about the .seh_* directives, but I'm not finding much information about them. The gas docs don't mention them at all.
But as I understand it, if my code might be in the stack during an SEH unwind operation, I am expected to use these. And since my code does stack manipulations and calls other functions, SEH is a possibility, so I should be using these.
Mostly I think I've got it right:
.seh_proc FCT
FCT:
push %rbp
.seh_pushreg %rbp
mov %rsp, %rbp
.seh_setframe %rbp, 0
push %r14
.seh_pushreg %r14
lea -(iOffset + iBytes)(%rsp), %rsp
.seh_stackalloc iOffset + iBytes
andq $-16, %rsp         # <---- But what about this?
.seh_endprologue
etc...
But there's one bit that's not clear. I've got this instruction:
andq $-16, %rsp
How on earth do I tell SEH that I'm performing stack alignment? This might adjust the stack by anywhere from 15 bytes (very unlikely) to 8 bytes (very likely), to 0 bytes (certainly possible). Since the actual amount may not be determined until runtime, I'm stuck.
I suppose I can skip the .seh instruction, but if 8 bytes of stack do get reserved there, I've probably trashed the unwind, haven't I? Doesn't that defeat the entire purpose here?
Alternatively, I can omit the alignment. But if I call other functions (say memcpy), aren't I supposed to align the stack? According to MS:
The stack will always be maintained 16-byte aligned, except within the prolog
Maybe I can 'reason' my way thru this? If the guy that called me did things right (if...), then the stack was aligned when he did the call, so now I'm off by 8 bytes (the return address) plus whatever I do in my prolog. Can I depend on this? Seems fragile.
I've tried looking at other code, but I'm not sure I trust what I'm seeing. I doubt gas reports errors from misusing .seh_*. You would probably only ever see a problem during an actual exception (and maybe not always even then).
If I'm going to do this, I'd like to do it right. It seems like stack alignment would be a common thing, so someone must have a solution here. I'm just not seeing it.
Looking at some code output by gcc, I think I know the answer. I was on the right track with my 'reason' approach.
When a function is called, the stack temporarily becomes unaligned (due to the call), but is almost immediately re-aligned via pushq %rbp. After that, adjustments to the stack (for local variables or stack space for parameters to called functions, etc.) are always made using multiples of 16. (If the prolog pushes an odd number of additional registers, like the push %r14 in my code above, the allocation needs an extra 8 bytes to compensate.) So by the end of the prolog, the stack is always properly aligned again, and stays that way until the next call.
Which means that while andq $-16, %rsp can be used to align the stack, I shouldn't need to if I write my prolog correctly.
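To make that concrete, here is a minimal sketch of the prolog from my question with the andq removed and the allocation sized so that alignment falls out by construction. The size of 40 (32 bytes of shadow space for callees plus 8 bytes of padding), the placeholder body, and the epilog are my own assumptions for illustration, not something copied from gcc's output.

```asm
        .text
        .globl  FCT
        .seh_proc FCT
FCT:
        # On entry: rsp % 16 == 8 (the caller's call pushed the return address).
        push    %rbp                    # rsp % 16 == 0
        .seh_pushreg %rbp
        mov     %rsp, %rbp
        .seh_setframe %rbp, 0
        push    %r14                    # rsp % 16 == 8
        .seh_pushreg %r14
        # Assumed size: 32 bytes of shadow space + 8 bytes of padding = 40.
        # 40 % 16 == 8, which cancels the 8 left over from the push above.
        sub     $40, %rsp               # rsp % 16 == 0
        .seh_stackalloc 40
        .seh_endprologue

        # ... function body; rsp is 16-byte aligned at every call made here ...

        lea     -8(%rbp), %rsp          # point rsp at the saved %r14
        pop     %r14
        pop     %rbp
        ret
        .seh_endproc
```

Every adjustment in the prolog is a constant with a matching .seh_* directive, so nothing depends on a runtime value. The rule of thumb I'm using: the pushes plus the allocation should total 8 mod 16 (here 8 + 8 + 40 = 56), which cancels the 8 bytes of the return address.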
CAVEAT: Leaf functions (i.e., functions that don't call other functions) do not need to align the stack (https://msdn.microsoft.com/en-us/library/67fa79wz.aspx).