Because ARM64 has no PUSH and POP instructions, I'm having trouble understanding how SP works in ARM64.
If I were to PUSH/POP, would the SP decrement/increment by 4, 8 or 16 bytes?
I've read documentation saying that the stack frame must be aligned to 16 bytes, but when I debug, that doesn't seem to be exactly the case.
Whether the stack grows upwards or downwards depends entirely on the ABI of the system you're looking at. That said, all arm64 code I've worked with has used a downwards-growing stack.
With that, a common push would look like this:
stp x29, x30, [sp, -0x10]!
And a common pop like this:
ldp x29, x30, [sp], 0x10
This pushes/pops two 8-byte registers at once and thus modifies the stack pointer by 16 bytes at a time, which brings us to the next part: the stack alignment check.
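To make the pre-index/post-index behaviour concrete, here is a minimal C sketch that models the two instructions above on a simulated downwards-growing stack. The `SimStack` type and the `sim_*` helpers are hypothetical names made up for this illustration, and `sp` is just an offset into a byte array, not the real hardware register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a downwards-growing stack (not real hardware state). */
typedef struct {
    uint8_t mem[256];
    size_t  sp;        /* offset into mem; starts at the top of the buffer */
} SimStack;

static void sim_init(SimStack *s) { s->sp = sizeof s->mem; }

/* Models: stp a, b, [sp, -0x10]!   (pre-index: adjust SP first, then store) */
static void sim_push_pair(SimStack *s, uint64_t a, uint64_t b) {
    assert(s->sp >= 16);              /* room left in the simulated stack */
    s->sp -= 16;
    memcpy(&s->mem[s->sp],     &a, 8);
    memcpy(&s->mem[s->sp + 8], &b, 8);
}

/* Models: ldp a, b, [sp], 0x10     (post-index: load first, then adjust SP) */
static void sim_pop_pair(SimStack *s, uint64_t *a, uint64_t *b) {
    assert(s->sp + 16 <= sizeof s->mem);
    memcpy(a, &s->mem[s->sp],     8);
    memcpy(b, &s->mem[s->sp + 8], 8);
    s->sp += 16;
}

/* True when the simulated SP sits on a 16-byte boundary. */
static int sim_sp_aligned(const SimStack *s) { return s->sp % 16 == 0; }
```

Starting from `sim_init`, one `sim_push_pair` moves `sp` from 256 to 240 and one `sim_pop_pair` moves it back, so `sim_sp_aligned` holds throughout. Moving SP by only 8 bytes for a single register would leave it misaligned, which is why code that saves one register typically still reserves 16 bytes.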
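Whether or not the stack pointer must be aligned to a 16-byte boundary also depends on the ABI you're working with, but the check itself is an actual hardware feature that can be configured. Note that, per the text quoted here, the check only fires when a load or store uses SP as the base address, so SP can briefly hold a misaligned value in a debugger without a fault being raised.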
See the ARMv8 Reference Manual: the SCTLR_EL[123] registers include bits that turn this feature on or off for each exception level. Quoting from SCTLR_EL1, for example:
SA0, bit [4]
SP Alignment check enable for EL0. When set to 1, if a load or store instruction executed at EL0 uses the SP as the base address and the SP is not aligned to a 16-byte boundary, then a SP alignment fault exception is generated. For more information, see SP alignment checking on page D1-2333. When ARMv8.1-VHE is implemented, and the value of HCR_EL2.{E2H, TGE} is {1, 1}, this bit has no effect on execution at EL0. In a system where the PE resets into EL1, this field resets to an architecturally UNKNOWN value.

SA, bit [3]
SP Alignment check enable. When set to 1, if a load or store instruction executed at EL1 uses the SP as the base address and the SP is not aligned to a 16-byte boundary, then a SP alignment fault exception is generated. For more information, see SP alignment checking on page D1-2333. When ARMv8.1-VHE is implemented, and the value of HCR_EL2.{E2H, TGE} is {1, 1}, this bit has no effect on the PE. In a system where the PE resets into EL1, this field resets to an architecturally UNKNOWN value.
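As a rough illustration of the quoted bit definitions, the following C sketch decodes SA and SA0 from a raw SCTLR_EL1 value and predicts whether an SP-based load or store would fault. The macro and function names are hypothetical, the register value is passed in as a plain integer (actually reading SCTLR_EL1 requires running at EL1 or higher), and the ARMv8.1-VHE case mentioned in the quote is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the SCTLR_EL1 description quoted above. */
#define SCTLR_EL1_SA   (UINT64_C(1) << 3)   /* SP alignment check at EL1 */
#define SCTLR_EL1_SA0  (UINT64_C(1) << 4)   /* SP alignment check at EL0 */

static_assert(SCTLR_EL1_SA == 0x08, "SA is bit 3");
static_assert(SCTLR_EL1_SA0 == 0x10, "SA0 is bit 4");

/* True when SP sits on a 16-byte boundary. */
static bool sp_is_aligned(uint64_t sp) { return (sp & 0xF) == 0; }

/* Would a load/store using SP as its base address fault at EL0?
   Assumption: HCR_EL2.{E2H, TGE} is not {1, 1}; that case is ignored here. */
static bool sp_access_faults_el0(uint64_t sctlr_el1, uint64_t sp) {
    return (sctlr_el1 & SCTLR_EL1_SA0) != 0 && !sp_is_aligned(sp);
}

/* Same question for code running at EL1. */
static bool sp_access_faults_el1(uint64_t sctlr_el1, uint64_t sp) {
    return (sctlr_el1 & SCTLR_EL1_SA) != 0 && !sp_is_aligned(sp);
}
```

For instance, with only SA0 set (`0x10`), an SP ending in `...8` would fault on an SP-based access at EL0 but not at EL1, while an SP ending in `...0` would fault at neither level.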