Tags: linux-kernel, system-calls, arm64, context-switch, sve

ARM64 SVE registers not preserved when issuing a syscall: why does Linux discard SVE registers with sve_user_discard()?


In the Linux 5.10 AArch64 syscall.c source code, there is a function, sve_user_discard(), that causes bits [max:128] of the SVE registers to be zeroed. Here's the code.

I don't understand the purpose of this function. Can anyone explain it to me?

I hit a bug when running an SVE program and eventually found that the root cause is sve_user_discard(). After it runs, only bits [127:0] of the SVE registers survive in userspace, which breaks my program.


Solution

  • After digging around a bit, it turns out that all I needed to do was RTFM :'). This behavior is intended and documented.

    The Linux ARM64 syscall ABI does not preserve bits above 127 of the Z registers, any bits of the P registers, or the FFR register across a syscall; in 5.10, sve_user_discard() discards them on syscall entry. Therefore, you cannot rely on full SVE state surviving a system call on ARM64 Linux. If your code uses SVE, you will need to split it up so that SVE is only used between syscalls (or save any live SVE values to memory before the syscall and reload them afterwards). Even with a vector length of 128 bits, you would still lose the values of P0..P15 and FFR.
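To make the "split it up" advice concrete, here is a minimal sketch of the structure I mean. The function names `sum_chunk` and `sum_with_progress` are hypothetical, and the scalar fallback is only there so the file also builds on machines without SVE. The point is that each SVE region ends with its result in a plain scalar (or in memory) before any syscall is issued, and predicates are rebuilt afterwards rather than carried across.

```c
#include <stddef.h>
#include <unistd.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical helper: sums n floats. All SVE state is created and
 * consumed inside this function; only a scalar leaves it. */
static float sum_chunk(const float *x, size_t n)
{
#if defined(__ARM_FEATURE_SVE)
    svfloat32_t acc = svdup_n_f32(0.0f);
    for (size_t i = 0; i < n; i += svcntw()) {
        /* The predicate is recomputed on every pass, never kept
         * alive across a syscall. */
        svbool_t pg = svwhilelt_b32_u64(i, n);
        acc = svadd_f32_m(pg, acc, svld1_f32(pg, x + i));
    }
    /* Reduce to a scalar before returning to code that may do syscalls. */
    return svaddv_f32(svptrue_b32(), acc);
#else
    /* Portable fallback (assumption: only used when SVE is unavailable). */
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
#endif
}

/* Hypothetical driver: processes the input in chunks and issues a
 * syscall between chunks. chunk must be greater than zero. */
static float sum_with_progress(const float *x, size_t n, size_t chunk)
{
    float total = 0.0f;
    for (size_t off = 0; off < n; off += chunk) {
        size_t len = (n - off < chunk) ? n - off : chunk;
        total += sum_chunk(x + off, len);        /* SVE region */
        /* Syscall: after this, the high bits of Z0..Z31, all of
         * P0..P15 and FFR must be treated as lost. */
        ssize_t rc = write(STDERR_FILENO, ".", 1);
        (void)rc;
    }
    return total;
}
```

As far as I understand the AArch64 procedure call standard, a compiler already treats SVE registers as clobbered across an ordinary call such as write() and spills live vectors itself, so this mostly matters for hand-written assembly that issues svc directly. The structure is spelled out here to mirror what such assembly has to do by hand.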

    Here's a quote from the kernel documentation on SVE confirming this behavior:

    3. System call behaviour

    • On syscall, V0..V31 are preserved (as without SVE). Thus, bits [127:0] of Z0..Z31 are preserved. All other bits of Z0..Z31, and all of P0..P15 and FFR become unspecified on return from a syscall.

    • The SVE registers are not used to pass arguments to or receive results from any syscall.

    • In practice the affected registers/bits will be preserved or will be replaced with zeros on return from a syscall, but userspace should not make assumptions about this. The kernel behaviour may vary on a case-by-case basis.
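To picture what the rule above means in bytes, here is a toy model of the register file. It is not kernel code: the `sve_model` type and `model_syscall_discard` function are made up for illustration, and the model shows only the "replaced with zeros" outcome. The ABI itself says the affected bits are unspecified, so real code should not rely on seeing zeros either.

```c
#include <stdint.h>
#include <string.h>

/* 2048 bits is the architectural maximum SVE vector length. */
enum { NUM_Z = 32, NUM_P = 16, MAX_VL_BYTES = 256 };

/* Hypothetical userspace-visible SVE state, for illustration only. */
typedef struct {
    unsigned vl_bytes;                    /* current vector length in bytes */
    uint8_t z[NUM_Z][MAX_VL_BYTES];       /* Z0..Z31 */
    uint8_t p[NUM_P][MAX_VL_BYTES / 8];   /* P0..P15: one bit per vector byte */
    uint8_t ffr[MAX_VL_BYTES / 8];        /* FFR has the same size as a P register */
} sve_model;

/* Models one outcome the ABI permits: bits [127:0] of each Z register
 * (the V0..V31 view) survive, everything else reads back as zero. */
static void model_syscall_discard(sve_model *s)
{
    for (int i = 0; i < NUM_Z; i++) {
        if (s->vl_bytes > 16)
            memset(&s->z[i][16], 0, s->vl_bytes - 16);
    }
    memset(s->p, 0, sizeof s->p);
    memset(s->ffr, 0, sizeof s->ffr);
}
```

With a 256-bit vector length (`vl_bytes = 32`), bytes 0..15 of each Z register keep their values and bytes 16..31 are cleared. With a 128-bit vector length the Z registers are untouched, but P0..P15 and FFR are still cleared, which matches the last point in my answer.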