Tags: linux, arm, signals, arm64

Arm64/Linux: "Handle" signal in userspace program by jumping to (PC + 8)?


I've been playing around today, and I was wondering whether a Linux userspace program can "handle" a signal by simply skipping the offending instruction. The prototype is just:

#include <iostream>
#include <csignal>

void signal_handler(int signal, siginfo_t* info, void* unused) {
        if (signal != SIGSEGV) {
                // Just here in case, I've never seen this hit
                std::cerr << "Got an unexpected signal " << signal << std::endl;
                exit(1);
        }

        std::cout << "Got a SIGSEGV with si_addr = " << info->si_addr << std::endl;
        // Without doing anything more here... the failing access will continue to SIGSEGV forever!
        __asm__ volatile (
                "b %0" : : "r" (info->si_addr + 8)
        );
        return;
}

int main() {
        struct sigaction action;
        memset(&action, 0, sizeof(action));
        action.sa_flags = SA_SIGINFO;
        action.sa_sigaction = signal_handler;

        sigaction(SIGSEGV, &action, nullptr);


        const volatile uint64_t* root_ptr = reinterpret_cast<uint64_t*>(0xdaaf3254 << 12);
        // Try to access this: it's probably a segfault due to the memory not being
        // backed by Linux yet!
        *root_ptr;
        std::cout << "Successfully got past the failing access\n";
}

After messing around with this a bit, I've realized there are some major issues:

  1. I'm not familiar with arm64 assembly or C/C++ inline assembly (does that instruction look right at all?) :)

  2. Because Linux interposes itself between the CPU exception and the userspace-side signal handler (as signal(7) mentions), the stack, frame, and link-register values are very likely different when signal_handler runs than they were at *root_ptr

  3. Similarly, all of the other registers could also be different!

  4. The signal handler runs in signal context, which is restrictive.

I figure that (2) and (3) are solvable just by saving/restoring the registers to a static location in memory at the appropriate moments. And my guess is that (4) won't bite if the userspace code is safe to run in signal context (say, crunching some numbers, or just spinning forever).

So, let's modify the above program a bit:

#include <iostream>
#include <csignal>

static volatile uint64_t register_save_set[32];

// Lines 2-3 save the stack pointer, which is a special snowflake
#define SAVE_REGISTERS() do {          \
    __asm__ volatile (                 \
        "stp x0, x1, [%0]\n"          \
        "mov x0, sp\n"                \
        "str x0, [%0, #248]\n"       \
        "stp x2, x3, [%0, #16]\n"     \
        "stp x4, x5, [%0, #32]\n"     \
        "stp x6, x7, [%0, #48]\n"     \
        "stp x8, x9, [%0, #64]\n"     \
        "stp x10, x11, [%0, #80]\n"   \
        "stp x12, x13, [%0, #96]\n"   \
        "stp x14, x15, [%0, #112]\n"  \
        "stp x16, x17, [%0, #128]\n"  \
        "stp x18, x19, [%0, #144]\n"  \
        "stp x20, x21, [%0, #160]\n"  \
        "stp x22, x23, [%0, #176]\n"  \
        "stp x24, x25, [%0, #192]\n"  \
        "stp x26, x27, [%0, #208]\n"  \
        "stp x28, x29, [%0, #224]\n"  \
        "str x30, [%0, #240]\n"       \
        :                               \
        : "r" (register_save_set)     \
        : "memory", "x0"              \
    );                                  \
} while (0)

// Lines 1-2 load the stack pointer, which is a special snowflake
#define LOAD_REGISTERS() do {          \
    __asm__ volatile (                 \
        "ldr x0, [%0, #248]\n"        \
        "mov sp, x0\n"                \
        "ldp x0, x1, [%0]\n"          \
        "ldp x2, x3, [%0, #16]\n"     \
        "ldp x4, x5, [%0, #32]\n"     \
        "ldp x6, x7, [%0, #48]\n"     \
        "ldp x8, x9, [%0, #64]\n"     \
        "ldp x10, x11, [%0, #80]\n"   \
        "ldp x12, x13, [%0, #96]\n"   \
        "ldp x14, x15, [%0, #112]\n"  \
        "ldp x16, x17, [%0, #128]\n"  \
        "ldp x18, x19, [%0, #144]\n"  \
        "ldp x20, x21, [%0, #160]\n"  \
        "ldp x22, x23, [%0, #176]\n"  \
        "ldp x24, x25, [%0, #192]\n"  \
        "ldp x26, x27, [%0, #208]\n"  \
        "ldp x28, x29, [%0, #224]\n"  \
        "ldr x30, [%0, #240]\n"       \
        :                               \
        : "r" (register_save_set)     \
        :                     \
    );                                  \
} while (0)
// These should be "clobber" in above... but GCC yells at me about
// impossible constraints :)
// "sp", "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13", "x14", "x15", "x16", "x17", "x18", "x19", "x20", "x21", "x22", "x23", "x24", "x25", "x26", "x27", "x28", "x29", "x30"

void signal_handler(int signal, siginfo_t* info, void* unused) {
        if (signal != SIGSEGV) {
                std::cerr << "Got an unexpected signal " << signal << std::endl;
                exit(1);
        }

        std::cout << "Got a SIGSEGV with si_addr = " << info->si_addr << std::endl;
        // Without doing anything more here... the failing access will continue to SIGSEGV forever!
        LOAD_REGISTERS();
        __asm__ volatile (
                "br %0" : : "r" (info->si_addr + 8)
        );
        return;
}

int main() {
        struct sigaction action;
        memset(&action, 0, sizeof(action));
        action.sa_flags = SA_SIGINFO;
        action.sa_sigaction = signal_handler;

        sigaction(SIGSEGV, &action, nullptr);

        const volatile uint64_t* root_ptr = reinterpret_cast<uint64_t*>(0xdaaf3254 << 12);
        SAVE_REGISTERS();
        *root_ptr;
        volatile int spin = 0;
        while (true) { spin += 1; }
}

I'm pretty sure that the input/output constraints on the inline assembly are... just wrong, but my hope is that by just making everything "volatile" I sidestep this.

Anyway, by my understanding this should "just work," i.e. hit the while loop and spin forever. But when I actually try it:

$ g++ $PROGRAM -o a.out && ./a.out
Got a SIGSEGV with si_addr = 0xf3254000   # Expected
zsh: bus error  ./build/a.out             # NOT expected

So I have two questions:

  1. Is my conceptual understanding here correct? If not, what am I missing?
  2. If it is (more or less) correct, why does the program not hit the while(true) as expected?

Solution

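    1. To skip one instruction, you want to use +4, not +8: every AArch64 instruction is 4 bytes long. Note also that si_addr is the address of the faulting memory access, not of the faulting instruction, so it is not something you can branch to in the first place.
    2. Your b %0 assembly does not compile. b is a direct branch (to a label, i.e. a PC-relative offset); for an indirect branch (through a register) you want br.
    3. Writing inline assembly that interferes with compiler-allocated registers and may branch anywhere is an excruciatingly challenging thing to get right. It is much easier to do at function boundaries.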
    4. Besides the general-purpose registers, you'd also have to save and restore the FP/SIMD registers...
    5. But once you do all that, you just arrive at your homebrew setjmp/longjmp implementation.
    6. And even longjmp from a signal handler is still just a really bad idea.

    But you really don't need to jump through any of those hoops; you don't even need to write any assembly. You just need to make use of your void* unused. It points to a ucontext_t holding the whole register state of the interrupted code, and the kernel restores that state when the handler returns. Just include <ucontext.h>, cast to ucontext_t* and modify as you please.

    I patched up your code to do just that:

    #include <iostream>
    #include <csignal>
    #include <cstdint>    // uint64_t
    #include <cstdlib>    // exit
    #include <ucontext.h>
    
    void signal_handler(int signal, siginfo_t *info, void *context) {
        if (signal != SIGSEGV) {
            // Just here in case, I've never seen this hit
            std::cerr << "Got an unexpected signal " << signal << std::endl;
            exit(1);
        }
    
        std::cout << "Got a SIGSEGV with si_addr = " << info->si_addr << std::endl;
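        // context is the ucontext_t the kernel restores on return; advancing its
        // pc by one 4-byte AArch64 instruction resumes right after the faulting load.
        static_cast<ucontext_t*>(context)->uc_mcontext.pc += 4;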
    }
    
    int main() {
        struct sigaction action = {};
        action.sa_flags = SA_SIGINFO;
        action.sa_sigaction = signal_handler;
        sigaction(SIGSEGV, &action, nullptr);
    
        const volatile uint64_t* root_ptr = reinterpret_cast<uint64_t*>(0xdaaf3254 << 12);
        // Try to access this: it's probably a segfault due to the memory not being
        // backed by Linux yet!
        *root_ptr;
        std::cout << "Successfully got past the failing access\n";
    }
    

    Obligatory notice that *root_ptr invokes undefined behaviour and will probably break with optimisations turned on, blah blah.