java, java-native-interface, shared-libraries

How do I catch SIGSEGV, SIGALRM, and SIGFPE with sigaction() when using JNI?


My goal is to catch SIGSEGV (infinite recursion), SIGALRM (infinite loops, raised by a timer), and SIGFPE (division by 0) from native code, since I want to prevent Minecraft (Java) from crashing or hanging whenever a mod written in my compiled modding language has such issues.

I've been able to successfully use my modding language in other programs (written in C/C++/Python) using sigaction() and sigsetjmp(), but it isn't as easy in Java, as described by this Stack Overflow answer:

The Java VM (at least Oracle's implementation, which also includes the OpenJDK) uses POSIX signals for internal communication, so it installs signal handlers e.g. for SIGSEGV. That means, to work correctly, the Java VM must get and examine any SIGSEGV which happens in the process, to distinguish between "communication" SIGSEGVs and real ones which would be program errors.

But signal handlers are a global resource, within a process any native code can install signal handlers and replace the ones the Java VM installed.

To solve this problem (user-installed signal handlers replace the JavaVM signal handlers) and to accommodate user code which may have a reason to install signal handlers, the Java VM uses "signal chaining", which basically means that the signal handlers (VM and Users) are chained behind each other: the JavaVM signal handlers run first; if they think the signal is of no interest to the VM, it hands down the signal to the user handler.

The libjsig.so is the answer for this problem: it replaces the system signal APIs (sigaction() etc) with its own versions, and any user code attempting to install a signal handler will not replace the global signal handler but the user handler will be chained behind the (already installed) Java VM signal handler.

Note that it's incredibly easy to accidentally introduce undefined behavior when siglongjmp()ing out of a signal handler, as described in the APPLICATION USAGE section of this page. This is why my modding language blocks (using sigprocmask) and disables the signal handlers at strategic moments.

The quote recommends using jsig, which is easily achieved by putting LD_PRELOAD=<your jdk path>/libjsig.so before the java command.

Using jsig, however, guarantees that the JVM receives the SIGSEGV before my custom handler does. Since the JVM considers the signal unrecoverable and apparently doesn't want to hand it over to my own handler, my handler is never called, resulting in a crash.

If I don't use jsig, the JVM warns that I should, because it periodically checks whether its signal handlers have been tampered with:

Warning: SIGSEGV handler modified!
Signal Handlers:
   SIGSEGV: segv_handler in libfoo.so, mask=11111111011111111101111111111110, flags=SA_ONSTACK, unblocked
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGBUS: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
    SIGFPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
   SIGPIPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
   SIGXFSZ: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
    SIGILL: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
   SIGUSR2: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO, unblocked
    SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
    SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
   SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, unblocked
   SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO, blocked
   SIGTRAP: SIG_DFL, mask=00000000000000000000000000000000, flags=none, unblocked
Consider using jsig library.

It complains because my C code is stomping on the JVM's own SIGSEGV handler, which the JVM uses internally to do neat stuff like optimize away null checks. That explains why replacing the handler causes my game to crash sporadically.

How do I cleanly recover from the harmless SIGSEGV, SIGALRM, and SIGFPE signals that my own C code raises, without upsetting the JVM?


Solution

  • It took me a week, but I managed to figure something out!

    It unfortunately seems that using jsig here is impossible, and I don't know of a way to suppress the one warning that gets printed per overwritten signal handler (see this spot in the JVM), without suppressing other warnings. If someone in the future finds a way to use jsig, or to suppress only this one warning type, then please leave a comment or answer!

    The first important realization is that SIGSEGV, SIGALRM, and SIGFPE can be split into two categories:

    1. Thread-targeted signals: SIGSEGV and SIGFPE
    2. Process-wide signals: SIGALRM

    For example, a SIGSEGV in my modding language is always caused by a specific thread infinitely recursing a C function, as the stack has a limited amount of memory. This SIGSEGV is guaranteed to be received by the signal handler of that same thread. This is what I mean by thread-targeted.

    Handling thread-targeted signals

    If a thread causes a SIGSEGV, the signal handler can use pthread_self() to differentiate between whether a JVM thread entered it (like Minecraft's rendering thread, for example), or whether our C thread entered it:

    #include <pthread.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    // Signature of an SA_SIGINFO-style handler
    typedef void (*sigaction_handler_t)(int, siginfo_t *, void *);
    
    // sigsetjmp()/siglongjmp() need a sigjmp_buf, not a plain jmp_buf
    sigjmp_buf jmp_buffer;
    volatile pthread_t c_thread;
    volatile sigaction_handler_t jvm_segv_handler;
    
    void wee(void);
    
    // We're using SA_SIGINFO to get these three params,
    // since JVM's own signal handler wants them passed to it
    void segv_handler(int sig, siginfo_t *info, void *ucontext) {
        if (pthread_equal(pthread_self(), c_thread)) {
            siglongjmp(jmp_buffer, 1);
        } else {
            jvm_segv_handler(sig, info, ucontext);
        }
    }
    
    void run_infinite_recursion(void) {
        c_thread = pthread_self();
    
        static struct sigaction sigsegv_sa = {
            .sa_sigaction = segv_handler,
            // SA_RESTART matches the JVM's flags and restarts interrupted system calls
            // SA_SIGINFO allows us to pass all information to the original handler
            // SA_ONSTACK runs the handler on the alternate stack registered
            // with sigaltstack(), if the thread has one
            .sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK,
        };
    
        // Using static stuff so calling the function we're in multiple times
        // doesn't cause us to recalculate the same stuff
        static bool initialized = false;
        if (!initialized) {
            // We want all signals to be blocked when our handler is running
            if (sigfillset(&sigsegv_sa.sa_mask) == -1) {
                perror("sigfillset");
                exit(EXIT_FAILURE);
            }
            initialized = true;
        }
    
        struct sigaction previous_segv_sa;
        if (sigaction(SIGSEGV, &sigsegv_sa, &previous_segv_sa) == -1) {
            perror("sigaction");
            exit(EXIT_FAILURE);
        }
    
        jvm_segv_handler = previous_segv_sa.sa_sigaction;
    
        // A nonzero savemask makes siglongjmp() restore the signal mask,
        // which our handler's sa_mask had fully blocked
        if (sigsetjmp(jmp_buffer, 1)) {
            printf("Infinite recursion detected!\n");
            // Return here, or we'd recurse (and crash) all over again
            return;
        }
    
        wee();
    }
    
    void wee(void) {
        wee();
    }
    

    Handling process-wide signals

    With the knowledge that:

    1. The JVM won't normally raise SIGALRM (unlike SIGSEGV and SIGFPE)
    2. Any of our threads can receive the SIGALRM, since neither timer_create() nor timer_settime() offers a portable way to target a specific thread (Linux's SIGEV_THREAD_ID is a non-portable exception)

    we can use pthread_kill() to manually pass the SIGALRM on to our desired thread:

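    void alrm_handler(int sig) {
        (void)sig;
    
        // Our C thread got the signal: jump back out of the infinite loop
        if (pthread_equal(pthread_self(), c_thread)) {
            siglongjmp(jmp_buffer, 1);
        }
    
        // Another thread got it: forward the SIGALRM to our C thread
        if (pthread_kill(c_thread, SIGALRM) != 0) {
            abort();
        }
    }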