java jvm java-native-interface jvm-hotspot jvmti

Why when processing JVMTI EVENT_EXCEPTION the JVM sometimes crashes or starts slowly?

I decided using JVMTI to register all exceptions occured in JVM and build Histogram:
ClassName_method_ExceptionName:count

Agent attached like this:
taskset -c 10,12 java -agentpath:./jni-agent.so ${JVM_OPTION} -cp module.jar org.springframework.boot.loader.PropertiesLauncher &

I using this code for JNI agent:

enter code here
#include <stdio.h>
#include <jvmti.h>
#include <jni.h>
#include <map>
#include <string>
#include <cstring>
#include "utils.h"

long excCount;
using namespace std;

map<string, long> mapExceptions;
map<string, long> mapContendedThreadStack;

void mapAddException(map<string,long> &inMap, string key){
    map<string, long> :: iterator it = inMap.find(key);
    if(it == inMap.end()){
        inMap.insert(pair<string, long>(key,1));
        return;
    }
    inMap.find(key)->second ++;
    return;
}
..
..

void JNICALL ExceptionCallback(jvmtiEnv* jvmti, JNIEnv *env, jthread thread, jmethodID method,     jlocation location,
jobject exception, jmethodID catch_method, jlocation catch_location) {
    excCount ++;
    char *class_name;
    jclass exception_class = env->GetObjectClass(exception);
    jvmti->GetClassSignature(exception_class, &class_name, NULL);

    char* method_name;
    jvmti->GetMethodName(method, &method_name, NULL, NULL);
    jclass holder;
    jvmti->GetMethodDeclaringClass(method, &holder);

    char* holder_name;
    jvmti->GetClassSignature(holder, &holder_name, NULL);
    if(strstr(holder_name, "java/lang") != NULL
        || strstr(holder_name, "java/net") != NULL
        || strstr(holder_name, "sun/reflect") != NULL
        || strstr(holder_name, "org/springframework/boot/loader/jar/JarURLConnection") != NULL
        || strstr(holder_name, "okhttp3/internal/connection/RealConnection") != NULL
        || strstr(holder_name, "okio/AsyncTimeout") != NULL ){
            jvmti->Deallocate((unsigned char*) method_name);
            jvmti->Deallocate((unsigned char*) holder_name);
            jvmti->Deallocate((unsigned char*) class_name);
            return;
    }
    string trimres = trimClassName(convertToString(holder_name, strlen(holder_name)));      
    char buftotal[1024];
    snprintf(buftotal, sizeof(buftotal) - 1, "%s#%s()_%s", trimres.c_str(), method_name, trimClassName(class_name).c_str());
    replacechar(buftotal, '/', '.');
    
    //mapAddException(mapExceptions, buftotal);       <-- HERE 

    jvmti->Deallocate((unsigned char*) method_name);
    jvmti->Deallocate((unsigned char*) holder_name);
    jvmti->Deallocate((unsigned char*) class_name);
}

extern "C" JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *options, void *reserved) {

    jvmtiCapabilities capabilities;
    jvmtiEventCallbacks callbacks;
    jvmtiEnv *jvmti;

    vm->GetEnv((void **) &jvmti, JVMTI_VERSION_1_0);
    printf("\nNative agent loaded by -agent.....\n");
    capabilities = {0};
    capabilities.can_generate_exception_events = 1;
    jvmti->AddCapabilities(&capabilities);
    callbacks = {0};
    callbacks.Exception = ExceptionCallback;
    jvmti->SetEventCallbacks(&callbacks, sizeof(callbacks));
    jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_EXCEPTION, NULL);
    return 0;
}

With commented line mapAddException(...) the JVM started much longer than without the JNI-agent. Is this Ok ?
I forgot to say - this is Spring Boot application with "Hello World" all ok :)

But when I uncommented mapAddException(...) JVM sometimes crashed. Not at all, sometimes(~ 2-4 of 50).

This is cut from CrashDump:

..
#  SIGSEGV (0xb) at pc=0x00007f32781bf0b4, pid=47559, tid=0x00007f31e29e1700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_231-b11) (build 1.8.0_231-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.231-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libstdc++.so.6+0x750b4]
..
#
---------------  T H R E A D  ---------------

Current thread (0x00007f3275d82000):  JavaThread "tomcat-thread-16" daemon [_thread_in_native, id=47682, stack(0x00007f31e28e1000,0x00007f31e29e2000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000010

Registers:
RAX=0x0000000000000000, RBX=0x00007f3190009340, RCX=0x0000000000000001, RDX=0x00007f3274ba0eb0
RSP=0x00007f31e29dc7c8, RBP=0x00007f327414fea0, RSI=0x00007f32786541d0, RDI=0x00007f327414fea0
R8 =0x0000000000000002, R9 =0x0000000000000030, R10=0x000000000000000c, R11=0x00007f327a316f40
R12=0x00007f31a00504f0, R13=0x00007f32786541c8, R14=0x00007f32786541d0, R15=0x00007f3190009340
RIP=0x00007f32781bf0b4, EFLAGS=0x0000000000010283, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e
....

The Current thread are different every time at CrashDump.

Solution

The crash reason is simple: your code is not thread safe.

Exception callback is invoked on the same thread that throws an exception. If multiple threads throw an exception concurrently, the callback is also invoked concurrently. std::map is not thread safe: concurrent access may cause a crash inside libstdc++.

The reason of slowness is more interesting.

How to analyze performance issues related to interaction between JVM and native? With async-profiler, of course. When profiling application startup, it's convenient to start profiler as an agent.

java -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,file=out.svg -agentpath:./your-agent.so -cp ...

When I compared profiles with and without your agent, I noticed one major difference. With the agent, there was a noticeable [deoptimization] block on the left side:

Why does deoptimization happen? async-profiler will help again: let's profile Deoptimization::deoptimize event now. It's not really an event, this is just a function name in HotSpot JVM (async-profiler can trace invocations of native functions).

java -agentpath:/path/to/libasyncProfiler.so=start,event=Deoptimization::deoptimize,file=deopt.svg ...

There was more than 8000 deoptimization events! Here is just a small part of the graph:

Most deoptimizations originate from exception_handler_for_pc_helper. If we look at the source code of this function, we'll see an interesting comment:

  if (JvmtiExport::can_post_on_exceptions()) {
    // To ensure correct notification of exception catches and throws
    // we have to deoptimize here.

    <...>

    // We don't really want to deoptimize the nmethod itself since we
    // can actually continue in the exception handler ourselves but I
    // don't see an easy way to have the desired effect.
    Deoptimization::deoptimize_frame(thread, caller_frame.id());

What this means is - when JVM TI Exception notifications are on, HotSpot forces deoptimization every time an exception happens in the compiled code! This can surely hit performance if exceptions are frequent.

How to analyze exceptions without JVM TI overhead then?

At this point you may already guess the answer - with async-profiler again :) Besides CPU and native function profiling, it can also intercept any Java method call.

To find all places where exceptions and errors are created, we can just intercept Throwable constructors. To do so, specify java.lang.Throwable.<init> as async-profiler event. This time it's handy to produce a reversed call tree (add reversed,file=tree.html options), so that throwables will be grouped by the type and the construction site.

java -agentpath:/path/to/libasyncProfiler.so=start,event=java.lang.Throwable.\<init\>,reverse,file=tree.html ...

This will produce an interactive HTML tree, where you'll find all exceptions with counters and full stack traces:

Ouch! A simple Spring Boot application throws 4700 exceptions: most of them are ClassNotFoundException (90%) and NoSuchMethodException (7%).

What is important, async-profiler uses bytecode instrumentation for method tracing, so it does not suffer from the performance problem of JVM TI exception callback.

Bonus

When profiling your agent for the first time, besides the deoptimization issue, I also found a decent number of stack traces where CPU time is spent in your ExceptionCallback calling GetMethodDeclaringClass and GetMethodName:

This is actually a performance bug in JDK 8, which I reported 3 years ago: JDK-8185348. At last, it has been fixed in JDK 8u281. However, there was no such problem in JDK 9+.

In short, since all JVM TI Method functions are slow, it is a good idea to cache the result of these functions, so that the agent will not call JVM TI function for the same method more than once.