java apache-flink flink-streaming

Flink task manager does not unload classes


I have a standalone Flink cluster. When I stop a job on the Task Manager, the classes that were loaded child-first are not unloaded. After several start/stop cycles, the metaspace exceeds its maximum size and the JVM throws an OutOfMemoryError: Metaspace. I run several applications in one Task Manager. The applications are different, so I cannot add the JAR file to the /lib folder. Restarting the TaskManager after every few operations is bothersome.

VisualVM metaspace overview

Heap dump with dominators

Flink 1.20.0, Java 17.0.11+9

I tried changing the class loader resolve order to parent-first, as suggested in the docs https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/debugging/debugging_classloading/ but this did not help. Even then, the classes are loaded child-first.

EDIT

I found that MySQL is the problem.


Solution

  • The MySQL JDBC driver creates its thread for cleaning up abandoned connections statically in the JVM. A running thread is a GC root, so the GC cannot collect it or anything it references, which ultimately blocks the classes from being unloaded. In driver version 8.x, you can disable the creation of this thread by adding the following JVM parameter:

    -Dcom.mysql.cj.disableAbandonedConnectionCleanup=true
    

    In Flink, you can add this to the config.yaml file like this:

    env:
       java:
          opts:
             all: -Dcom.mysql.cj.disableAbandonedConnectionCleanup=true
    

    In older versions of the driver, there is no such option, and the only thing you can do is shut this thread down manually at application startup by calling com.mysql.jdbc.AbandonedConnectionCleanupThread.shutdown();
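The manual shutdown can be sketched so that the job does not need a compile-time dependency on one particular driver version. This is only a minimal sketch under assumptions: the helper class `MysqlCleanupThreadStopper` and its methods are hypothetical names, and the driver class and method names it tries (`com.mysql.cj.jdbc.AbandonedConnectionCleanupThread`, `checkedShutdown`) are my assumptions about other driver versions, apart from `com.mysql.jdbc.AbandonedConnectionCleanupThread.shutdown()` mentioned above. Check them against the driver JAR you actually ship.

```java
import java.lang.reflect.Method;

// Hypothetical helper: stops the MySQL cleanup thread via reflection, if the driver is present.
final class MysqlCleanupThreadStopper {

    // Assumed class names: the 8.x package first, then the older one named in the answer.
    private static final String[] CLEANUP_CLASSES = {
        "com.mysql.cj.jdbc.AbandonedConnectionCleanupThread",
        "com.mysql.jdbc.AbandonedConnectionCleanupThread"
    };

    // Assumed method names; which ones exist depends on the driver version.
    private static final String[] SHUTDOWN_METHODS = {"checkedShutdown", "shutdown"};

    private MysqlCleanupThreadStopper() {}

    /** Tries every known class/method pair; returns true once a shutdown method was invoked. */
    static boolean stopCleanupThread(ClassLoader loader) {
        for (String className : CLEANUP_CLASSES) {
            if (tryInvokeStatic(loader, className, SHUTDOWN_METHODS)) {
                return true;
            }
        }
        return false;
    }

    /** Invokes the first public static no-arg method found on the class; false if none could be called. */
    static boolean tryInvokeStatic(ClassLoader loader, String className, String... methodNames) {
        Class<?> cls;
        try {
            // Initializing the class may itself start the thread; it is stopped right after.
            cls = Class.forName(className, true, loader);
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // class not available through this class loader
        }
        for (String methodName : methodNames) {
            try {
                Method method = cls.getMethod(methodName);
                method.invoke(null); // static method, so no receiver object
                return true;
            } catch (NoSuchMethodException e) {
                // this version does not have that method; try the next name
            } catch (ReflectiveOperationException e) {
                return false; // the method exists but the call failed
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // In a Flink job the context class loader is normally the user-code class loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("cleanup thread stopped: " + stopCleanupThread(loader));
    }
}
```

Passing the user-code class loader matters here, because with child-first loading the driver class that started the thread is the one loaded from the job JAR, not one visible to the system class loader.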