javagarbage-collectionjvmmetaspace

why my JVM did 3 times of FullGC when it started


My job is a Flink TaskManager job. When it starts, it starts 3 times of GC. I don't know why when it reached the MetadataGCThreshold it increased Tenured Space, while Metaspace didn't change, and even increased between GCs.

I thought FullGC would decrease Tenured Space, and some space of Metaspace. 😯

Java HotSpot(TM) 64-Bit Server VM (25.211-b12) for linux-amd64 JRE (1.8.0_211-b12), built on Apr  1 2019 20:39:34 by "java_re" with gcc 7.3.0
Memory: 4k page, physical 4194304k(3951532k free), swap 0k(0k free)
CommandLine flags: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/flink-jvm -XX:InitialHeapSize=2899312640 -XX:MaxDirectMemorySize=9223370937343148032 -XX:MaxHeap
Size=2899312640 -XX:NewSize=2147483648 -XX:+PrintFlagsFinal -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops 

# first GC, the Tenured space even increase?!
2.452: [Full GC (Metadata GC Threshold) 2.452: [Tenured: 0K->14532K(735232K), 0.0623627 secs] 167782K->14532K(2622720K), [Metaspace: 20876K->20876K(1067008K)], 0.0624443 secs] [Times: user=0.05 
sys=0.01, real=0.07 secs] 

11.656: [Full GC (System.gc()) 11.656: [Tenured: 14532K->27057K(735232K), 0.1974158 secs] 395065K->27057K(2622720K), [Metaspace: 32454K->32454K(1077248K)], 0.1975190 secs] [Times: user=0.18 sys=
0.02, real=0.20 secs] 

65.942: [Full GC (Metadata GC Threshold) 65.942: [Tenured: 27057K->55025K(735232K), 0.2259224 secs] 512544K->55025K(2622720K), [Metaspace: 53760K->53760K(1097728K)], 0.2262134 secs] [Times: user
=0.18 sys=0.03, real=0.23 secs] 

================================================================= FollowUP thanks to Stephen's answer, I set the initialSize of MetaSpace by -XX:MetaspaceSize=134217728. I start the job twice, second time with this jvm param. had no FullGC because of Metadata GC Threshold.

######  之前: first time
bash-4.2$ tailf gc_log_2020-08-26_07-19-05_pid84.log 
Java HotSpot(TM) 64-Bit Server VM (25.241-b26) for linux-amd64 JRE (1.8.0_241-b26), built on Jan  6 2020 08:59:51 by "java_re" with gcc 7.3.0
Memory: 4k page, physical 4194304k(3920180k free), swap 0k(0k free)
CommandLine flags: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/flink-jvm -XX:InitialHeapSize=2899312640 -XX:MaxDirectMemorySize=9223370937343148032 -XX:MaxHeapSize=2899312640 -XX:NewSize=2147483648 -XX:+PrintFlagsFinal -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops 
2.359: [Full GC (Metadata GC Threshold) 2.359: [Tenured: 0K->14643K(735232K), 0.0526437 secs] 167782K->14643K(2622720K), [Metaspace: 20937K->20937K(1067008K)], 0.0527307 secs] [Times: user=0.04 sys=0.01, real=0.05 secs] 
14.576: [Full GC (System.gc()) 14.576: [Tenured: 14643K->27136K(735232K), 0.1894478 secs] 417611K->27136K(2622720K), [Metaspace: 32274K->32274K(1077248K)], 0.1895372 secs] [Times: user=0.15 sys=0.02, real=0.19 secs] 

######  添加了metadata参数之后(-XX:MetaspaceSize=134217728), second time with param
bash-4.2$ tailf gc_log_2020-08-26_07-23-19_pid85.log 
Java HotSpot(TM) 64-Bit Server VM (25.241-b26) for linux-amd64 JRE (1.8.0_241-b26), built on Jan  6 2020 08:59:51 by "java_re" with gcc 7.3.0
Memory: 4k page, physical 4194304k(3920136k free), swap 0k(0k free)
CommandLine flags: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/flink-jvm -XX:InitialHeapSize=2899312640 -XX:MaxDirectMemorySize=9223370937343148032 -XX:MaxHeapSize=2899312640 -XX:MetaspaceSize=134217728 -XX:NewSize=2147483648 -XX:+PrintFlagsFinal -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops 
5.196: [Full GC (System.gc()) 5.196: [Tenured: 0K->27063K(735232K), 0.2379232 secs] 973363K->27063K(2622720K), [Metaspace: 32544K->32544K(1077248K)], 0.2380344 secs] [Times: user=0.25 sys=0.03, real=0.24 secs] 

Solution

  • Why my JVM did 3 times of FullGC when it started?

    The first and third Full GCs were triggered by metaspace hitting its current threshold and needing to grow. Since metaspace is primarily used to hold code-related things, this implies that Flink is loading a lot of code or generating a lot of dynamic proxies or something like that.

    You might consider increasing the initial metaspace size.

    The second Full GC was triggered by an explicit System.gc() call. My guess is that Flink is doing that.

    I thought FullGC would decrease Tenured space.

    Not necessarily. If the Full GC was triggered by metaspace filling, then you could well find that a lot of objects in new space were tenured. In the first case, it is clear that that has happened, since prior to the GC, tenured space was empty.

    ... while metaspace don't change, and even increase between GCs

    That means that the GC is not finding any garbage in metaspace. I would only expect metaspace to usage to decrease if a ClassLoader instance (and all of its classes, and all of their instances) became unreachable. It is expected that metaspace usage will increase during startup. It is only potentially concerning if it continues increasing after the application has started and the JVM has full warmed up.