Tags: jvm, jvm-hotspot, g1gc, out-of-memory, nmt

Linux OOM-Killer and G1 GC memory consumption


I have a Java application running on Liberica JDK 8 (HotSpot VM, G1 GC) on an Oracle Linux machine with 24 GB RAM. The application has a max heap size of -Xmx15g, utilizes it heavily (due to its load profile), and is the only process with such demands on the server.

From time to time (usually after dozens of hours of uptime), the application gets killed by the Linux oom-killer. To find the root cause, I enabled JVM Native Memory Tracking (NMT) in detail mode, established a baseline soon after warm-up, and gathered the following stats over 34 hours of uptime (right before the process was killed once more): see the NMT results screenshot.
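For reference, this is roughly how NMT was enabled and sampled (the PID and application jar are placeholders; the flag and jcmd subcommands are the standard HotSpot ones):

    # start the JVM with NMT in detail mode (adds some overhead)
    java -XX:NativeMemoryTracking=detail -Xmx15g -jar app.jar

    # record a baseline shortly after warm-up
    jcmd <pid> VM.native_memory baseline

    # later, report per-category growth relative to the baseline
    jcmd <pid> VM.native_memory detail.diff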

According to Oracle's NMT memory categories documentation, I expected the heaviest category, Internal, to be filled with something like direct ByteBuffers. However, the NMT details showed that almost 70% of those 3 GB of Internal memory is composed of allocations like this:

[0x00007faeb7f9b5b5] BitMap::BitMap(unsigned long, bool)+0x1d5
[0x00007faeb82ca08f] OtherRegionsTable::add_reference(void*, int)+0x57f
[0x00007faeb82e0f40] InstanceKlass::oop_oop_iterate_nv(oopDesc*, FilterOutOfRegionClosure*)+0xc0
[0x00007faeb82c3373] HeapRegion::oops_on_card_seq_iterate_careful(MemRegion, FilterOutOfRegionClosure*, signed char*)+0x163
                             (malloc=1154790KB type=Internal +846638KB #577395 +423319)

In total, there were 8 such blocks containing oops_on_card_seq_iterate_careful(…) invocations, differing only in the class that appears on the next stack frame.

Based on these identifiers, I found out that these routines are part of G1 GC. However, I couldn't see a way to influence their behavior (memory consumption) through the G1-related JVM options.

Following a related SO answer, I tried increasing -XX:G1HeapRegionSize from its ergonomically computed 4 MB to a manually set 8 MB, but no significant change was observed.
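For reference, the resulting set of options looked roughly like this (only the heap size and region size come from the experiment above; the rest is an assumed minimal command line):

    java -Xmx15g \
         -XX:+UseG1GC \
         -XX:G1HeapRegionSize=8m \
         -XX:NativeMemoryTracking=detail \
         -jar app.jar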

So the questions are:

  1. Why are purely G1-related activities recorded by NMT in the Internal category (and not GC)?
  2. Is there a way to make G1 consume less native memory? (Switching to another GC is not an option in this case.)

Solution

  • Why are purely G1-related activities recorded by NMT in the Internal category (and not GC)?

    NMT does not know that this is G1-related activity: it does not walk the stack to determine the allocation type; it just uses the type passed to the allocation function.

    As you can see in the stack trace, the allocation happens in the BitMap constructor. This is a general-purpose class used in many places, not just in the GC. The BitMap class has an allocator associated with the mtInternal type:

      ArrayAllocator<bm_word_t, mtInternal> _map_allocator;
    

    In newer JDK versions, BitMap takes a variable allocation type that is passed in from outside.
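    The difference can be illustrated with a simplified, self-contained C++ sketch (this is not the actual HotSpot source, just a model of how an allocation tag can be hard-coded into a class versus passed in by the caller):

      #include <cstdio>
      #include <cstdlib>

      // Simplified stand-ins for HotSpot's NMT memory types.
      enum MemoryType { mtInternal, mtGC };

      // The allocator books memory under whatever type it is given;
      // it never inspects the caller.
      template <typename T>
      T* nmt_alloc(size_t count, MemoryType type) {
          std::printf("NMT books %zu bytes under type %d\n",
                      count * sizeof(T), static_cast<int>(type));
          return static_cast<T*>(std::malloc(count * sizeof(T)));
      }

      // JDK 8 style: the storage type is baked into the class, so every user
      // of the bitmap (including G1's remembered sets) shows up as mtInternal.
      class OldStyleBitMap {
          size_t* _map;
      public:
          explicit OldStyleBitMap(size_t size_in_words)
              : _map(nmt_alloc<size_t>(size_in_words, mtInternal)) {}
          ~OldStyleBitMap() { std::free(_map); }
      };

      // Newer style: the caller supplies the memory type, so GC code can tag
      // its bitmaps accordingly and NMT reports them under the GC category.
      class NewStyleBitMap {
          size_t* _map;
      public:
          NewStyleBitMap(size_t size_in_words, MemoryType type)
              : _map(nmt_alloc<size_t>(size_in_words, type)) {}
          ~NewStyleBitMap() { std::free(_map); }
      };

      int main() {
          OldStyleBitMap a(1024);        // accounted as Internal, regardless of the caller
          NewStyleBitMap b(1024, mtGC);  // accounted as GC, because the caller said so
      }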

  • Is there a way to make G1 consume less native memory?

    Migrate to JDK 17 or newer. G1 GC has received tons of improvements that will never be backported to JDK 8. Tuning GC on JDK 8 is a goldmine for performance consultants, but if you care about your time and resources, upgrading the JDK is the best investment.