I suspect a slow memory leak caused by JNI code. I'm seeing the GC Roots # in the summary of consecutive heap dumps climbing. After two hours it showed 470,000 GC roots, after six hours, almost a million more GC roots, after 33 hours over 7 million GC roots.
However, when I look at the Head Dump that says there are 7 million GC roots, and I choose the "Objects" view and the "GC Roots" preset, I select Aggregation "Types". This list shows a total count of less than 15,000 objects:
Where are those 7 million roots?
Different GC roots can reffer to the same object instance. This explains the difference, since count from "GC root" view display number of unique instances. You can find more details using OQL. First let's display number of GC roots for particular GC type:
printHistogram()
function printHistogram() {
var roots = heap.roots()
var histoMap = new Array();
var result = new Array();
var cnt = 0;
while (roots.hasMoreElements()) {
var root = roots.nextElement();
var type = root.type;
if (histoMap[type] == undefined) {
histoMap[type] = 1;
} else {
histoMap[type]++;
}
}
for (var key in histoMap){
if (histoMap.hasOwnProperty(key)) {
result[cnt++] = { key: key, count: histoMap[key] };
}
return map(sort(result, "rhs.count - lhs.count"), '"Root count: "+it.count+" for type: "+it.key');
}
Running this query against your heap dump produces:
Root count: 12878043 for type: JNI local
Root count: 7858 for type: JNI global
Root count: 3599 for type: sticky class
Root count: 1631 for type: Java frame
Root count: 146 for type: thread object
Root count: 7 for type: monitor used
We can see that the majority of GC roots are of "JNI local" type. Let us see how many "JNI local" roots point to the same object instance. We can modify above query to:
printHistogram()
function printHistogram() {
var roots = heap.roots()
var histoMap = new Array();
var rootMap = new Array();
var result = new Array();
var cnt = 0;
while (roots.hasMoreElements()) {
var root = roots.nextElement();
if (root.type == "JNI local") {
var objid = root.id;
if (histoMap[objid] == undefined) {
histoMap[objid] = 1;
rootMap[objid] = root.referrer;
} else {
histoMap[objid]++;
}
}
}
for (var key in histoMap){
if (histoMap.hasOwnProperty(key)) {
result[cnt++] = { key: rootMap[key], count: histoMap[key] };
}
}
return map(sort(result, "rhs.count - lhs.count"), '"Root count: "+it.count+" for object: "+toHtml(it.key)');
}
The result is below:
Root count: 6439020 for object: java.lang.String#44429
Root count: 6439020 for object: java.lang.String#55081
Root count: 1 for object: java.nio.DirectByteBuffer#9
Root count: 1 for object: java.util.ArrayList#22281
Root count: 1 for object: java.lang.String#71518
We can see that two strings java.lang.String#44429
and java.lang.String#55081
are responsible for that huge number of GC roots. They have 6,5 million GC roots each.
Those "JNI local" GC root are referenced from MessageDispatherThread-1 (tid=216)
from frame ca.digitalrapids.kayak.jni.KayakNativeWorkerThread.runNative (Native Method)
. See screenshots below: