I am running some experiments using the Java Microbenchmark Harness (JMH), and in particular I am using its GC profiler. I have three questions about interpreting the results of the GC profiler.
Here is my benchmark:
@Benchmark
public void benchmarkPiecewiseDNFModel(Blackhole blackhole) {
// Run the method and consume the output to avoid dead code elimination
PiecewiseCausalModel model = creator.PiecewiseDNFModel(users);
blackhole.consume(model);
}
The method within the benchmark creates a large collection of Boolean formulas in disjunctive normal form. For a single run of this benchmark, with one warmup, I obtained the following results:
Benchmark (users) Mode Cnt Score Error Units
BenchmarkModelCreator.benchmarkPiecewiseDNFModel 650 ss 77.664 s/op
BenchmarkModelCreator.benchmarkPiecewiseDNFModel:gc.alloc.rate 650 ss 809.123 MB/sec
BenchmarkModelCreator.benchmarkPiecewiseDNFModel:gc.alloc.rate.norm 650 ss 65892091888.000 B/op
BenchmarkModelCreator.benchmarkPiecewiseDNFModel:gc.count 650 ss 61.000 counts
BenchmarkModelCreator.benchmarkPiecewiseDNFModel:gc.time 650 ss 332.000 ms
I am trying to interpret these results in the context of a single measurement iteration for this benchmark. I can't seem to find a clear answer to the following questions:
(1) What, precisely, is the GC profiler measuring, especially in gc.alloc.rate.norm? The 65 GB being reported is a cumulative number, so it seems as though the GC profiler is summing up the memory allocated just before each garbage collection event. Is that accurate? Or is it the average amount of memory allocated per second over the course of my operation?
(2) How, precisely, is an operation defined? In particular, is an operation a single invocation of the method within my benchmark (creator.PiecewiseDNFModel)? Or is an operation defined in a more fine-grained way, relating to different parts of PiecewiseDNFModel itself?
(3) Is it not somewhat misleading to use the GC profiler to measure the memory footprint of a piece of code, given its cumulative method of measurement? For example, if I have an operation that happens to take a long time but only ever uses, say, 100 MB at any given moment, the GC profiler will report a large number for gc.alloc.rate.norm, even though the actual burden of the process on my machine is very low. It would seem that a profiler which reported the maximum memory in use at any one time would be more appropriate. If that is right (I am open to objections), can anyone suggest a more appropriate method of profiling?
Thanks in advance for any clarification that can be provided on these three points.
The answer to your second question is that an 'operation' is each invocation of the @Benchmark method (unless you override that by annotating the method with @OperationsPerInvocation), so yes, in your case, it is the number of calls to creator.PiecewiseDNFModel.
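For illustration, here is a minimal sketch of how @OperationsPerInvocation changes that definition (the method name benchmarkTenModels is hypothetical; the loop count must match the annotation value). JMH divides the measured time and allocation by the declared count, so each loop iteration below counts as one operation rather than each call to the benchmark method:

@Benchmark
@OperationsPerInvocation(10)
public void benchmarkTenModels(Blackhole blackhole) {
    // One invocation of this method is reported as 10 operations, so the
    // per-op metrics reflect a single creator.PiecewiseDNFModel call.
    for (int i = 0; i < 10; i++) {
        blackhole.consume(creator.PiecewiseDNFModel(users));
    }
}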
As to your first question, gc.alloc.rate is, as reported, the number of MB per second that your method allocates. gc.alloc.rate.norm is the number of bytes allocated per 'operation' (as defined above). As far as I understand, both are calculated from the same underlying allocation total; only the denominator differs (elapsed time versus number of operations).
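You can sanity-check that against the figures from your run. The sketch below assumes that "MB" means 1024 * 1024 bytes and that the measured iteration time is the denominator; it is an illustration of the relationship, not JMH's exact formula:

public class AllocRateCheck {
    public static void main(String[] args) {
        // Figures taken from the run reported in the question.
        double bytesPerOp   = 65_892_091_888.0; // gc.alloc.rate.norm
        double secondsPerOp = 77.664;           // the single-shot score
        double mbPerSec = bytesPerOp / secondsPerOp / (1024 * 1024);
        System.out.println(mbPerSec);           // ~809.1, matching the reported gc.alloc.rate
    }
}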
As to your third question...
For an object, we can think of it as existing in one of two states: in-use and garbage. An object in-use is one which is reachable and which might be used by the program in the future. An object is garbage if it is not reachable but hasn't yet been collected. (There is a third state, leaked: an object has leaked if it is reachable, will not become unreachable, but will never be used again; that isn't necessary for this discussion.)
Until we run a garbage collection, we have no way of knowing whether an object is garbage or not. (In Java, as far as I know, there is no way to run the reachability analysis without then performing the collection.) So JMH cannot distinguish code (c1) that allocates 50 MB during an operation, all of which is in use until the operation exits, from code (c2) that allocates 10 MB, immediately makes it unreachable, and does this 5 times per operation. In both cases the operation allocated 50 MB, and that is all the JMH profiler is aware of.
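A minimal sketch of those two patterns (hypothetical methods, with sizes chosen to match the example):

// c1: one 50 MB allocation that stays reachable until the operation returns.
static byte[] c1() {
    return new byte[50 * 1024 * 1024];
}

// c2: five 10 MB allocations, each becoming unreachable as soon as the next one is made.
static long c2() {
    long total = 0;
    for (int i = 0; i < 5; i++) {
        byte[] chunk = new byte[10 * 1024 * 1024];
        total += chunk.length; // chunk is garbage once this iteration ends
    }
    return total;
}

Both bodies allocate roughly 50 MB per call, so gc.alloc.rate.norm reports essentially the same figure for each, even though c2's peak live memory is about a fifth of c1's.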
That said, the primary purpose of something like JMH is to help you compare two (or more) different implementations of the same "thing". Most often it is the relative difference between those scores which is interesting, not the absolute numbers.