What exactly is the difference between AverageTime and SingleShotTime in the Java Microbenchmark Harness (JMH)?

I think I understand the difference between AverageTime and Throughput. Let's say we have a method with the following annotations:

@Benchmark
@Fork(value = 1, warmups = 0)
@Measurement(iterations = 1, time = 1, timeUnit = TimeUnit.SECONDS)

If I use @BenchmarkMode(Mode.AverageTime), JMH will launch one fork (because of @Fork(value = 1)), perform one iteration (because of @Measurement(iterations = 1)), and invoke the method fooBenchmark as many times as it can, completing every invocation in full, over at least one second (because of @Measurement(time = 1, timeUnit = TimeUnit.SECONDS)). JMH refers to a single call as an invocation (see Level.Invocation in the org.openjdk.jmh.annotations package). The elapsed time is divided by the number of invocations, which gives the average time per invocation.

If I use @BenchmarkMode(Mode.Throughput), we divide the number of completed invocations by the elapsed time. This gives us the throughput.
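
To make the arithmetic concrete, here is a minimal sketch of how the two scores relate to the same raw measurements. It is a simplified single-thread model, not JMH's actual implementation; the function names and the invocation count are made up for illustration.

```kotlin
// Simplified model of how the two scores relate; not JMH's actual implementation.
// All names here are hypothetical and assume a single benchmark thread.

/** Mode.AverageTime: elapsed time divided by the number of completed invocations. */
fun averageTimeNanos(elapsedNanos: Long, invocations: Long): Double =
    elapsedNanos.toDouble() / invocations

/** Mode.Throughput: completed invocations divided by elapsed time, in ops per second. */
fun throughputPerSecond(elapsedNanos: Long, invocations: Long): Double =
    invocations.toDouble() / (elapsedNanos / 1_000_000_000.0)

fun main() {
    val elapsedNanos = 1_000_000_000L // one iteration that ran for 1 s
    val invocations = 4_000_000L      // made-up number of completed invocations
    println(averageTimeNanos(elapsedNanos, invocations))
    println(throughputPerSecond(elapsedNanos, invocations))
}
```

With these made-up numbers, the average time is 250 ns per invocation and the throughput is 4,000,000 invocations per second: the two modes report the same measurement from opposite directions.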

Let's say we have a benchmark class with a state Foo:

@State(Scope.Benchmark)
internal open class FooBenchmark {
    class Foo {
        private val values = mutableListOf<String>()

        fun add() = values.add(values.size.toString())

        fun size() = values.size
    }

    private val foo = Foo()

    @Benchmark
    @Fork(value = 1, warmups = 0)
    @Measurement(iterations = 1, time = 1, timeUnit = TimeUnit.SECONDS)
    @BenchmarkMode(Mode.SingleShotTime)
    fun fooBenchmark(hole: Blackhole) {
        check(foo.size() == 0)
        foo.add()
        check(foo.size() == 1)
        hole.consume(foo)
    }
}

When I use @BenchmarkMode(Mode.SingleShotTime), I expect JMH to make exactly one invocation and measure the time of that one invocation. Instead, JMH calls the method fooBenchmark several times: 1 fork, 1 iteration, but several invocations. It also uses the same instance of the FooBenchmark class from different threads, which leads to a conflict when accessing the foo state.

This is just an example that shows multiple calls are definitely happening. I do NOT need to access data from different threads.

Am I misunderstanding how the JMH modes work, or did I configure the SingleShotTime mode incorrectly?

This is what JMH produces when running the task:

> Task :lib:runBenchmark
# Detecting actual CPU count: 8 detected
# JMH version: 1.37
# VM version: JDK 19.0.1, OpenJDK 64-Bit Server VM, 19.0.1+10-21
# VM invoker: /opt/jdk-19.0.1/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: <none>
# Measurement: 1 iterations, 1 s each
# Timeout: 10000 ms per iteration
# Threads: 8 threads
# Benchmark mode: Single shot invocation time
# Benchmark: org.kepocnhh.jmh.FooBenchmark.fooBenchmark

Solution

TL;DR: You've configured JMH to execute the benchmark with more than one thread. Remove the -t=max argument from the command line.

You're right that SingleShotTime will have each thread execute the benchmark method only once per iteration. All other currently available modes are time-based; they will have each thread execute the benchmark method as many times as possible within the configured amount of time.

Your problem, based on your output, is the number of threads. From your comment, you're configuring that via command line arguments with -t=max. A value of max means to use as many threads as there are processors. In your case that's 8 threads. That's how many threads execute the benchmark method concurrently in a single iteration.

You probably only want one thread. That's the default value, so you should be fine just removing the -t argument. You can also configure the number of threads with the @Threads annotation. Note that command-line arguments take precedence over annotations.
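
As a rough sketch of the annotation route, the benchmark from the question could pin the thread count explicitly. The class name and the @Warmup(iterations = 0) line are assumptions for illustration (your output shows no warmup, but not how you configured that). Keep in mind that this only takes effect once -t is no longer passed on the command line.

```kotlin
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.BenchmarkMode
import org.openjdk.jmh.annotations.Fork
import org.openjdk.jmh.annotations.Measurement
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import org.openjdk.jmh.annotations.Threads
import org.openjdk.jmh.annotations.Warmup
import org.openjdk.jmh.infra.Blackhole

// Hypothetical class name; the body mirrors the benchmark from the question.
@State(Scope.Benchmark)
open class SingleThreadFooBenchmark {
    private val values = mutableListOf<String>()

    @Benchmark
    @Threads(1)                      // one thread invokes the method (also the default)
    @Fork(value = 1, warmups = 0)    // one measurement fork, no warmup forks
    @Warmup(iterations = 0)          // assumption: no warmup iterations, as in the output
    @Measurement(iterations = 1)     // one measurement iteration
    @BenchmarkMode(Mode.SingleShotTime)
    fun fooBenchmark(hole: Blackhole) {
        check(values.isEmpty())      // holds only if this is the first invocation
        values.add(values.size.toString())
        check(values.size == 1)
        hole.consume(values)
    }
}
```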

Additional Info - How JMH Executes a Benchmark

Since your general understanding of how JMH executes a benchmark seems incomplete, here is what a typical JMH run looks like:

Benchmark
- Mode and other params
  - Warmup fork (if any)
    - Warmup iteration (if any)
      - Invoke method (operation)
    - Repeat "warmup iteration" as many times as configured
    - Measurement iteration
      - Invoke method (operation)
    - Repeat "measurement iteration" as many times as configured
  - Repeat "warmup fork" as many times as configured
  - Measurement fork
    - Warmup iteration (if any)
      - Invoke method (operation)
    - Repeat "warmup iteration" as many times as configured
    - Measurement iteration
      - Invoke method (operation)
    - Repeat "measurement iteration" as many times as configured
  - Repeat "measurement fork" as many times as configured
- Repeat "mode and other params" for all combinations of modes and params
Repeat "benchmark" for each selected benchmark

Some configurations may change that flow (e.g., "warmup mode", groups, etc.).

When the mode is SingleShotTime, each thread invokes the benchmark method once per iteration. For all other modes, the method is invoked as many times as possible within the configured duration. Note that the configured time is a "minimum": the last operation of an iteration may cause the iteration to run longer than the configured duration (this is expected behavior). If an iteration takes too long, it will be interrupted after a configurable timeout.
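
The difference between the two kinds of iteration can be modeled with a short sketch. This is a conceptual model only: JMH's real measurement loops are generated code and differ in detail, and the function names below are hypothetical.

```kotlin
// Conceptual model only: JMH's real measurement loops are generated code
// and differ in detail. Function names are hypothetical.

/** SingleShotTime: one invocation per iteration, timed on its own. Returns elapsed ns. */
fun singleShotIteration(operation: () -> Unit): Long {
    val start = System.nanoTime()
    operation()
    return System.nanoTime() - start
}

/**
 * Time-based modes: keep invoking until the configured duration has passed.
 * The elapsed time is only checked between invocations, so the last invocation
 * can push the iteration past the configured duration.
 * Returns (number of invocations, elapsed ns).
 */
fun timedIteration(minDurationNanos: Long, operation: () -> Unit): Pair<Long, Long> {
    val start = System.nanoTime()
    var invocations = 0L
    do {
        operation()
        invocations++
    } while (System.nanoTime() - start < minDurationNanos)
    return invocations to (System.nanoTime() - start)
}

fun main() {
    var calls = 0
    singleShotIteration { calls++ }
    println("single shot calls: $calls")

    val (invocations, elapsed) = timedIteration(1_000_000L) { calls++ }
    println("timed invocations: $invocations, elapsed ns: $elapsed")
}
```

In this model the single-shot iteration always performs exactly one call, while the timed iteration performs at least one call and always runs for at least the requested duration.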

During an iteration, a benchmark method will be invoked by one or more threads. By default, it will be one thread. If there are multiple threads, they will invoke the benchmark method concurrently. The number of threads can be configured via the @Threads annotation.

The number of forks can be configured via the @Fork annotation. The value element controls how many measurement forks there will be. The warmups element controls how many warmup forks there will be. All results from a warmup fork, including the results of its measurement iterations, will be ignored.

The number of warmup iterations and their duration can be configured via the @Warmup annotation. The results of warmup iterations are ignored.

The number of measurement iterations and their duration can be configured via the @Measurement annotation.
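
Putting these annotations together, a time-based benchmark might be configured as in the following sketch. The class, the method, and every number in it are illustrative assumptions, not recommendations.

```kotlin
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.BenchmarkMode
import org.openjdk.jmh.annotations.Fork
import org.openjdk.jmh.annotations.Measurement
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import org.openjdk.jmh.annotations.Threads
import org.openjdk.jmh.annotations.Warmup
import org.openjdk.jmh.infra.Blackhole

// Hypothetical benchmark; all values below are chosen only for illustration.
@State(Scope.Benchmark)
open class ConfiguredBenchmark {
    private var input = 42.0

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Threads(1)                                                         // threads per iteration
    @Fork(value = 2, warmups = 1)                                       // 2 measurement forks, 1 warmup fork
    @Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)      // results ignored
    @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) // results reported
    fun squareRoot(hole: Blackhole) {
        hole.consume(Math.sqrt(input))
    }
}
```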

All these things can be configured via the command-line or programmatically as well.
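
For the programmatic route, a minimal sketch using JMH's Runner and OptionsBuilder could look like this. It assumes the benchmark class from the question is on the classpath and has been processed by the JMH annotation processor; the values mirror the single-shot setup discussed above. Options set this way take precedence over the annotations, like command-line arguments do.

```kotlin
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.runner.Runner
import org.openjdk.jmh.runner.options.OptionsBuilder

fun main() {
    val options = OptionsBuilder()
        .include("FooBenchmark")     // regex selecting which benchmarks to run
        .mode(Mode.SingleShotTime)   // one invocation per thread per iteration
        .threads(1)                  // a single benchmark thread
        .forks(1)                    // one measurement fork
        .warmupForks(0)              // no warmup forks
        .warmupIterations(0)         // no warmup iterations
        .measurementIterations(1)    // one measurement iteration
        .build()

    Runner(options).run()
}
```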