I am using Google Benchmark to measure the execution time of some code. For example, I wrote the following code to measure its execution time:
#include <benchmark/benchmark.h>
// Alternatively, can add libraries using linker options.
#ifdef _WIN32
#pragma comment ( lib, "Shlwapi.lib" )
#ifdef _DEBUG
#pragma comment ( lib, "benchmarkd.lib" )
#else
#pragma comment ( lib, "benchmark.lib" )
#endif
#endif
static void BenchmarkTestOne(benchmark::State& state) {
    int Sum = 0;
    while (state.KeepRunning())
    {
        for (size_t i = 0; i < 100000; i++)
        {
            Sum += i;
        }
    }
}
static void BenchmarkTestTwo(benchmark::State& state) {
    int Sum = 0;
    while (state.KeepRunning())
    {
        for (size_t i = 0; i < 10000000; i++)
        {
            Sum += i;
        }
    }
}
// Register the function as a benchmark
BENCHMARK(BenchmarkTestOne);
BENCHMARK(BenchmarkTestTwo);
// Run the benchmark
BENCHMARK_MAIN();
When I run the above code, it prints the following results:
Benchmark                  Time             CPU  Iterations
-----------------------------------------------------------
BenchmarkTestOne      271667 ns       272770 ns        2635
BenchmarkTestTwo    27130981 ns     27644231 ns          26
But I couldn't figure out what Iterations means here. Also, why are Time and CPU different from each other?
Google Benchmark tries to benchmark each candidate for a similar amount of time, and/or for long enough to get stable results.
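The benchmark counts how many iterations it actually did, along with the exact time. A benchmark that is much slower per iteration will do far fewer iterations. In your output, BenchmarkTestTwo takes about 100 times longer per iteration than BenchmarkTestOne and runs about 100 times fewer iterations (26 vs. 2635).
The printout shows the (calculated) per-iteration time and the (counted) number of iterations of the benchmark function.
It might actually be a count of calls to state.KeepRunning(), but I don't know that level of detail.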
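To make the relationship between the counted iterations and the calculated per-iteration time concrete, here is a minimal sketch using only the standard library. It is not Google Benchmark's implementation: `RunResult` and `run_for_at_least` are hypothetical names invented for this illustration, and as far as I know the real library picks iteration counts in batches rather than reading the clock after every iteration.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical result type for this sketch; not part of Google Benchmark.
struct RunResult {
    std::int64_t iterations;   // how many times the body ran (counted)
    double ns_per_iteration;   // total elapsed time / iterations (calculated)
};

// Call `body` repeatedly until at least `min_time` has elapsed, then report
// the iteration count and the mean wall-clock time per iteration.
template <typename Body>
RunResult run_for_at_least(std::chrono::nanoseconds min_time, Body body) {
    using clock = std::chrono::steady_clock;
    std::int64_t iterations = 0;
    const auto start = clock::now();
    auto elapsed = clock::duration::zero();
    do {
        body();
        ++iterations;
        elapsed = clock::now() - start;
    } while (elapsed < min_time);
    const double total_ns =
        std::chrono::duration<double, std::nano>(elapsed).count();
    return {iterations, total_ns / static_cast<double>(iterations)};
}
```

With a fixed time budget, a body that takes 100 times longer per call gets roughly 100 times fewer iterations, which matches the pattern in the output above.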
Just FYI, your benchmark loops don't return any result or store it to a volatile after the loop, so a compiler could easily optimize away the loop. Also note that signed overflow is UB in C and C++, and the total here will almost certainly not fit in your int.
(Or clang could still optimize those sum loops into a closed-form formula based on Gauss's n * (n+1) / 2 but avoiding overflow.)
Benchmarking with optimization disabled is useless; don't do it.
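As a rough sketch of the two fixes mentioned above (using the result, and avoiding overflow), the loop body could be written like this. The function names are hypothetical and only the standard library is used; inside a Google Benchmark function you would more typically pass the result to `benchmark::DoNotOptimize` instead of a volatile.

```cpp
#include <cstdint>

// Sum 0 + 1 + ... + (n - 1) in a 64-bit unsigned accumulator.
// For n = 100000 the total is 4999950000, which does not fit in a 32-bit int.
std::uint64_t sum_below(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        sum += i;
    }
    return sum;
}

// Closed form of the same sum (Gauss): n * (n - 1) / 2 for the range [0, n).
// An optimizing compiler may turn the loop above into something like this.
std::uint64_t sum_below_closed_form(std::uint64_t n) {
    return n == 0 ? 0 : n * (n - 1) / 2;
}

// Store the result to a volatile so the compiler cannot discard the
// computation as unused. (Hypothetical helper for this sketch.)
std::uint64_t run_once(std::uint64_t n) {
    volatile std::uint64_t sink = sum_below(n);
    return sink;
}
```

Note that the volatile store only forces the result to be produced; the compiler is still free to compute it with the closed form rather than by looping, so a benchmark of this loop may not measure what you expect.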