Recently I learned about the GENERATE
macro in Catch2 (from this video). And now I am curious about how it works internally.
Naively one would think that for a test case with k
generators (by a generator I mean one GENERATE
call site), Catch2 just runs each test case n1 * n2 * ... * nk
times, where ni
is the number of elements in the i
-th generator, each time specifying a different combination of values from those k
generators. Indeed, this naive specification seems to hold for a simple test case:
TEST_CASE("Naive") {
auto x = GENERATE(0, 1);
auto y = GENERATE(2, 3);
std::cout << "x = " << x << ", y = " << y << std::endl;
}
As expected, the output is:
x = 0, y = 2
x = 0, y = 3
x = 1, y = 2
x = 1, y = 3
which indicates the test case runs for 2 * 2 == 4
times.
However, it seems that catch isn't implementing it naively, as shown by the following case:
TEST_CASE("Depends on if") {
auto choice = GENERATE(0, 1);
int x = -1, y = -1;
if (choice == 0) {
x = GENERATE(2, 3);
} else {
y = GENERATE(4, 5);
}
std::cout << "choice = " << choice << ", x = " << x << ", y = " << y << std::endl;
}
In the above case, the actual invocation (not callsite) of GENERATE
depends on choice
. If the logic were implemented naively, one would expect there to be 8 lines of output (since 2 * 2 * 2 == 8
):
choice = 0, x = 2, y = -1
choice = 0, x = 2, y = -1
choice = 0, x = 3, y = -1
choice = 0, x = 3, y = -1
choice = 1, x = -1, y = 4
choice = 1, x = -1, y = 4
choice = 1, x = -1, y = 5
choice = 1, x = -1, y = 5
Notice the duplicate lines: the naive permutation still permutes the value of a generator even if it is not actually invoked. For example, y = GENERATE(4, 5)
is only invoked if choice == 1
, however, even when choice != 1
, the implementation still permutes the values 4 and 5, even if those are not used.
The actual output, though, is:
choice = 0, x = 2, y = -1
choice = 0, x = 3, y = -1
choice = 1, x = -1, y = 4
choice = 1, x = -1, y = 5
No duplicate lines. This leads me to suspect that Catch internally uses a stack to track the generators invoked and the order of their latest invocation. Each time a test case finishes one iteration, it traverses the invoked genrators in the reverse order, and advances each generator's value. If such advancement fails (i.e. the sequence of values inside the generator finishes), that generator is reset to its initial state (i.e. ready to emit the first value in sequence); otherwise (the advancement succeeded), the traversal bails out.
In psuedocode it would look like:
for each generator that is invoked in reverse order of latest invocation:
bool success = generator.moveNext();
if success: break;
generator.reset();
This explains the previous cases perfectly. But it does not explain this (rather obscure) one:
TEST_CASE("Non structured generators") {
int x = -1, y = -1;
for (int i = 0; i <= 1; ++i) {
x = GENERATE(0, 1);
if (i == 1) break;
y = GENERATE(2, 3);
}
std::cout << x << "," << y << std::endl;
}
One would expect this to run 4 == 2 * 2
times, and the output being:
x = 0, y = 2
x = 1, y = 2
x = 0, y = 3
x = 1, y = 3
(The x
changes before y
since x = GENERATE(0, 1)
is the last generator invoked)
However, this is not what catch actually does, this is what happens in reality:
x = 0, y = 2
x = 1, y = 2
x = 0, y = 3
x = 1, y = 3
x = 0, y = 2
x = 1, y = 2
x = 0, y = 3
x = 1, y = 3
8 lines of output, which is the first four lines repeated twice.
So my question is, how exactly is GENERATE
in Catch2 implemented? I am not looking particularly for detailed code, but a high-level description that could explain what I have seen in the previous examples.
Regarding the final example, let's just unroll the loop first.
TEST_CASE("Non structured generators") {
int x = -1, y = -1;
for (int i = 0; i <= 1; ++i) {
x = GENERATE(0, 1);
if (i == 1) break;
y = GENERATE(2, 3);
}
std::cout << x << "," << y << std::endl;
}
then becomes:
TEST_CASE("Non structured generators") {
int x = -1, y = -1;
x = GENERATE(0, 1);
y = GENERATE(2, 3);
x = GENERATE(0, 1);
std::cout << x << "," << y << std::endl;
}
Now we can clearly see the 2^3 = 8 permutations happening.
Why could we just unroll it? Because the generators are not actually tied to "line of code", nor are they objects instanced in the scope of the GENERATE
macro invocation.
Instead every time you use the GENERATE
macro, it's actually just alike to a hidden SECTION
group pushed onto a stack stored outside the test execution. So there is not one generator per line of code, but rather actually one hidden index on a stack per call.
Incrementing happens after the test execution finishes. All generators / option groups which have exhausted their options are cleared from the stack. The inner-most surviving generator is then permitted to advance one element for the next test execution.
Generators already popped from the stack re-register when invoked again.
In pseudocode:
after test execution:
while !stack.empty():
if stack.top().atLastElement():
stack.pop();
else:
stack.top().nextElement();
break;
if stack.empty():
testDone();
The SECTION
macro actually shares the very same stack - except that the options are not registering as a running index into the generators options, but rather as a set of unique section names within each scope (not the actual implementation, but easier to think of it this way).
The same logic applies though - once each and every SECTION
within a single scope reports "I'm already in that set", it also sets a atLastElement
flag and thereby results in leaving the group of sections.
So likewise with SECTION
- they are not identified by line of code or alike either, but only by their position in the stack option groups and then matched by name only.
Introducing variation into the control flow other than with GENERATE
or SECTION
macros can result in highly unexpected behavior due to these details.