I have a custom strategy built using composite that draws from the text strategy internally.
While debugging another error (FailedHealthCheck.data_too_large), I realized that drawing from the text strategy can cause my composite strategy to be invoked roughly twice as often as expected.
I was able to reproduce this with the following minimal example:

    import hypothesis.strategies
    from hypothesis import given

    @hypothesis.strategies.composite
    def my_custom_strategy(draw, n):
        """Strategy to generate lists of N strings"""
        trace("a")  # trace() is a call-counting helper not shown in this post
        value = [draw(hypothesis.strategies.text(max_size=256)) for _ in range(n)]
        trace("b")  # only reached if every draw above completed
        return value

    @given(my_custom_strategy(100))
    def test_my_custom_strategy(value):
        assert len(value) == 100
        assert all(isinstance(v, str) for v in value)

In this scenario, trace("a") was invoked 206 times, whereas trace("b") was only invoked 100 times. These numbers are consistent across runs.
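The trace() helper is not shown in the post. As a minimal sketch of how such counts could be collected, assuming a hypothetical Counter-based implementation (the names trace_counts and unfinished_calls are illustrative, not from the original):

```python
from collections import Counter

# Hypothetical helper: the original post does not show how trace() is defined.
trace_counts = Counter()


def trace(label):
    """Record one invocation for the given label."""
    trace_counts[label] += 1


def unfinished_calls():
    """Number of times the strategy body passed trace("a") but never reached trace("b")."""
    return trace_counts["a"] - trace_counts["b"]
```

After the test has run, printing trace_counts and unfinished_calls() shows how many times the strategy body was entered without finishing, presumably because a draw was aborted partway through.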
More problematic, the gap grows super-linearly with the number of calls to text(). With n=200, trace("a") is called 305 times. With n=400, 984 times. With n=500 or greater, the test reliably pauses and then completes after the 11th iteration (only 11 iterations, instead of 100!).
What's happening here?
I suspect it's because you're running into the maximum entropy (about 8K) used to generate Hypothesis examples, if some of the strings you generate happen to be quite long. Setting a reasonable max_size in the text strategy would help, if I'm right.
As a more general tip, shrinking can be more efficient if you use the lists() strategy (or another collections strategy) rather than picking an integer and then that many elements. This is not a subtle problem though; if you haven't already noticed you don't need to do anything!
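A minimal sketch of the lists() suggestion, assuming a fixed length is wanted (so min_size and max_size are both set to n). The name fixed_length_strings is hypothetical, and this only illustrates the API shape; it is not a confirmed fix for the extra invocations:

```python
from hypothesis import given, strategies as st


def fixed_length_strings(n, max_chars=256):
    """Strategy for lists of exactly n strings, each at most max_chars long.

    Hypothetical helper: it replaces the composite strategy's manual loop
    with the built-in lists() strategy.
    """
    return st.lists(st.text(max_size=max_chars), min_size=n, max_size=n)


@given(fixed_length_strings(100))
def test_fixed_length_strings(value):
    assert len(value) == 100
    assert all(isinstance(v, str) for v in value)
```

If the total amount of generated data is the limiting factor, lowering max_chars or n should reduce it in the same way as lowering max_size in the original strategy.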