
With `hypothesis`, how to generate two values that satisfy an ordering relation?


When writing tests with Hypothesis, I occasionally need two values that satisfy a given relation. Think of the start and end of an interval, where start <= end is required.

A simple example of what I would like to achieve is:

import datetime as dt

from hypothesis import given
from hypothesis import strategies as st


@given(
    valid_from=st.dates(),
    valid_to=st.dates(),
)
def test_dates_are_ordered(valid_from: dt.date, valid_to: dt.date):
    assert valid_from <= valid_to

I like that this test is very easy to read and to the point. However, it does not pass, because Hypothesis does not know about the restriction. Is there a good way to have two parameters but still ensure the values are ordered?


Solution

  • To my knowledge, there are only workarounds.

    One workaround is to use assume, which lets you reject unsuitable examples. However, the invalid examples are still part of the search space: they are generated and then discarded, which can slow down sampling.

    import datetime as dt
    
    from hypothesis import assume, given
    from hypothesis import strategies as st
    
    
    @given(
        valid_from=st.dates(),
        valid_to=st.dates(),
    )
    def test_values_are_in_order(
        valid_from: dt.date, valid_to: dt.date
    ) -> None:
        assume(valid_from <= valid_to)
        assert valid_from <= valid_to
    

    Another option is to sample two unconstrained values and sort them. However, this complicates the code quite a bit: the strategy becomes much more involved, and the values have to be unpacked inside the test.

    import datetime as dt
    import typing as t
    
    from hypothesis import given
    from hypothesis import strategies as st
    
    
    @given(
        valid_from_to=st.lists(st.dates(), min_size=2, max_size=2)
        .map(t.cast(t.Callable[[list[dt.date]], list[dt.date]], sorted))
        .map(t.cast(t.Callable[[list[dt.date]], tuple[dt.date, ...]], tuple))
    )
    def test_values_are_in_order(
        valid_from_to: tuple[dt.date, dt.date]
    ) -> None:
        valid_from, valid_to = valid_from_to
        assert valid_from <= valid_to
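
If the sorting boilerplate is the main concern, it can be moved into a small reusable strategy. The following is only a sketch: `ordered_pairs` is a hypothetical helper name, not part of Hypothesis, and it assumes the drawn values support `<=`. The test still has to unpack the pair.

```python
import datetime as dt

from hypothesis import given
from hypothesis import strategies as st


@st.composite
def ordered_pairs(draw, elements):
    """Hypothetical helper: draw two values and return them in ascending order."""
    first = draw(elements)
    second = draw(elements)
    # Swap if necessary, so that the smaller value always comes first.
    return (first, second) if first <= second else (second, first)


@given(valid_from_to=ordered_pairs(st.dates()))
def test_values_are_in_order(
    valid_from_to: tuple[dt.date, dt.date]
) -> None:
    valid_from, valid_to = valid_from_to
    assert valid_from <= valid_to
```

Because the ordering lives in the strategy, the same helper could be reused with other element strategies, for example `ordered_pairs(st.integers())`.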
    

    Last, and presumably also least, one could use a delta approach: sample a start value and a delta, then compute the end value from both. However, this requires that a notion of a delta and of "adding" exists for the type. It can also produce invalid values, because whether start + delta is valid is much harder to tell in advance than when the end value is sampled directly. The simple example below already shows this.

    import datetime as dt
    
    from hypothesis import given
    from hypothesis import strategies as st
    
    
    @given(
        valid_from=st.dates(),
        delta=st.timedeltas(min_value=dt.timedelta(1)),
    )
    def test_values_are_in_order(valid_from: dt.date, delta: dt.timedelta):
        # This test can raise an `OverflowError`, because `valid_from + delta`
        # may fall outside the range that `dt.date` supports (beyond
        # `dt.date.max`).
        valid_to = valid_from + delta
        assert valid_from <= valid_to
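
For dates specifically, the overflow in the delta example can be sidestepped by drawing the end value directly with a lower bound instead of adding a delta. The sketch below uses `st.data()` to draw `valid_to` inside the test once `valid_from` is known. It assumes the element strategy accepts a bound such as `min_value`, which `st.dates()` does, so it is not a general answer for arbitrary types.

```python
import datetime as dt

from hypothesis import given
from hypothesis import strategies as st


@given(valid_from=st.dates(), data=st.data())
def test_values_are_in_order(valid_from: dt.date, data) -> None:
    # `data` is the interactive draw object provided by `st.data()`.
    # Draw the end date only after the start date is known, so the lower
    # bound guarantees `valid_from <= valid_to` without any overflow.
    valid_to = data.draw(st.dates(min_value=valid_from), label="valid_to")
    assert valid_from <= valid_to
```

The test keeps two separately named values, at the cost of an extra `data` parameter and a draw inside the test body.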