sets() has potentially high redraw count from underlying strategies #4458

@qwekjh

Description

I've been exploring the behavior of the sets() strategy (and, similarly, lists(..., unique=True)) and have observed that the elements strategy often redraws values that have already been generated and included within the current example.

While I understand that sets() correctly ensures the uniqueness of the elements in the resulting collection, this repeated redrawing of duplicate values from the underlying elements strategy can become a real performance problem.

The cost becomes particularly apparent when the elements strategy is itself computationally expensive. In my case, a custom strategy takes approximately 50ms to generate a single value; combined with the high number of redraws, this quickly triggers the HealthCheck.too_slow health check.
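
For completeness: the only stopgap I've found so far is suppressing that health check, which hides the symptom rather than reducing the redraws. A minimal sketch, where expensive_values is a hypothetical stand-in for my slow strategy:

from hypothesis import HealthCheck, given, settings
import hypothesis.strategies as st

expensive_values = st.integers  # hypothetical stand-in for a ~50ms-per-value strategy

@settings(suppress_health_check=[HealthCheck.too_slow])
@given(x=st.sets(expensive_values(), min_size=3, max_size=3))
def test_with_suppressed_health_check(x: set[int]):
    assert len(x) == 3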

It's worth noting that the standard strategies are not uniform random generators: they deliberately bias toward simple values (small integers, for example), so duplicates are especially likely when such a strategy feeds sets().
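
To illustrate, even a plain integers() source repeats values frequently across draws. A quick, unscientific check (the Counter is my own scaffolding, not Hypothesis API; simple values such as 0 tend to recur):

from collections import Counter
from hypothesis import given, settings
import hypothesis.strategies as st

counts = Counter()

@settings(max_examples=200)
@given(st.integers())
def sample(x):
    counts[x] += 1

sample()  # executes 200 generated examples
print(counts.most_common(5))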

I've put together a minimal reproducible example to illustrate this behavior (main.py):

from hypothesis import given
import hypothesis.strategies as st

redraws = 0
seen = set()  # Tracks duplicate draws while generating a single example; reset manually in the test body below.

@st.composite
def composite_integers(draw):
    global redraws
    
    value = draw(st.integers()) # Or any slower strategy for that matter
    if value in seen:
        redraws += 1
    seen.add(value)
    print(f"Redraws: {redraws}, Current Value: {value}, Seen: {len(seen)}") # Added more context to print
    return value

@given(x=st.sets(composite_integers(), min_size=3, max_size=3))
def test_strategy(x: set[int]):
    # Reset the counters so each generated example is measured separately.
    global redraws
    global seen
    print("new test run")
    redraws = 0
    seen = set()
    assert True

Running this example with pytest --hypothesis-show-statistics --hypothesis-seed=1000 -s main.py illustrates the behavior (other seeds showed similar trends). The first example generated 9 unique values but involved 92 redraws. Subsequent examples showed a consistent pattern, typically requiring between 6 and 12 redraws to draw 6 unique values. Across all test runs, the cumulative redraw count reached 831.

The statistics output is as follows:

============================ Hypothesis Statistics =============================

main.py::test_strategy:

  - during generate phase (3.34 seconds):
    - Typical runtimes: < 1ms, of which < 1ms in data generation
    - 100 passing examples, 0 failing examples, 348 invalid examples
    - Events:
      * 73.88%, Retried draw from <hypothesis.strategies._internal.core.CompositeStrategy object at 0x7fa6a00cfd10>.filter(not_yet_in_unique_list) to satisfy filter
      * 1.34%, invalid because: Aborted test because unable to satisfy <hypothesis.strategies._internal.core.CompositeStrategy object at 0x7fa6a00cfd10>.filter(not_yet_in_unique_list)

  - Stopped because settings.max_examples=100


============================== 1 passed in 3.46s ===============================

This was tested on WSL (Debian Bookworm) under Windows, with Python 3.12.10 and Hypothesis 6.135.17.

I'm hoping to gain some insight into whether there are recommended approaches or existing mechanisms to mitigate the performance impact of these redraws. Specifically, I'm curious about:

  1. Are there recommended patterns or alternative ways to structure strategies when sets() is consuming values from a slow or frequently-duplicating source, to reduce the overall generation time? (My current best workaround is sketched after this list.)

  2. Are there any future plans or existing internal mechanisms within Hypothesis to make the generation of unique elements more efficient, particularly when dealing with costly element strategies? Could sets potentially provide more direct feedback or guidance to its element strategies to reduce the generation of known duplicates?

  3. If this is an inherent characteristic of how sets works given Hypothesis's design principles, what would be the ideal way to handle this situation for users utilizing more computationally expensive strategies?
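
For question 1, the best workaround I've found so far is to enforce uniqueness on a cheap surrogate key and run the expensive construction once per unique key. A sketch, where make_expensive_value is a hypothetical stand-in for my slow construction (it must map distinct keys to distinct values, or the resulting set can be smaller than requested):

from hypothesis import given
import hypothesis.strategies as st

def make_expensive_value(key: int) -> int:
    # Hypothetical stand-in for the ~50ms construction; injective in its key.
    return key * 2

@st.composite
def unique_expensive_sets(draw, min_size=3, max_size=3):
    # Uniqueness is resolved on cheap integer keys, so redraws cost almost
    # nothing; the expensive construction runs once per unique key.
    keys = draw(st.sets(st.integers(), min_size=min_size, max_size=max_size))
    return {make_expensive_value(k) for k in keys}

@given(x=unique_expensive_sets())
def test_unique_without_redraw_cost(x: set[int]):
    assert len(x) == 3

This moves the rejection sampling onto a strategy where redraws are cheap, but it doesn't answer the underlying question about sets() itself.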

Any guidance or deeper understanding of this behavior would be greatly appreciated.
