Conversation

@llucax
Contributor

@llucax llucax commented Nov 29, 2024

Hopefully this finally fixes the flaky hypothesis tests.

  • Remove deprecated uses of pytest-asyncio
  • Use a float for the tolerance in the timer tests
  • Revert "Use less extreme values for min and max timedelta in tests"

The `event_loop` fixture is deprecated and `event_loop_policy` should be
used instead.  The option `asyncio_default_fixture_loop_scope
= "function"` is also added to `pyproject.toml`, since relying on the
default is deprecated as well.

Signed-off-by: Leandro Lucarella <[email protected]>
@llucax llucax requested a review from a team as a code owner November 29, 2024 12:11
@llucax llucax requested a review from shsms November 29, 2024 12:11
@github-actions github-actions bot added part:tests Affects the unit, integration and performance (benchmarks) tests part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) labels Nov 29, 2024
@llucax llucax enabled auto-merge November 29, 2024 12:12
@llucax llucax added this to the v1.4.0 milestone Nov 29, 2024
@llucax llucax self-assigned this Nov 29, 2024
@llucax llucax added the type:bug Something isn't working label Nov 29, 2024
@llucax
Contributor Author

llucax commented Nov 29, 2024

Tests failed again when queuing a PR, so here is another attempt.

@daniel-zullo-frequenz
Copy link
Contributor

> Tests failed again when queuing a PR, so here is another attempt.

Hypothesis usually tells you how to reproduce the error by temporarily adding `@reproduce_failure({PARAMETERS, FOR, THE, TEST, HERE})` as a decorator on the test case. Have you tried that, to be sure the patch solves the issue?

@llucax
Contributor Author

llucax commented Nov 29, 2024

Good tip. I did validate it manually, but only for a previous attempt, and forgot to re-check with the new approach. I will check with the decorator 👍 💯

@llucax llucax disabled auto-merge November 29, 2024 12:29
@llucax
Contributor Author

llucax commented Nov 29, 2024

FYI, this was the failure:

```
______________________ test_policy_skip_missed_and_drift _______________________

    @hypothesis.given(
>       tolerance=st.integers(min_value=0, max_value=_max_timedelta_microseconds),
        **_calculate_next_tick_time_args,
    )

tests/test_timer.py:148:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

tolerance = 171726190479152817, now = 171726190479152817
scheduled_tick_time = -1, interval = 1

    @hypothesis.given(
        tolerance=st.integers(min_value=0, max_value=_max_timedelta_microseconds),
        **_calculate_next_tick_time_args,
    )
    def test_policy_skip_missed_and_drift(
        tolerance: int, now: int, scheduled_tick_time: int, interval: int
    ) -> None:
        """Test the SkipMissedAndDrift policy."""
        hypothesis.assume(now >= scheduled_tick_time)

        next_tick_time = SkipMissedAndDrift(
            delay_tolerance=timedelta(microseconds=tolerance)
        ).calculate_next_tick_time(
            now=now, interval=interval, scheduled_tick_time=scheduled_tick_time
        )
        if tolerance < interval:
            assert next_tick_time > now
        drift = now - scheduled_tick_time
        if drift > tolerance:
>           assert next_tick_time == now + interval
E           assert 0 == (171726190479152817 + 1)
E           Falsifying example: test_policy_skip_missed_and_drift(
E               tolerance=171_726_190_479_152_817,
E               now=171_726_190_479_152_817,
E               scheduled_tick_time=-1,
E               interval=1,  # or any other generated value
E           )

tests/test_timer.py:166: AssertionError
```

@daniel-zullo-frequenz
Contributor

> FYI, this was the failure

I see that in this case Hypothesis only printed the falsifying example, without a `@reproduce_failure` hint.

When using an `int`, the value goes through a double conversion, first
to `float` and then back to `int`, and due to rounding errors this
produces inconsistencies between the expected and actual values.

This is an example failure:

```
______________________ test_policy_skip_missed_and_drift _______________________

    @hypothesis.given(
>       tolerance=st.integers(min_value=0, max_value=_max_timedelta_microseconds),
        **_calculate_next_tick_time_args,
    )

tests/test_timer.py:148:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

tolerance = 171726190479152817, now = 171726190479152817
scheduled_tick_time = -1, interval = 1

    @hypothesis.given(
        tolerance=st.integers(min_value=0, max_value=_max_timedelta_microseconds),
        **_calculate_next_tick_time_args,
    )
    def test_policy_skip_missed_and_drift(
        tolerance: int, now: int, scheduled_tick_time: int, interval: int
    ) -> None:
        """Test the SkipMissedAndDrift policy."""
        hypothesis.assume(now >= scheduled_tick_time)

        next_tick_time = SkipMissedAndDrift(
            delay_tolerance=timedelta(microseconds=tolerance)
        ).calculate_next_tick_time(
            now=now, interval=interval, scheduled_tick_time=scheduled_tick_time
        )
        if tolerance < interval:
            assert next_tick_time > now
        drift = now - scheduled_tick_time
        if drift > tolerance:
>           assert next_tick_time == now + interval
E           assert 0 == (171726190479152817 + 1)
E           Falsifying example: test_policy_skip_missed_and_drift(
E               tolerance=171_726_190_479_152_817,
E               now=171_726_190_479_152_817,
E               scheduled_tick_time=-1,
E               interval=1,  # or any other generated value
E           )

tests/test_timer.py:166: AssertionError
```

Using `float` directly ensures we are comparing the same values in the
tests and in the code.

Some explicit examples are now included in the hypothesis tests to
ensure this issue is not reintroduced.

Signed-off-by: Leandro Lucarella <[email protected]>
Tests failed because of the double conversion fixes in the previous
commit, so we can remove this hack now.

This reverts commit 1084381.

Signed-off-by: Leandro Lucarella <[email protected]>
@llucax
Contributor Author

llucax commented Nov 29, 2024

Yeah, I don't know exactly how `@reproduce_failure` is used, but `example()` works pretty well. I'm adding some examples, although I can't test the falsifying example precisely, because we switched from `int` to `float` for the problematic value.

With this in mind, it seems like this fixes the issue. I pushed some updates adding the examples, so they are always tested, just in case.

@llucax llucax added this pull request to the merge queue Nov 29, 2024
Merged via the queue into frequenz-floss:v1.x.x with commit 10a29d7 Nov 29, 2024
14 checks passed
@llucax llucax deleted the fix-tests branch November 29, 2024 13:16