Add EMTEST_RETRY_COUNT to force retrying failed tests #25565
…y and all failing tests as flaky.
retries_left = test_retry_count

num_fails = len(result.failures)
num_errors = len(result.errors)
Are these part of the non-parallel test runner too?
Yes, these are part of the default unittest.py unittest.TestCase implementation. I had to add them to the parallel test runner to match the shape of the upstream Python implementation. I verified that this works in both the single-threaded and the parallel test runner.
The way this differs from the existing EMTEST_RETRY_FLAKY is that this retries any failed test, whereas EMTEST_RETRY_FLAKY only retries those tests that had explicitly been deemed to be flaky beforehand.
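The difference between the two settings can be sketched as a small policy function. This is an illustration only, not the code in this PR: it assumes both variables hold an integer retry count and that `EMTEST_RETRY_COUNT` takes precedence when both are set, neither of which is confirmed here. The `retries_for` name and the `test_is_flagged_flaky` parameter are hypothetical.

```python
import os


def retries_for(test_is_flagged_flaky, environ=os.environ):
    """Return how many retries a failing test would get.

    Assumptions (not confirmed by the PR): both variables are integer
    counts, and EMTEST_RETRY_COUNT wins when both are set.
    """
    retry_any = int(environ.get("EMTEST_RETRY_COUNT", "0"))
    retry_flaky = int(environ.get("EMTEST_RETRY_FLAKY", "0"))
    if retry_any > 0:
        # Applies to every failing test, flagged or not.
        return retry_any
    if test_is_flagged_flaky:
        # Applies only to tests explicitly marked flaky beforehand.
        return retry_flaky
    return 0
```

With only `EMTEST_RETRY_FLAKY` set, an unflagged test gets no retries under this sketch; with `EMTEST_RETRY_COUNT` set, every failing test does.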
The rationale for adding this feature is twofold:
1. The current test suites are flaky enough that I would be flagging flaky tests in my own CI for many months to come. It is unclear whether some of that flakiness is a harness problem or a systemic problem rather than a problem with individual tests, so I could end up flagging a majority of all tests in the suites as flaky.
2. Whenever a test fails in my CI, the very first thing I need to check is whether the failure was a one-off or deterministic. Running with EMTEST_RETRY_COUNT=5 automates that check and immediately tells me whether a test failure was deterministic or intermittent.