Description
Hi! We're using pytest-subtests in our integration test suite along with bktec for automatic retries. We noticed that bktec is retrying significantly more tests than expected after a run with subtest failures.
This is likely a downstream effect of buildkite/test-collector-python#93, but I wanted to flag it here as well since bktec could potentially be more defensive about this case.
Worth noting: as of pytest 9.0, subtests have been integrated directly into pytest core (previously they were only available via the separate pytest-subtests plugin). The feature is marked as experimental but the core functionality and usage are considered stable. Given that subtests are now a first-class pytest feature, it's likely that more projects will start using them, and both the collector and bktec may want to handle them correctly.
What we observed
In a recent build (#16945):
- Attempt 1 (initial run, 417 tests): 2 tests had FAILED parent status. However, bktec retried 13 tests -- the 2 that actually failed plus 11 that had all subtests passing (SUBPASSED) and a PASSED parent status.
- Attempt 2 (retry 1 of 2): The 11 incorrectly-retried tests all passed (as expected, since they were never broken). The 2 actually-failed tests failed again.
- Attempt 3 (retry 2 of 2): 1 of the 2 recovered, 1 stayed failed. Final report correctly showed 1 failed test.
Root cause
bktec determines which tests to retry by reading the JSON file produced by buildkite-test-collector via the --json flag. The collector appears to be writing incorrect results for tests that use subtests (filed as buildkite/test-collector-python#93). So the primary fix likely belongs in the collector.
That said, bktec currently has no awareness of subtests at all (no handling of SUBPASSED/SUBFAILED statuses, no concept of subtest vs parent test). If the collector were to start reporting subtests as separate entries in the JSON in the future, bktec might try to retry individual subtests by nodeid, which wouldn't work since subtests aren't independently addressable in pytest.
Important: subtests cannot be retried individually
One key thing to call out -- subtests are often (and in our case always) sequential, dependent steps in an e2e workflow. Each subtest depends on state set up by the previous one, so retrying a single failed subtest in isolation is not possible. When any subtest fails, the correct behavior is to retry the entire parent test so all subtests run again from the beginning.
Even in cases where subtests are theoretically independent, pytest does not support running individual subtests by nodeid -- they are not separately addressable test items. So regardless of the use case, the retry unit must always be the parent test.
Suggestion
This might not need any changes on the bktec side if the collector fix fully resolves it. But if there's interest in making bktec more resilient here, one idea might be to validate that retry targets correspond to real pytest nodeids, or to recognize and filter out subtest-level entries in the JSON so that only parent test results drive retry decisions.
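To make the second idea concrete, here is a rough sketch of the filtering logic. It is written in Python only for brevity and is not based on bktec's actual code. The entry shape (a list of dicts with "nodeid" and "status" keys) is an assumption made for illustration, not the collector's real JSON schema, and retry_targets is a hypothetical helper name.

```python
def retry_targets(entries):
    """Return the parent nodeids to retry, ignoring subtest-level entries.

    Assumes (hypothetically) that each entry is a dict with "nodeid" and
    "status" keys, and that subtest-level entries use a status starting
    with "SUB" (for example SUBPASSED or SUBFAILED).
    """
    targets = []
    for entry in entries:
        status = entry["status"].upper()
        if status.startswith("SUB"):
            # Subtest-level result: never a retry target on its own.
            continue
        if status == "FAILED" and entry["nodeid"] not in targets:
            # Only the parent test's own status drives the retry decision.
            targets.append(entry["nodeid"])
    return targets


# Hypothetical data: test_a passed with passing subtests, test_b failed.
example = [
    {"nodeid": "tests/test_flow.py::test_a", "status": "SUBPASSED"},
    {"nodeid": "tests/test_flow.py::test_a", "status": "PASSED"},
    {"nodeid": "tests/test_flow.py::test_b", "status": "SUBFAILED"},
    {"nodeid": "tests/test_flow.py::test_b", "status": "FAILED"},
]
print(retry_targets(example))
```

With logic along these lines, a parent test whose subtests all passed would never be selected for retry, and a failed subtest would only ever cause its parent test to be re-run.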
Happy to provide more details. Thanks for the great tool!