fix: deflake //rs/rosetta-api/icrc1:icrc_multitoken_rosetta_system_tests/multitoken_system_tests #9187
Merged
basvandijk merged 1 commit into master on Mar 4, 2026
…sts/multitoken_system_tests

Increase timeouts to handle resource contention when 4 tests run in parallel:

1. Increase MAX_ATTEMPTS in wait_for_rosetta_block from 20 to 60 (20s -> 60s) to give Rosetta more time to sync blocks under load.
2. Create RosettaClient with an explicit 120s timeout (instead of None, which defaults to 60s) so make_submit_and_wait_for_transaction has sufficient time to find submitted transactions.

Root cause: With RUST_TEST_THREADS=4, multiple Rosetta + PocketIC instances compete for resources, causing block sync and transaction confirmation to take longer than the previous tight timeouts allowed.
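The difference between passing None and passing an explicit timeout (item 2 of the commit message) can be illustrated with a small sketch. Everything here is hypothetical: `SketchClient`, `DEFAULT_TIMEOUT`, and `effective_timeout` are stand-ins invented for illustration and are not the real RosettaClient API. Only the 60s default and the 120s explicit value come from the commit message.

```rust
use std::time::Duration;

/// Assumed fallback, taken from the "defaults to 60s" wording in the commit message.
const DEFAULT_TIMEOUT: Duration = Duration::from_secs(60);

/// Hypothetical stand-in for a client that stores an optional timeout.
/// This is NOT the real RosettaClient; it only mirrors the None-vs-Some pattern.
struct SketchClient {
    url: String,
    timeout: Option<Duration>,
}

impl SketchClient {
    /// Builds a client; `None` means "use the default timeout".
    fn new(url: &str, timeout: Option<Duration>) -> Self {
        Self {
            url: url.to_string(),
            timeout,
        }
    }

    /// The time budget a wait-for-transaction helper would have available.
    fn effective_timeout(&self) -> Duration {
        self.timeout.unwrap_or(DEFAULT_TIMEOUT)
    }
}
```

With this shape, `SketchClient::new(url, None)` silently yields a 60-second budget, while `SketchClient::new(url, Some(Duration::from_secs(120)))` makes the budget visible at the call site, which is the property the fix relies on.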
mbjorkqvist (Contributor) approved these changes on Mar 4, 2026 and left a comment:
Thanks @basvandijk!
Root Cause
The multitoken Rosetta system tests run 21 tests with `RUST_TEST_THREADS=4`, meaning 4 tests execute in parallel. Each test spawns its own Rosetta server and PocketIC instance. Under resource contention, block synchronization and transaction confirmation take longer than the tight timeouts allowed.

Three failure modes were observed in the last week (7 flaky runs):

1. `test_continuous_block_sync` (2/7 runs): `wait_for_rosetta_block` had `MAX_ATTEMPTS=20` with 1-second sleeps, giving only ~20 seconds for Rosetta to sync. Under load, Rosetta couldn't sync in time (e.g., reached block 4 instead of expected block 6).
2. `test_construction_submit` (2/7 runs): `make_submit_and_wait_for_transaction` used the default 60-second timeout (since the `RosettaClient` was created without an explicit timeout). Under load, the transaction search couldn't find the submitted transaction within 60 seconds.
3. Overall test timeouts (4/7 runs): Slow sync cascading across sequential test steps caused the bazel test timeout to be hit.
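The first failure mode comes down to a bounded polling loop whose total budget is roughly the attempt count times the sleep interval. The following is a minimal sketch of that pattern, not the actual `wait_for_rosetta_block` implementation: the function name `wait_for_block`, its signature, and the closure-based block source are assumptions made for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical polling helper (not the real wait_for_rosetta_block).
/// Calls `current_block` up to `max_attempts` times, sleeping `interval`
/// between attempts. Returns Ok(index) once `target` is reached, or
/// Err(last observed index) when the attempts run out.
fn wait_for_block<F>(
    mut current_block: F,
    target: u64,
    max_attempts: u32,
    interval: Duration,
) -> Result<u64, u64>
where
    F: FnMut() -> u64,
{
    let mut last_seen = 0;
    for attempt in 0..max_attempts {
        last_seen = current_block();
        if last_seen >= target {
            return Ok(last_seen);
        }
        // Sleep only between attempts, so the budget is about
        // (max_attempts - 1) * interval.
        if attempt + 1 < max_attempts {
            sleep(interval);
        }
    }
    Err(last_seen)
}
```

If a simulated ledger advances one block every 5 polls, a target of block 6 needs 30 polls: with 20 attempts the loop gives up having seen block 4, and with 60 attempts it succeeds on poll 30. Raising the attempt count widens the budget without slowing down runs that sync quickly, since the loop returns as soon as the target is reached.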
Fix
1. Increased `MAX_ATTEMPTS` in `wait_for_rosetta_block` from 20 to 60 (20s → 60s) to give Rosetta more time to sync blocks.
2. Created the `RosettaClient` with an explicit 120-second timeout (via `from_str_url_and_timeout`) instead of `from_str_url` (which had no timeout, defaulting to 60s internally). This gives `make_submit_and_wait_for_transaction` sufficient time.

Verification
All 3 parallel test runs passed consistently with low variance.
This PR was created following the steps in `.claude/skills/fix-flaky-tests/SKILL.md`.