Optimize build_test.sh to not rerun tests on flake failure #1065

JonathanOppenheimer · 2025-07-23T19:22:32Z

Why this should be merged

Currently, if a single test fails, the entire test suite is rerun (up to 4 times). This is extremely inefficient: if a flaky test fails, the test suite reruns, but then a different flaky test may fail, so the whole test suite is rerun again, and so on. This leads to long waits for CI jobs to pass in GitHub, which is a large waste of time.

The prior claim that re-running all prior tests was necessary is copied below:

Note the absence of unexpected failures cannot be indicative that we only need to run the tests that failed, for example a test may panic and cause subsequent tests in that package to not run.

However, there may not be an unexpected failure that causes subsequent tests not to run -- if a single flaky test fails there is no need to rerun the whole test-suite. Additionally, even if there is a panic that causes subsequent tests not to run, there's still no reason to re-run the whole test-suite -- just the tests that failed, and the tests that never ran.

Scenario	Before Behavior	After Behavior
All tests pass	Exit immediately	Exit immediately
Unexpected failures	Exit immediately	Exit immediately
All tests run, only known flakes fail	Rerun ALL tests	Retry only failed tests
Some tests don't run due to panics	Rerun ALL tests	Rerun all tests in packages in which tests panicked + failed flaky tests
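
As a rough illustration of the "Retry only failed tests" row, the names of the failed tests could be joined into one anchored pattern for go test -run. This is a minimal sketch and not the code in this PR: run_pattern is a hypothetical helper, ./some/package is a placeholder, and the sketch assumes top-level test names that contain no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): read test names on stdin, one per
# line, and print an anchored regex suitable for `go test -run`.
run_pattern() {
  local joined
  # Join all input lines with '|' so they form one alternation.
  joined=$(paste -sd'|' -)
  # Print nothing when there are no names, so callers can skip the retry.
  if [ -n "$joined" ]; then
    printf '^(%s)$\n' "$joined"
  fi
}

# Example use (commented out so the sketch has no side effects):
#   pattern=$(printf 'TestA\nTestB\n' | run_pattern)
#   go test -run "$pattern" ./some/package
```

Anchoring with ^ and $ keeps a name such as TestA from also selecting TestAB.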

How this works

Get the list of all expected tests for the package and store it for comparison
Run tests with go test -json -shuffle=on for structured output
Parse test output to extract failed tests using regex patterns for both failures and panics
Compare expected vs ran tests to find tests that didn't run due to panics
Categorize failures into known flakes vs unexpected failures
Depending on whether tests failed or were never run, retry only the relevant tests.
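
The parse and compare steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script in this PR: tests_with_action and only_in_first are hypothetical names, and the sed pattern relies on the "Action" and "Test" fields that go test -json emits for each event.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the PR).

# Read `go test -json` event lines on stdin and print the unique names of
# tests whose event has the given action (for example "run" or "fail").
# Package-level events carry no "Test" field and are therefore skipped.
tests_with_action() {
  sed -n '/"Action":"'"$1"'"/s/.*"Test":"\([^"]*\)".*/\1/p' | LC_ALL=C sort -u
}

# Print the lines of file $1 that do not appear in file $2.
# With $1 = expected tests and $2 = tests that ran, this yields the tests
# that never ran (for example because an earlier test panicked).
only_in_first() {
  LC_ALL=C comm -23 <(LC_ALL=C sort -u "$1") <(LC_ALL=C sort -u "$2")
}

# Example use (commented out so the sketch has no side effects):
#   tests_with_action fail < events.json > failed.txt
#   tests_with_action run  < events.json > ran.txt
#   only_in_first expected.txt ran.txt > not_run.txt
```

The same only_in_first helper could separate unexpected failures from known flakes, assuming the known-flakes file lists one test name per line.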

How this was tested

This modifies the test framework directly.

Need to be documented?

No

Need to update RELEASES.md?

No

JonathanOppenheimer · 2025-07-24T20:45:41Z

scripts/build_test.sh

+# shellcheck disable=SC2046
+go test -shuffle=on ${race:-} -timeout="${TIMEOUT:-600s}" -coverprofile=coverage.out -covermode=atomic "$@" $(go list ./... | grep -v github.com/ava-labs/coreth/tests) | tee test.out || command_status=$?


Not sure if it's possible to write this line so it both
A. works
B. passes lint

How about separating creation of the array from its use?

mapfile -t pkgs < <(go list ./... | grep -v github.com/ava-labs/coreth/tests)
go test -shuffle=on "${race:-}" -timeout="${TIMEOUT:-600s}" -coverprofile=coverage.out -covermode=atomic "$@" "${pkgs[@]}" | tee test.out || command_status=$?

If only! Writing this script was actually quite difficult: while bash is a decent programming language (imo), mapfile was added in Bash 4.0, and all supported macOS releases (including the ones behind GitHub Actions' macos-latest runner) still ship with Bash 3.2.x (currently 3.2.57). Apple never adopted the GPLv3-licensed Bash 4+ and instead moved its interactive shell to zsh while leaving /bin/bash frozen at 3.2.57 (see https://jmmv.dev/2019/11/macos-bash-baggage.html).

Thus all of our testing scripts need to be compatible with bash circa 2014, and I was hindered from using a lot of newer bash features that would have made this easier to write.
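
For reference, one way to get the effect of mapfile -t on Bash 3.2 is a plain read loop. This is a sketch only: read_pkgs is a hypothetical helper and is not necessarily what the script in this PR does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): fill the global array `pkgs` with
# the lines read from stdin, without relying on mapfile (Bash 4.0+).
read_pkgs() {
  pkgs=()
  local line
  # IFS= and -r keep leading whitespace and backslashes intact.
  while IFS= read -r line; do
    pkgs+=("$line")
  done
}

# Example use (commented out so the sketch has no side effects):
#   read_pkgs < <(go list ./... | grep -v github.com/ava-labs/coreth/tests)
#   go test -shuffle=on "${pkgs[@]}"
```

Feeding the loop through process substitution rather than a pipe keeps the assignment in the current shell, so pkgs is still set after the loop finishes.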

If you endorse it, I can refactor the whole testing script to use Python (or GoLang) instead of bash but it would be a much more significant change.

There is no obligation to be compatible with macos's ancient bash version. In avalanchego, we explicitly require a modern bash. One way to ensure the use of modern bash could be to run the command with scripts/dev_shell.sh, since the nix shell guarantees a compatible bash version.
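
If a script wanted to fail fast on an old bash instead of working around it, a guard along these lines is one option. This is only a sketch: require_bash_major is a hypothetical helper, and it is not claimed to be how avalanchego enforces its requirement.

```shell
#!/usr/bin/env bash
# Hypothetical guard (not from avalanchego or this PR): succeed only when
# the bash major version is at least $1. The optional $2 overrides the
# detected version, which makes the function easy to exercise.
require_bash_major() {
  local want="$1"
  local have="${2:-${BASH_VERSINFO[0]}}"
  if [ "$have" -lt "$want" ]; then
    echo "bash >= ${want} is required, found major version ${have}" >&2
    return 1
  fi
}

# Example use at the top of a script (commented out here):
#   require_bash_major 4 || exit 1
```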

Interesting! That's definitely something that is worth bringing over to the evm repositories in my opinion.

scripts/known_flakes.txt

maru-ava · 2025-08-06T00:17:49Z

Why did you decide to write this in bash? The complexity involved would really benefit from a real programming language (i.e. golang).

JonathanOppenheimer · 2025-08-06T14:52:11Z

Why did you decide to write this in bash? The complexity involved would really benefit from a real programming language (i.e. golang).

Not really any particular reason besides sticking with the current structure (order remains the same, run_task.sh and so on). If you find this overly complex, I can rewrite it in Python (or golang if not Python).

maru-ava · 2025-08-06T20:31:06Z

Not really any particular reason besides sticking with the current structure (order remains the same, run_task.sh and so on). If you find this overly complex, I can rewrite it in Python (or golang if not Python).

I'm not going to block merge of this - that's up to the coreth maintainers - so these comments are more food for thought when considering future changes.

Python would be a fine choice...if only our pool of reviewers was well-stocked with python experts. Safer to choose a language for which both expertise and the required toolchain are readily available.
Non-trivial bash scripting is tech debt as soon as it is written. It's at best challenging to both test and review for anyone but a bash expert (and we have few of those). Future maintenance is thus guaranteed more costly than the same functionality implemented in a language that maintainers have expertise in (i.e. golang).
There is nothing preventing a non-bash implementation in the current structure. The indirection of the taskfile means task build-test could be invoking golang as easily as bash without having to change CI or user expectation.

JonathanOppenheimer added 2 commits July 23, 2025 15:06

rewrite test retry logic

7697a6c

only rerun failed + non-run tests

0f1e686

JonathanOppenheimer requested a review from a team as a code owner July 23, 2025 19:22

JonathanOppenheimer added the testing Anything testing-related label Jul 23, 2025

JonathanOppenheimer marked this pull request as draft July 23, 2025 19:23

JonathanOppenheimer added 25 commits July 23, 2025 15:28

rewrite get all tests for package level

d5f0d87

try retry at individual package level

43f77e0

add debug

4ec0caf

rewrite get_all_tests

eeb1a25

enhancements

df4d5f2

fix RAN TESTS

ffbe3e4

hmmm

5db5447

try this

93d7334

fix env vars

b7f68e6

change coreth path

b0fb02f

more efficient get test list

aac7eaf

add echo statements

6b3e797

ADSFADSF

998bbdf

typo

388c630

Merge branch 'master' into rewrite-test-retry-logic

024ab2a

restore test coverage

5d18f5f

simplify

3cd402a

got

3f28633

no eit

5226e5a

add -v

3496271

use json

e476499

gotestsum

7d38556

RESET

528c57c

try rerun at the package level

fa233b2

lint errors

9a6f4fa

JonathanOppenheimer added 2 commits July 24, 2025 16:27

seperate paths

a6893d0

Merge branch 'master' into rewrite-test-retry-logic

50d090a

JonathanOppenheimer marked this pull request as ready for review July 24, 2025 20:39

JonathanOppenheimer commented Jul 24, 2025

save package for test through mapping:

3c46681

alarso16 reviewed Jul 25, 2025

scripts/known_flakes.txt (outdated, resolved)

JonathanOppenheimer added 9 commits July 25, 2025 12:22

remove non-flaky test

70a34c8

Merge branch 'master' into rewrite-test-retry-logic

58ca623

Merge branch 'master' into rewrite-test-retry-logic

48d8f60

See bash version

54ab582

try other bash

d78e10b

comply with old bash version

ccd8f70

Merge branch 'master' into rewrite-test-retry-logic

1dc934c

Merge branch 'master' into rewrite-test-retry-logic

dd94616

Merge branch 'master' into rewrite-test-retry-logic

5c8fdca

JonathanOppenheimer requested a review from a team as a code owner August 5, 2025 18:15

JonathanOppenheimer requested a review from maru-ava August 5, 2025 20:06

Merge branch 'master' into rewrite-test-retry-logic

b68485a

Merge branch 'master' into rewrite-test-retry-logic

2a6fdb9

JonathanOppenheimer added 3 commits August 11, 2025 11:02

Merge branch 'master' into rewrite-test-retry-logic

db3a694

Merge branch 'master' into rewrite-test-retry-logic

70512e3

Merge branch 'master' into rewrite-test-retry-logic

785468b

JonathanOppenheimer self-assigned this Aug 27, 2025

JonathanOppenheimer mentioned this pull request Oct 3, 2025

Determine strategy and resolve flaky tests. ava-labs/subnet-evm#1774

Open
