DX-379 Enhance Flakeguard Console Output for Improved Developer Experience #1796

lukaszcl · 2025-04-23T09:58:16Z

Summary

This PR significantly refactors the flakeguard run command to improve output clarity, simplify execution flow, and fix configuration issues related to output directories. The goal is to provide a more user-friendly and robust testing experience.

Changes

Output Overhaul:
- Introduced a new outputManager for structured, step-by-step progress reporting (Prep, Initial Run, Retry, Summary).
- Replaced the spinner with clear textual updates.
- Consolidated all detailed gotestsum logs (initial and retry runs) into a single, final section for easier review.
- Updated command help text (Short, Long) and documentation to reflect the new behavior and exit codes (0: success, 1: fail/flaky, 2: error).
- Adjusted zerolog output to io.Discard to prevent duplicate logging to stderr alongside the structured buffer output.
Logic Simplification & Bug Fixes:
- Introduced a runState struct to manage configuration and results centrally.
- Merged the logic from handleReruns and handleNoReruns directly into the main Run function for a cleaner flow.
- Simplified the final test status determination (PASS, FAIL, FLAKY) and exit code logic.
- Improved handling of the go test <file.go> edge case to skip unreliable reruns and provide clearer warnings.
- Modified checkDependencies (go mod tidy) to log a warning and continue execution upon failure instead of exiting.
- Ensured getJSONOutputPaths returns absolute file paths.

Motivation

The previous output format of flakeguard run was less intuitive, mixing summary information with logs. The separation of run/rerun logic into helper functions also made the flow harder to follow. This refactor aims to:

Provide a clear, step-by-step view of the execution process.
Make it easier to find detailed test logs by placing them all at the end.
Improve the reliability of configuration handling, particularly for output directories.
Enhance the overall user experience of the command.

Rollout plan

After merging this PR we can do staged rollout.

Bump Flakeguard version in Flakeguard Nightly workflow https://github.com/smartcontractkit/chainlink/blob/develop/.github/workflows/flakeguard-nightly.yml
Bump Flakeguard version in Core MQ workflow https://github.com/smartcontractkit/chainlink/blob/develop/.github/workflows/ci-core.yml (https://github.com/smartcontractkit/chainlink/blob/develop/tools/bin/go_core_tests#L12 and https://github.com/smartcontractkit/chainlink/blob/develop/tools/bin/go_core_ccip_deployment_tests#L11)

Example logs


[INFO] (1/4) Preparing environment...
-------------------------------------
  Dependency check ('go mod tidy'): OK
  Preparation complete.

[INFO] (2/4) Running initial tests...
-------------------------------------
  Running test packages: github.com/smartcontractkit/chainlink/v2/core/utils/big_math, github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest
  Initial results saved to: /Users/lukasz/Downloads/flakeguard/main/test-results.json
  Initial run completed:
    - 5 total tests run (excluding skipped)
    - 3 passed
    - 2 failed

[INFO] (3/4) Retrying failed tests...
-------------------------------------
  Rerun results saved to: /Users/lukasz/Downloads/flakeguard/rerun/test-results.json
  Retry results:
    - TestFail: still FAIL
    - TestFailBigMath: now PASS (flaky)

[INFO] (4/4) Final summary
--------------------------
  Total tests run: 5
    - Final PASS: 3
    - Final FAIL: 1
    - FLAKY:      1

[ERROR] 1 stable failing test(s) found
[WARN] 1 flaky test(s) found
[ERROR] Exit code = 1 (failures or flaky tests detected)

============================================================
=== DETAILED LOGS FOR INITIAL RUN (Initial run count: 5) ===
============================================================

PASS core/utils/big_math.TestMax (0.00s)
=== RUN   TestFailBigMath
--- FAIL: TestFailBigMath (0.00s)
FAIL core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math.TestMin (0.00s)
PASS core/utils/big_math.TestAccumulate (0.00s)
FAIL core/utils/big_math

=== Failed
=== FAIL: core/utils/big_math TestFailBigMath (0.00s)

DONE 4 tests, 1 failure in 0.002s

------------------------------------------

PASS core/utils/big_math.TestMax (0.00s)
PASS core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math.TestMin (0.00s)
PASS core/utils/big_math.TestAccumulate (0.00s)
PASS core/utils/big_math

DONE 4 tests in 0.002s

------------------------------------------

PASS core/utils/big_math.TestMax (0.00s)
PASS core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math.TestMin (0.00s)
PASS core/utils/big_math.TestAccumulate (0.00s)
PASS core/utils/big_math

DONE 4 tests in 0.001s

------------------------------------------

PASS core/utils/big_math.TestMax (0.00s)
PASS core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math.TestMin (0.00s)
PASS core/utils/big_math.TestAccumulate (0.00s)
PASS core/utils/big_math

DONE 4 tests in 0.001s

------------------------------------------

PASS core/utils/big_math.TestMax (0.00s)
=== RUN   TestFailBigMath
--- FAIL: TestFailBigMath (0.00s)
FAIL core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math.TestMin (0.00s)
PASS core/utils/big_math.TestAccumulate (0.00s)
FAIL core/utils/big_math

=== Failed
=== FAIL: core/utils/big_math TestFailBigMath (0.00s)

DONE 4 tests, 1 failure in 0.002s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.002s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.002s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.001s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.001s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.001s

------------------------------------------


=============================================================
=== DETAILED LOGS FOR RETRY ATTEMPTS (3 retries per test) ===
=============================================================

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.002s

------------------------------------------

=== RUN   TestFailBigMath
--- FAIL: TestFailBigMath (0.00s)
FAIL core/utils/big_math.TestFailBigMath (0.00s)
FAIL core/utils/big_math

=== Failed
=== FAIL: core/utils/big_math TestFailBigMath (0.00s)

DONE 1 tests, 1 failure in 0.001s

------------------------------------------

=== RUN   TestFailBigMath
--- FAIL: TestFailBigMath (0.00s)
FAIL core/utils/big_math.TestFailBigMath (0.00s)
FAIL core/utils/big_math

=== Failed
=== FAIL: core/utils/big_math TestFailBigMath (0.00s)

DONE 1 tests, 1 failure in 0.002s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.002s

------------------------------------------

PASS core/utils/big_math.TestFailBigMath (0.00s)
PASS core/utils/big_math

DONE 1 tests in 0.002s

------------------------------------------

=== RUN   TestFail
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail
--- FAIL: TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest.TestFail (0.00s)
FAIL github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest

=== Failed
=== FAIL: github.com/smartcontractkit/chainlink/deployment/ccip/changeset/testhelpers/feestest TestFail (0.00s)
    a_test.go:10: 
                Error Trace:    /Users/lukasz/Documents/smartcontractkit/chainlink-core/deployment/ccip/changeset/testhelpers/feestest/a_test.go:10
                Error:          Not equal: 
                                expected: 0
                                actual  : 1
                Test:           TestFail

DONE 1 tests, 1 failure in 0.001s

------------------------------------------

Below is a summarization created by an LLM (gpt-4-0125-preview). Be mindful of hallucinations and verify accuracy.

Why

The changes improve the FlakeGuard testing tool by enhancing its reporting capabilities, error handling, and overall usability. These modifications aim to provide clearer insights into test results, streamline the rerun process for failed tests, and offer a more structured output for easier analysis.

What

cmd/run.go:
- Added runState struct to hold run configuration and results.
- Removed dependency on github.com/briandowns/spinner.
- Added outputManager struct for managing output buffer and exit codes.
- Enhanced output format to include detailed logs for initial run and retry attempts.
- Improved error handling and reporting mechanism.
- Implemented more descriptive and structured command output including preparation, initial test run, retry of failed tests, and final summary.
- Added handling for command-line-arguments test packages to bypass rerun due to context loss.
- Introduced detailed reporting for each step of the test process and final test outcomes.
- Structured the test execution flow into preparation, initial run, retries, and summary.
- Included detailed logs for both initial and retry test runs.
main.go:
- Changed log output from os.Stderr to io.Discard to clean up console output.
runner/runner.go:
- Added RawOutputDir to Runner struct to specify directory for raw test output.
- Updated NewRunner function to accept rawOutputDir parameter.
- Adjusted runner configuration to properly handle raw output directory.
runner/runner_integration_test.go:
- Modified integration tests to support changes in the runner, particularly the addition of the raw output directory.
runner/runner_test.go:
- Adapted unit tests to account for the new runner configuration and output directory handling.

…, improve rerun logic, and update logging for clarity. Adjusted command output messages for better user guidance and added checks for command-line argument usage to prevent unreliable test reruns.

jmank88 · 2025-04-23T13:54:55Z

Do the test-results.json files contain all lines? As humans, we almost never want that. If we are going to need to look at files, we'd prefer non-json, or at least only the failure lines. Could we output those as well?

lukaszcl · 2025-04-25T09:01:35Z

@jmank88 test-results.json contain only metadata for Flakeguard - Splunk integration. We have a separate JSON file with all logs for failed tests. But, this PR focuses on console experience (e.g. when you click a link to a build on Github CI). Here I'm planning to show logs like you see in the PR description. @jmank88 let me know what you think about it!

Copilot

Pull Request Overview

This PR refactors the flakeguard run command to improve output clarity and execution flow while addressing configuration issues related to output directories.

Introduces a centralized runState and outputManager for structured logging and improved run/rerun handling.
Updates NewRunner calls in tests to include the new rawOutputDir parameter and refactors logging in main.go to avoid duplicate stderr output.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

File	Description
tools/flakeguard/runner/runner_test.go	Updated tests to pass the new rawOutputDir parameter as empty string.
tools/flakeguard/runner/runner_integration_test.go	Added temporary directory usage for raw output testing.
tools/flakeguard/runner/runner.go	Removed constant RawOutputDir and replaced it with instance-based configuration.
tools/flakeguard/main.go	Updated logger output to io.Discard to prevent duplicate logging.

Comments suppressed due to low confidence (1)

tools/flakeguard/runner/runner_test.go:189

[nitpick] Consider adding test cases that supply a non-empty rawOutputDir to verify that file path handling works correctly when an explicit directory is provided.

+                "",

erikburt · 2025-05-05T18:13:02Z

I've taken a look, I don't see anything blatantly wrong which is all I'm good for when it comes to golang.

Marking as ready for review.

erikburt · 2025-05-05T18:13:48Z

@jmank88 - does the above example output look good to you?

erikburt · 2025-05-05T19:52:04Z

@skudasov or @Tofel, can one of you review this PR on behalf of devex-tooling.

lukaszcl added 3 commits April 22, 2025 14:45

Print flay test logs in the console when no reruns

61bfc9d

Refactor Flakeguard test execution flow: enhance initial run handling…

782be4b

…, improve rerun logic, and update logging for clarity. Adjusted command output messages for better user guidance and added checks for command-line argument usage to prevent unreliable test reruns.

improve console output

ef4d6ed

lukaszcl changed the title ~~Dx 379~~ DX-379 Enhance Flakeguard Console Output for Improved Developer Experience Apr 23, 2025

Fix tests

7d25782

sebawo requested a review from Copilot April 25, 2025 10:48

Copilot AI reviewed Apr 25, 2025

View reviewed changes

sebawo requested a review from erikburt April 25, 2025 10:53

erikburt marked this pull request as ready for review May 5, 2025 18:13

erikburt requested a review from a team as a code owner May 5, 2025 18:13

erikburt approved these changes May 5, 2025

View reviewed changes

erikburt requested review from Tofel and skudasov May 5, 2025 19:51

skudasov approved these changes May 5, 2025

View reviewed changes

Merge branch 'main' into DX-379

21bbdce

erikburt merged commit bc66d27 into main May 6, 2025
58 checks passed

erikburt deleted the DX-379 branch May 6, 2025 20:12

erikburt mentioned this pull request Jun 6, 2025

chore: disable flakeguard for PRs smartcontractkit/chainlink#18051

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

DX-379 Enhance Flakeguard Console Output for Improved Developer Experience #1796

DX-379 Enhance Flakeguard Console Output for Improved Developer Experience #1796

Uh oh!

lukaszcl commented Apr 23, 2025 •

edited by github-actions bot

Loading

Uh oh!

jmank88 commented Apr 23, 2025

Uh oh!

lukaszcl commented Apr 25, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

erikburt commented May 5, 2025

Uh oh!

erikburt commented May 5, 2025

Uh oh!

erikburt commented May 5, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

DX-379 Enhance Flakeguard Console Output for Improved Developer Experience #1796

DX-379 Enhance Flakeguard Console Output for Improved Developer Experience #1796

Uh oh!

Conversation

lukaszcl commented Apr 23, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Motivation

Rollout plan

Example logs

Why

What

Uh oh!

jmank88 commented Apr 23, 2025

Uh oh!

lukaszcl commented Apr 25, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

erikburt commented May 5, 2025

Uh oh!

erikburt commented May 5, 2025

Uh oh!

erikburt commented May 5, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

lukaszcl commented Apr 23, 2025 •

edited by github-actions bot

Loading