Commit d4fcd45

poulok and rbarker-dev authored
docs: Add flaky-test plugin document (#24581)
Signed-off-by: Kelly Greco <kelly@swirldslabs.com>
Signed-off-by: Kelly Greco <82919061+poulok@users.noreply.github.com>
Co-authored-by: Roger Barker <roger.barker@swirldslabs.com>
1 parent 3d9574f commit d4fcd45

4 files changed: +105 -7 lines changed

.github/workflows/docs/README.md

Lines changed: 8 additions & 7 deletions
@@ -1,9 +1,10 @@
11
# README Workflows (Docs)
22

3-
| Document | Description |
4-
|--------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
5-
| [CITR Test Configuration](citr-test-config.md) | This document outlines the configuration settings for the CITR (Continuous Integration Test & Release) environment used in our workflows. The CITR environment is designed to facilitate automated testing and deployment processes. |
6-
| [Commit Contribution Flow](contribution_flow.md) | This document outlines the top-level flow of a commit from the point of PR creation to the point it is included in a release candidate tag (`build-XXXXX`) and the subsequent workflows that are triggered by the release candidate tag. |
7-
| [Forked PRs](forked_prs.md) | This document outlines the rules and limitations for running workflow jobs on pull requests that are opened from forked repositories. |
8-
| [Required Checks](required_checks.md) | This document outlines which required workflow jobs are able to run on PRs open from forked repositories and which are not. |
9-
| [Workflow Manifest](workflow-manifest.md) | This document outlines the manifest of all workflows in the `hiero-consensus-node` repository. |
3+
| Document | Description |
4+
|-------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
5+
| [CITR Test Configuration](citr-test-config.md) | This document outlines the configuration settings for the CITR (Continuous Integration Test & Release) environment used in our workflows. The CITR environment is designed to facilitate automated testing and deployment processes. |
6+
| [Commit Contribution Flow](contribution_flow.md) | This document outlines the top-level flow of a commit from the point of PR creation to the point it is included in a release candidate tag (`build-XXXXX`) and the subsequent workflows that are triggered by the release candidate tag. |
7+
| [Forked PRs](forked_prs.md) | This document outlines the rules and limitations for running workflow jobs on pull requests that are opened from forked repositories. |
8+
| [Required Checks](required_checks.md) | This document outlines which required workflow jobs are able to run on PRs open from forked repositories and which are not. |
9+
| [Workflow Manifest](workflow-manifest.md) | This document outlines the manifest of all workflows in the `hiero-consensus-node` repository. |
10+
| [Flaky Test Plugin](flaky-test-plugin/flaky-test-plugin-guide.md) | This document describes the behavior of the flaky test plugin and developer responsibilities. |
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
# Flaky Test Detection Plugin — Developer Guide

## Overview

A new flaky test detection plugin is being enabled across CI workflows. This document explains how the plugin works, what changes you can expect in your CI experience, and what is expected of you when a flaky test is detected.

The plugin is provided by **Develocity** and operates as a Gradle plugin within the test execution step of CI workflows. When a test fails during the `gradlew test` process, the plugin automatically retries it up to a configured number of times. If the test succeeds on any retry, it is classified as **flaky**: the workflow run still reports success, and the flaky test is tracked separately.

## Where the Plugin Is Enabled

The plugin is active in the following CI workflows:

- **Pull Request (PR) CI**
- **Minimal Acceptance Test Suite (MATS) on main**
- **eXtended Test Suite (XTS) executions**

> **Important:** The plugin operates exclusively within the test execution step, specifically the step that runs the `gradlew test` command. It does **not** apply to any other workflow steps. Failures that occur outside the test step (e.g., compilation, static analysis, or other non-test stages) are unaffected by this plugin and continue to fail the workflow as they do today.

## How the Plugin Works

1. A test fails during the `gradlew test` step.
2. The plugin automatically retries the failed test up to a configured number of times.
3. **If the test passes on any retry**, it is marked as flaky. The overall workflow run is **not** failed.
4. **If the test fails on all retries**, it is treated as a genuine failure. The workflow run fails as it normally would.
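
The retry steps above can be sketched as a small classification routine. This is an illustrative model only, not the Develocity plugin's code: `Outcome`, `classify`, and `max_retries` are hypothetical names, and the real retry count is whatever the Gradle build configures.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASSED = "passed"  # passed on the first attempt
    FLAKY = "flaky"    # failed first, then passed on a retry
    FAILED = "failed"  # failed on the first attempt and on every retry


def classify(run_test: Callable[[], bool], max_retries: int) -> Outcome:
    """Run a test once; if it fails, retry up to max_retries times."""
    if run_test():
        return Outcome.PASSED
    for _ in range(max_retries):
        if run_test():
            return Outcome.FLAKY
    return Outcome.FAILED


# Usage: a hypothetical test that fails once and then passes.
attempts = iter([False, True])
result = classify(lambda: next(attempts), max_retries=2)
```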

With retries in place, flaky tests no longer block PR merges or cause MATS/XTS runs to report failure, as long as the test succeeds within the allowed retries. Passing MATS/XTS runs on the main branch still send a Slack message confirming that the run passed and noting that flaky tests were detected.

> **Fail-safe:** If the number of flaky tests detected in a single run exceeds a configured threshold, the overall run will fail regardless of retry outcomes. This prevents situations where a large number of flaky tests mask a systemic issue.
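
A minimal sketch of the fail-safe rule, assuming each test result has already been reduced to one of the strings `"passed"`, `"flaky"`, or `"failed"`. The function name and the `max_flaky` parameter are hypothetical; the guide does not state the actual threshold value or the name of the setting that controls it.

```python
from typing import List


def run_should_fail(outcomes: List[str], max_flaky: int) -> bool:
    """Decide whether the overall run fails.

    The run fails if any test failed on all retries, or if the number
    of flaky tests exceeds the (hypothetical) max_flaky threshold.
    """
    if "failed" in outcomes:
        return True
    return outcomes.count("flaky") > max_flaky


# Usage: two flaky tests are tolerated when the threshold is 2, three are not.
tolerated = run_should_fail(["passed", "flaky", "flaky"], max_flaky=2)
rejected = run_should_fail(["flaky", "flaky", "flaky"], max_flaky=2)
```

Here the threshold is treated as inclusive (a count equal to `max_flaky` still passes), matching the guide's wording that the run fails only when the count "exceeds" the threshold.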

## What Happens When a Flaky Test Is Detected

The behavior depends on where the flaky test was detected.

### On a Pull Request

A comment is left on the PR by the `github-actions` bot. The content of the comment depends on whether the flaky test is already known:

- **Known flaky test (ticket already exists):** The comment includes a link to the existing ticket for informational purposes. No new ticket is created. The CI run is linked in the existing ticket for tracking purposes. No action is required from the PR author unless they believe their changes may have worsened the flake.

- **New flaky test (no existing ticket):** A new ticket is automatically created on the flaky test project board and linked in the PR comment. The CI run is also linked in the new ticket. The PR author is expected to **investigate whether the flakiness was introduced by their changes**. If the flake is unrelated to the PR, no further action is required from the author, and the ticket will be triaged by the assigned team.
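
The two PR cases can be modeled as a lookup against the set of already-known flaky tests. This is a sketch under stated assumptions: `PrAction`, `pr_action`, the `existing_tickets` mapping, the placeholder URL, and the comment wording are all hypothetical and do not reproduce the bot's actual implementation or text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PrAction:
    create_ticket: bool            # is a new ticket opened on the board?
    author_must_investigate: bool  # is the PR author expected to act?
    comment: str                   # illustrative wording, not the bot's real text


def pr_action(test_id: str, existing_tickets: Dict[str, str]) -> PrAction:
    """Choose the PR response for a flaky test.

    existing_tickets maps a test identifier to the URL of its open ticket.
    """
    ticket_url = existing_tickets.get(test_id)
    if ticket_url is not None:
        # Known flake: informational comment only, no new ticket.
        return PrAction(False, False, f"Known flaky test {test_id}: {ticket_url}")
    # New flake: a ticket is created and the author checks their changes.
    return PrAction(True, True, f"New flaky test {test_id}: ticket created")


# Usage with a placeholder ticket URL.
known = {"com.example.FooTest#bar": "https://example.invalid/tickets/1"}
known_case = pr_action("com.example.FooTest#bar", known)
new_case = pr_action("com.example.FooTest#baz", known)
```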

### On MATS / XTS

When a new flaky test is detected during a MATS or XTS run:

- A ticket is **automatically created** on the flaky test project board.
- The ticket is **assigned to the manager** of the team responsible for that test category (e.g., HAPI tests are assigned to the HAPI team manager, Otter tests to the Otter team manager). Unit tests, which span multiple areas, are assigned to the Consensus and Foundation team manager.
- A warning message is posted to **#continuous-integration-test-operations** on Slack.

If the flaky test already has an existing ticket, no new ticket is created. Instead, the MATS/XTS run is linked in the existing ticket for tracking purposes.
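
The assignment rule for new MATS/XTS tickets amounts to a mapping from test category to an assignee. The sketch below covers only the three categories the guide names; the category keys (`"hapi"`, `"otter"`, `"unit"`) and the function name are hypothetical, and the real automation may recognize more categories or identify them differently.

```python
# Hypothetical category keys; the assignee roles come from the guide.
ASSIGNEE_BY_CATEGORY = {
    "hapi": "HAPI team manager",
    "otter": "Otter team manager",
    # Unit tests span multiple areas, so they go to one designated team.
    "unit": "Consensus and Foundation team manager",
}


def assignee_for(category: str) -> str:
    """Return the role that receives a new flaky test ticket."""
    try:
        return ASSIGNEE_BY_CATEGORY[category.lower()]
    except KeyError:
        raise ValueError(f"no assignee configured for category {category!r}")


# Usage: look up the assignee for a HAPI test.
hapi_assignee = assignee_for("HAPI")
```

Raising an error for an unmapped category, rather than falling back to a default assignee, keeps a missing mapping visible instead of silently misrouting tickets.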

### On Dry Runs

The [MATS Dry Run workflow](https://github.com/hiero-ledger/hiero-consensus-node/blob/main/.github/workflows/docs/citr-test-config.md#mats) runs the same checks as the PR CI workflow but does **not** automatically create tickets for newly detected flaky tests.

The [XTS Dry Run workflow](https://github.com/hiero-ledger/hiero-consensus-node/blob/main/.github/workflows/docs/citr-test-config.md#xts) runs the same checks as the standard XTS workflow. It also does **not** automatically create tickets for newly detected flaky tests.

If you kick off a dry run, it is **your responsibility** to:

1. Check the workflow results for any detected flaky tests.
2. Investigate whether the flakiness is related to changes in the branch you ran the workflow against.
3. Create a ticket manually if the flake is new and not caused by your changes.

## Ticket Details

All flaky test tickets, whether created automatically or manually, follow this convention:

- **Title format:** `[Flaky Test] {class}#{method}`
  - Example: `[Flaky Test] org.hiero.otter.test.HappyPathTest#flakyTestA`
- **Project board:** All flaky test tickets are tracked on a dedicated [project board](https://github.com/orgs/hiero-ledger/projects/50/views/1).
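
When creating a ticket manually, for example after a dry run, the title convention can be generated and checked with a few lines of code. This is a minimal sketch; the helper names are hypothetical, and the parsing pattern assumes class and method names contain no whitespace.

```python
import re
from typing import Optional, Tuple

# Matches "[Flaky Test] {class}#{method}" as described in the guide.
_TITLE_PATTERN = re.compile(r"^\[Flaky Test\] (?P<cls>[^#\s]+)#(?P<method>\S+)$")


def flaky_ticket_title(class_name: str, method_name: str) -> str:
    """Build a ticket title in the documented format."""
    return f"[Flaky Test] {class_name}#{method_name}"


def parse_flaky_ticket_title(title: str) -> Optional[Tuple[str, str]]:
    """Return (class, method) if the title follows the convention, else None."""
    match = _TITLE_PATTERN.match(title)
    if match is None:
        return None
    return match.group("cls"), match.group("method")


# Usage with the example from the guide.
title = flaky_ticket_title("org.hiero.otter.test.HappyPathTest", "flakyTestA")
parsed = parse_flaky_ticket_title(title)
```

Keeping the title exact matters because a consistent title is the simplest way to tell whether a flaky test already has a ticket.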

## PR Comment Examples

Below are examples of the comments the `github-actions` bot will leave on PRs.

**Known flaky test (informational only):**

![Known Flaky Test Comment](pr-comment-ticket-already-exists.png)

**New flaky test (action required):**

![New Flaky Test Comment](pr-comment-new-ticket-created.png)

## Scope and Limitations

- The plugin is a **Gradle plugin** that operates solely within the `gradlew test` process. It cannot detect or retry failures that occur outside of this step.
- If a test depends on an external service and that service is unavailable, the test will fail and the plugin will retry it just like any other failure. These failures still need to be investigated using the same process you follow today. The plugin does not change the level of effort required for infrastructure-related failures that occur inside the test step.

## Summary of Responsibilities

| Scenario                           | Ticket Created Automatically? | Who Investigates?                                          |
|------------------------------------|-------------------------------|------------------------------------------------------------|
| New flaky test on a **PR**         | Yes                           | PR author (to determine if their changes caused the flake) |
| Known flaky test on a **PR**       | No (existing ticket linked)   | Assigned team manager                                      |
| New flaky test on **MATS / XTS**   | Yes                           | Assigned team manager                                      |
| Known flaky test on **MATS / XTS** | No (existing ticket linked)   | Assigned team manager                                      |
| New flaky test on a **Dry Run**    | No                            | Person who initiated the dry run                           |