Mute flaky WorkloadManagementIT tests #20950

Merged
andrross merged 1 commit into opensearch-project:main from andrross:wlmit-mute
Mar 20, 2026

@andrross
Member

Recent failures:

Time                        | test_name                                | build_number
--------------------------- | ---------------------------------------- | ------------
Mar 20, 2026 @ 11:23:14.479 | testHighCPUInEnforcedMode {css:false}    | 72965
Mar 16, 2026 @ 10:48:27.610 | testHighCPUInEnforcedMode {css:true}     | 72635
Feb 18, 2026 @ 20:55:00.772 | testHighCPUInEnforcedMode {css:true}     | 71503
Feb  3, 2026 @ 16:09:27.278 | testHighCPUInMonitorMode {css:true}      | 70913
Feb  3, 2026 @ 16:09:27.278 | testHighMemoryInMonitorMode {css:true}   | 70913
Feb  3, 2026 @ 16:09:27.278 | testHighCPUInMonitorMode {css:false}     | 70913
Feb  3, 2026 @ 16:09:27.278 | testHighMemoryInMonitorMode {css:false}  | 70913
Dec 26, 2025 @ 16:55:00.390 | testHighCPUInEnforcedMode {css:true}     | 69312
Dec 15, 2025 @ 15:36:15.851 | testHighCPUInEnforcedMode {css:true}     | 68815
Dec  2, 2025 @ 19:00:07.265 | testHighCPUInEnforcedMode {css:false}    | 68207
Nov 27, 2025 @ 09:55:11.983 | testHighMemoryInEnforcedMode {css:true}  | 67964
Nov 20, 2025 @ 14:36:39.346 | testHighCPUInEnforcedMode {css:true}     | 67665
Nov 19, 2025 @ 08:55:02.213 | testHighCPUInMonitorMode {css:true}      | 67608
Nov 18, 2025 @ 20:55:00.866 | testHighMemoryInEnforcedMode {css:false} | 67565
Nov 18, 2025 @ 00:44:05.665 | testHighMemoryInEnforcedMode {css:true}  | 67513
Nov  2, 2025 @ 12:55:00.423 | testHighCPUInEnforcedMode {css:false}    | 66616
Nov  2, 2025 @ 06:55:00.161 | testHighCPUInEnforcedMode {css:true}     | 66605
Nov  2, 2025 @ 06:55:00.161 | testHighMemoryInMonitorMode {css:false}  | 66605
Nov  2, 2025 @ 06:55:00.161 | testHighMemoryInEnforcedMode {css:false} | 66605
Nov  2, 2025 @ 06:55:00.161 | testHighCPUInMonitorMode {css:false}     | 66605
Nov  2, 2025 @ 06:55:00.161 | testNoCancellation {css:false}           | 66605
Oct 31, 2025 @ 09:47:40.830 | testHighCPUInEnforcedMode {css:false}    | 66532
Oct 30, 2025 @ 22:00:48.148 | testHighCPUInEnforcedMode {css:false}    | 66502
Oct 27, 2025 @ 16:25:21.579 | testHighCPUInEnforcedMode {css:true}     | 66317
Oct 24, 2025 @ 10:23:09.269 | testHighCPUInEnforcedMode {css:false}    | 66162
Oct 24, 2025 @ 01:55:00.887 | testHighMemoryInEnforcedMode {css:true}  | 66129
Oct 16, 2025 @ 15:55:00.959 | testHighCPUInEnforcedMode {css:true}     | 65655
Oct 13, 2025 @ 17:58:16.185 | testHighMemoryInEnforcedMode {css:true}  | 65458
Oct 13, 2025 @ 10:32:53.688 | testHighCPUInEnforcedMode {css:true}     | 65427
Oct  8, 2025 @ 11:27:12.471 | testHighCPUInEnforcedMode {css:false}    | 65191
Sep 30, 2025 @ 11:55:00.306 | testHighCPUInEnforcedMode {css:false}    | 64781
Sep 29, 2025 @ 18:48:51.923 | testHighCPUInEnforcedMode {css:false}    | 64691
Sep 28, 2025 @ 22:37:23.531 | testHighCPUInEnforcedMode {css:false}    | 64559
Sep 27, 2025 @ 11:26:15.151 | testHighMemoryInEnforcedMode {css:true}  | 64496
Sep 25, 2025 @ 09:22:26.007 | testHighCPUInEnforcedMode {css:true}     | 64373
Sep 22, 2025 @ 19:21:26.437 | testHighMemoryInEnforcedMode {css:false} | 64108
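
To see how the failures above are distributed across tests, the rows can be tallied by test name. This is only a rough sketch: the class `FailureTally`, its method `countByTest`, and the short constants are hypothetical names introduced for illustration, and the list simply transcribes the test_name column in table order with the `{css:...}` parameter dropped.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative helper, not part of OpenSearch: all names here are hypothetical.
class FailureTally {
    static final String CE = "testHighCPUInEnforcedMode";
    static final String CM = "testHighCPUInMonitorMode";
    static final String ME = "testHighMemoryInEnforcedMode";
    static final String MM = "testHighMemoryInMonitorMode";
    static final String NC = "testNoCancellation";

    // One entry per row of the "Recent failures" table, newest first.
    static final List<String> FAILURES = List.of(
        CE, CE, CE, CM, MM, CM, MM, CE, CE, CE, ME, CE,
        CM, ME, ME, CE, CE, MM, ME, CM, NC, CE, CE, CE,
        CE, ME, CE, ME, CE, CE, CE, CE, CE, ME, CE, ME
    );

    // Counts how many times each test name appears in the list.
    static Map<String, Integer> countByTest(List<String> failures) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : failures) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByTest(FAILURES).forEach((test, n) -> System.out.println(n + "\t" + test));
    }
}
```

Over the 36 rows listed, this gives 20 failures for testHighCPUInEnforcedMode, 8 for testHighMemoryInEnforcedMode, 4 for testHighCPUInMonitorMode, 3 for testHighMemoryInMonitorMode, and 1 for testNoCancellation.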

Related Issues

Related to #17154

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross requested a review from a team as a code owner March 20, 2026 20:36
@andrross added the skip-changelog and disabled-test labels Mar 20, 2026
@andrross
Member Author

FYI @kaushalmahi12

@github-actions
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Same Bug URL

All four muted tests (testHighCPUInEnforcedMode, testHighCPUInMonitorMode, testHighMemoryInEnforcedMode, testHighMemoryInMonitorMode) point to the same bug URL (issue #17154). Based on the PR description, the failures appear to have different characteristics (CPU vs. Memory, Enforced vs. Monitor mode). It should be verified whether a single issue truly covers all four test failures or whether separate tracking issues are needed.

@AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
public void testHighCPUInEnforcedMode() throws InterruptedException {
    Settings request = Settings.builder().put(WorkloadManagementSettings.WLM_MODE_SETTING.getKey(), ENABLED).build();
    assertAcked(client().admin().cluster().prepareUpdateSettings().setPersistentSettings(request).get());
    WorkloadGroup workloadGroup = new WorkloadGroup(
        "name",
        new MutableWorkloadGroupFragment(
            MutableWorkloadGroupFragment.ResiliencyMode.ENFORCED,
            Map.of(ResourceType.CPU, 0.01, ResourceType.MEMORY, 0.01)
        )
    );
    updateWorkloadGroupInClusterState(PUT, workloadGroup);
    Exception caughtException = executeWorkloadGroupTask(CPU, workloadGroup.get_id());
    assertNotNull("SearchTask should have been cancelled with TaskCancelledException", caughtException);
    MatcherAssert.assertThat(caughtException, instanceOf(TaskCancelledException.class));
    updateWorkloadGroupInClusterState(DELETE, workloadGroup);
}

@AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
public void testHighCPUInMonitorMode() throws InterruptedException {
    WorkloadGroup workloadGroup = new WorkloadGroup(
        "name",
        new MutableWorkloadGroupFragment(
            MutableWorkloadGroupFragment.ResiliencyMode.ENFORCED,
            Map.of(ResourceType.CPU, 0.01, ResourceType.MEMORY, 0.01)
        )
    );
    updateWorkloadGroupInClusterState(PUT, workloadGroup);
    Exception caughtException = executeWorkloadGroupTask(CPU, workloadGroup.get_id());
    assertNull(caughtException);
    updateWorkloadGroupInClusterState(DELETE, workloadGroup);
}

@AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
public void testHighMemoryInEnforcedMode() throws InterruptedException {
    Settings request = Settings.builder().put(WorkloadManagementSettings.WLM_MODE_SETTING.getKey(), ENABLED).build();
    assertAcked(client().admin().cluster().prepareUpdateSettings().setPersistentSettings(request).get());
    WorkloadGroup workloadGroup = new WorkloadGroup(
        "name",
        new MutableWorkloadGroupFragment(MutableWorkloadGroupFragment.ResiliencyMode.ENFORCED, Map.of(ResourceType.MEMORY, 0.01))
    );
    updateWorkloadGroupInClusterState(PUT, workloadGroup);
    Exception caughtException = executeWorkloadGroupTask(MEMORY, workloadGroup.get_id());
    assertNotNull("SearchTask should have been cancelled with TaskCancelledException", caughtException);
    MatcherAssert.assertThat(caughtException, instanceOf(TaskCancelledException.class));
    updateWorkloadGroupInClusterState(DELETE, workloadGroup);
}

@AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
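
One way to check which muted tests share a tracking issue is to group annotated methods by their `bugUrl` via reflection. The sketch below is self-contained and makes several assumptions: the nested `AwaitsFix` annotation is a hypothetical stand-in for the test framework's real annotation (declared here only so the example compiles on its own), `ExampleIT` is a hypothetical class with empty method bodies, and `MutedTestAudit` / `mutedByBugUrl` are illustrative names. `testNoCancellation` is left unannotated because the observation above names only four muted tests.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MutedTestAudit {
    // Hypothetical stand-in for the framework's @AwaitsFix; not the real annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface AwaitsFix {
        String bugUrl();
    }

    // Hypothetical test class: bodies are empty, only the annotations matter here.
    static class ExampleIT {
        @AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
        public void testHighCPUInEnforcedMode() {}

        @AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
        public void testHighCPUInMonitorMode() {}

        @AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
        public void testHighMemoryInEnforcedMode() {}

        @AwaitsFix(bugUrl = "https://github.com/opensearch-project/OpenSearch/issues/17154")
        public void testHighMemoryInMonitorMode() {}

        public void testNoCancellation() {}
    }

    // Maps each bug URL to the sorted names of the methods muted against it.
    static Map<String, Set<String>> mutedByBugUrl(Class<?> testClass) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Method method : testClass.getDeclaredMethods()) {
            AwaitsFix mute = method.getAnnotation(AwaitsFix.class);
            if (mute != null) {
                result.computeIfAbsent(mute.bugUrl(), url -> new TreeSet<>()).add(method.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        mutedByBugUrl(ExampleIT.class).forEach((url, tests) -> System.out.println(url + " -> " + tests));
    }
}
```

A single key with four method names in the result corresponds to the situation flagged above: every muted test is tracked by one issue.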

@github-actions
Contributor

❌ Gradle check result for a63eee6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

✅ Gradle check result for a63eee6: SUCCESS

@codecov

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.32%. Comparing base (5fb2c0a) to head (a63eee6).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20950      +/-   ##
============================================
+ Coverage     73.23%   73.32%   +0.08%     
- Complexity    72489    72581      +92     
============================================
  Files          5819     5819              
  Lines        331352   331353       +1     
  Branches      47875    47875              
============================================
+ Hits         242675   242971     +296     
+ Misses        69145    68840     -305     
- Partials      19532    19542      +10     
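
The coverage percentages in the diff can be reproduced from the Hits and Lines rows. The sketch below assumes coverage is computed as hits divided by lines and that the displayed values are truncated (not rounded) to two decimals; the truncation is an inference from the numbers shown, not something the report states. `CoverageCheck`, `percent`, and `display` are hypothetical names.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative arithmetic check; class and method names are hypothetical.
class CoverageCheck {
    // Coverage as a percentage, kept at 10 decimal places: 100 * hits / lines.
    static BigDecimal percent(long hits, long lines) {
        return BigDecimal.valueOf(hits * 100)
            .divide(BigDecimal.valueOf(lines), 10, RoundingMode.HALF_UP);
    }

    // Assumption: values are truncated to two decimals for display.
    static String display(BigDecimal value) {
        return value.setScale(2, RoundingMode.DOWN).toPlainString();
    }

    public static void main(String[] args) {
        BigDecimal base = percent(242675, 331352); // Hits / Lines on main
        BigDecimal head = percent(242971, 331353); // Hits / Lines on #20950
        System.out.println("base  " + display(base));
        System.out.println("head  " + display(head));
        // The delta is taken before truncation, then truncated for display.
        System.out.println("delta " + display(head.subtract(base)));
    }
}
```

Under these assumptions the sketch yields 73.23, 73.32, and 0.08, matching the Coverage row. Subtracting the two already-truncated percentages would give 0.09 instead, which suggests the delta is computed from the unrounded values.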

@andrross merged commit 3a6eb79 into opensearch-project:main Mar 20, 2026
46 of 49 checks passed
@andrross deleted the wlmit-mute branch March 20, 2026 22:30
Labels

disabled-test (Issues that are used by an AwaitsFix annotation to temporarily disable a broken test), skip-changelog
