Reinstate watch count check in tests with busy/wait #123979

lukewhiting · 2025-03-04T10:03:02Z

This PR addresses a failing test originally flagged in #109679 by reverting the workaround introduced in #111435. It wraps the stats request in a busy/wait to cope with the stats being temporarily unavailable while the watcher shards move between nodes and the watcher stats service restarts.

Looking into how those stats are generated, it's not so much that they reset on a move as that they take a short amount of time to repopulate while watches are loaded in again by the newly started instance of the watcher service on the node the shards have been moved to. Wrapping this request in a wait loop allows the tests to cope with this move.
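For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a busy/wait around a count check. It is not the code from this PR: the class `BusyWaitSketch`, the helper `awaitAssertion`, and the fake stats source `statsThatRepopulateAfter` are all hypothetical names made up for illustration, and the fake source only imitates stats that report zero watches for a few polls before repopulating.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

public class BusyWaitSketch {

    /**
     * Hypothetical helper: re-runs the check until it stops throwing
     * AssertionError, or rethrows the last failure once the timeout elapses.
     */
    public static void awaitAssertion(Runnable check, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // check passed, stop waiting
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time, surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ie);
            }
            // Back off between polls, capped at 100 ms.
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    /**
     * Hypothetical stand-in for a stats request: reports 0 watches for the
     * first emptyPolls calls, then 1, imitating stats that repopulate.
     */
    public static LongSupplier statsThatRepopulateAfter(int emptyPolls) {
        AtomicInteger polls = new AtomicInteger();
        return () -> polls.incrementAndGet() > emptyPolls ? 1L : 0L;
    }

    /** Throws AssertionError when the reported count differs from the expected one. */
    public static void assertWatchCount(LongSupplier stats, long expected) {
        long actual = stats.getAsLong();
        if (actual != expected) {
            throw new AssertionError("expected " + expected + " watches but stats reported " + actual);
        }
    }

    public static void main(String[] args) {
        LongSupplier watchCount = statsThatRepopulateAfter(3);
        // A single immediate check would fail here; polling rides out the gap.
        awaitAssertion(() -> assertWatchCount(watchCount, 1L), 10, TimeUnit.SECONDS);
        System.out.println("watch count reached 1");
    }
}
```

The point of the sketch is only that the assertion is retried for a bounded time instead of being evaluated once, so a brief window of empty stats does not fail the test while a genuinely wrong count still does.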

I let the change soak by running the affected tests on a loop for 2 hours on my laptop and no failures occurred during that time.

Fixes ES-9782

Copilot

PR Overview

This PR reinstates a watch count check in integration tests by reintroducing a busy/wait pattern to handle temporary unavailability of watcher stats during shard moves and watcher service restarts.

Reintroduces assertBusy checks to ensure the watch count is 1L before further test execution.
Updates three test methods in WatchAckTests.java to use WatcherStatsRequestBuilder for polling watcher stats.

Reviewed Changes

File	Description
x-pack/plugin/watcher/src/internalClusterTest/java/org/elasticsearch/xpack/watcher/test/integration/WatchAckTests.java	Adds busy/wait assertions using WatcherStatsRequestBuilder to ensure proper watch count

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

elasticsearchmachine · 2025-03-04T10:04:10Z

Pinging @elastic/es-data-management (Team:Data Management)

masseyke

LGTM. We can always revert again if it causes problems.

elasticsearchmachine · 2025-03-06T09:01:26Z

💚 Backport successful

Status	Branch	Result
✅	9.0
✅	8.x

lukewhiting added the >test-failure (Triaged test failures from CI), :Data Management/Watcher, auto-backport (Automatically create backport pull requests when merged), v9.0.0, v8.19.0, and v9.1.0 labels Mar 4, 2025

lukewhiting requested review from Copilot and masseyke March 4, 2025 10:03

Copilot AI reviewed Mar 4, 2025

elasticsearchmachine added the Team:Data Management (Meta label for data/management team) and needs:risk (Requires assignment of a risk label: low, medium, blocker) labels Mar 4, 2025

lukewhiting added the >test (Issues or PRs that are addressing/adding tests) label and removed the >test-failure (Triaged test failures from CI) and needs:risk (Requires assignment of a risk label: low, medium, blocker) labels Mar 4, 2025

Re-instate watch count check with busy/wait

6b6a869

lukewhiting force-pushed the es-9782-watcher-stats-reset branch from d86373c to 6b6a869 Compare March 4, 2025 10:57

masseyke approved these changes Mar 5, 2025

lukewhiting merged commit c702eb9 into elastic:main Mar 6, 2025
17 checks passed

This was referenced Mar 6, 2025

[9.0] Re-instate watch count check with busy/wait (#123979) #124187

Merged

[8.x] Re-instate watch count check with busy/wait (#123979) #124188

Merged

lukewhiting added a commit to lukewhiting/elasticsearch that referenced this pull request Mar 6, 2025

Re-instate watch count check with busy/wait (elastic#123979)

a73229c

elasticsearchmachine pushed a commit that referenced this pull request Mar 6, 2025

Re-instate watch count check with busy/wait (#123979) (#124187)

ac38ef2

elasticsearchmachine pushed a commit that referenced this pull request Mar 6, 2025

Re-instate watch count check with busy/wait (#123979) (#124188)

d4ffe34

lukewhiting deleted the es-9782-watcher-stats-reset branch March 6, 2025 10:08

georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Mar 11, 2025

Re-instate watch count check with busy/wait (elastic#123979)

78830f0

costin pushed a commit to costin/elasticsearch that referenced this pull request Mar 15, 2025

Re-instate watch count check with busy/wait (elastic#123979)

6936576

3 participants
