Skip to content

Fix flaky testConsumerReinitializationWithNoInitialMessages test#20934

Merged
varunbharadwaj merged 1 commit intoopensearch-project:mainfrom
varunbharadwaj:vb/pbiflakyfix
Mar 20, 2026
Merged

Fix flaky testConsumerReinitializationWithNoInitialMessages test#20934
varunbharadwaj merged 1 commit intoopensearch-project:mainfrom
varunbharadwaj:vb/pbiflakyfix

Conversation

@varunbharadwaj
Copy link
Contributor

@varunbharadwaj varunbharadwaj commented Mar 20, 2026

Description

DefaultStreamPollerTests.testConsumerReinitializationWithNoInitialMessages test needs to wait for the consumer to be reinitialized before publishing the message, otherwise it'll read the message before and after reinitialization resulting in processing the message twice. This is not an issue, but results in flaky test. We update the test for deterministic result.

Related Issues

Resolves #20069

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
@github-actions github-actions bot added >test-failure Test failure from CI, local build, etc. autocut flaky-test Random test failure that succeeds on second run labels Mar 20, 2026
@varunbharadwaj varunbharadwaj marked this pull request as ready for review March 20, 2026 00:49
@varunbharadwaj varunbharadwaj requested a review from a team as a code owner March 20, 2026 00:49
@github-actions
Copy link
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Race Condition

The assertBusy check verifies that the consumer has been reinitialized by comparing object references (assertNotSame). However, there is a potential race condition where the new consumer object is created but not yet fully initialized or ready to process messages before the message is added. Validate that getConsumer() returning a different reference guarantees the consumer is fully operational and ready to consume messages.

assertBusy(() -> {
    IngestionShardConsumer currentConsumer = poller.getConsumer();
    assertNotNull(currentConsumer);
    assertNotSame(oldConsumer, currentConsumer);
}, 30, TimeUnit.SECONDS);

// Add a message after reinitialization is complete
messages.add("{\"_id\":\"1\",\"_source\":{\"name\":\"bob\", \"age\": 24}}".getBytes(StandardCharsets.UTF_8));

@github-actions
Copy link
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Validate old consumer is non-null before comparison

If poller.getConsumer() returns null before reinitialization has started,
oldConsumer will be null, and assertNotSame(null, null) would pass immediately even
if reinitialization hasn't actually occurred. You should also assert that
currentConsumer is not null AND that the reinitialization has truly completed by
checking that oldConsumer was non-null before comparing, or by adding an additional
state check.

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [699-707]

 IngestionShardConsumer oldConsumer = poller.getConsumer();
+assertNotNull("Old consumer should not be null before reinitialization", oldConsumer);
 IngestionSource mockIngestionSource = new IngestionSource.Builder("test").build();
 poller.requestConsumerReinitialization(mockIngestionSource);
 
 assertBusy(() -> {
     IngestionShardConsumer currentConsumer = poller.getConsumer();
     assertNotNull(currentConsumer);
     assertNotSame(oldConsumer, currentConsumer);
 }, 30, TimeUnit.SECONDS);
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies a potential edge case where oldConsumer could be null, causing assertNotSame(null, null) to pass trivially without verifying actual reinitialization. Adding an assertNotNull check on oldConsumer before the reinitialization request would make the test more robust and meaningful.

Low

@github-actions
Copy link
Contributor

✅ Gradle check result for a1db51f: SUCCESS

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.31%. Comparing base (9f5d0e1) to head (a1db51f).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##               main   #20934    +/-   ##
==========================================
  Coverage     73.30%   73.31%            
- Complexity    72484    72567    +83     
==========================================
  Files          5819     5819            
  Lines        331155   331352   +197     
  Branches      47840    47875    +35     
==========================================
+ Hits         242769   242945   +176     
- Misses        68876    68911    +35     
+ Partials      19510    19496    -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@varunbharadwaj varunbharadwaj merged commit 06539a4 into opensearch-project:main Mar 20, 2026
46 of 48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

autocut flaky-test Random test failure that succeeds on second run skip-changelog >test-failure Test failure from CI, local build, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for DefaultStreamPollerTests

2 participants