Fix flaky DerivedSourceLeafReaderTests #20904
testWithRandomDocuments was flaky due to two issues:

1. NoMergePolicy blocked forceMerge(1), so randomized IndexWriterConfig settings that caused multiple flushes resulted in multiple segments instead of the expected one.
2. The test stored _source in the index, but DerivedSourceStoredFields injects _source via the source provider then delegates to the underlying stored fields. This caused the visitor to receive _source twice, with the second call using Lucene's internal buffer which could differ from the original bytes.

Fix by removing NoMergePolicy so forceMerge(1) works, and not storing _source in the index, which matches the production use case where derived source synthesizes _source from other data.

Signed-off-by: Andrew Ross <andrross@amazon.com>
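The double delivery of _source described in the second issue can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: FieldVisitor, StoredDoc, and DerivedSourceWrapper are hypothetical stand-ins written for this example, not the Lucene or OpenSearch classes, and the model shows only how many times the visitor sees _source, not the buffer difference mentioned above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DerivedSourceVisitSketch {

    /** Hypothetical stand-in for a stored-fields visitor (not the Lucene API). */
    interface FieldVisitor {
        void binaryField(String name, byte[] value);
    }

    /** Hypothetical model of the fields physically stored for one document. */
    static final class StoredDoc {
        private final Map<String, byte[]> fields = new LinkedHashMap<>();

        StoredDoc put(String name, String value) {
            fields.put(name, value.getBytes(StandardCharsets.UTF_8));
            return this;
        }

        void visit(FieldVisitor visitor) {
            // Hand every stored field to the visitor, in insertion order.
            fields.forEach(visitor::binaryField);
        }
    }

    /** Hypothetical wrapper: injects a derived _source, then delegates. */
    static final class DerivedSourceWrapper {
        private final StoredDoc delegate;
        private final byte[] derivedSource;

        DerivedSourceWrapper(StoredDoc delegate, String derivedSource) {
            this.delegate = delegate;
            this.derivedSource = derivedSource.getBytes(StandardCharsets.UTF_8);
        }

        void visit(FieldVisitor visitor) {
            // Step 1: the synthesized _source is always delivered.
            visitor.binaryField("_source", derivedSource);
            // Step 2: the underlying fields follow; if _source was also
            // stored, the visitor receives it a second time here.
            delegate.visit(visitor);
        }
    }

    /** Counts how often a visitor is handed the _source field. */
    static int countSourceVisits(boolean storeSource) {
        String json = "{\"title\":\"hello\"}";
        StoredDoc doc = new StoredDoc().put("title", "hello");
        if (storeSource) {
            doc.put("_source", json);
        }
        DerivedSourceWrapper wrapper = new DerivedSourceWrapper(doc, json);
        List<String> seen = new ArrayList<>();
        wrapper.visit((name, value) -> {
            if ("_source".equals(name)) {
                seen.add(name);
            }
        });
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countSourceVisits(true));  // prints 2
        System.out.println(countSourceVisits(false)); // prints 1
    }
}
```

In this model, leaving _source out of the stored fields is what brings the count back to one, which mirrors the fix of not storing _source in the test index.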
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@             Coverage Diff              @@
##               main   #20904      +/-   ##
============================================
+ Coverage     73.32%   73.39%   +0.06%
- Complexity    72272    72387     +115
============================================
  Files          5797     5802       +5
  Lines        330323   330405      +82
  Branches      47676    47686      +10
============================================
+ Hits         242215   242497     +282
+ Misses        68663    68514     -149
+ Partials      19445    19394      -51
Related Issues
Resolves #20812
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.