Virtual Shards Phase 2 - Filter and Extract #20896
atris wants to merge 3 commits into opensearch-project:main
Add the Phase 2 storage primitive for virtual shards by extracting documents for a target vShard into a standalone Lucene index.

- add VirtualShardFilteredMergePolicy (filter-on-write extraction)
- add IndexShard#extractVirtualShard(int, Path) with validation
- add VirtualShardRoutingHelper#computeVirtualShardId and use it from OperationRouting to keep routing/extraction parity

Also address post-PR review findings for phase 1 PR.

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
PR Code Analyzer ❗ AI-powered 'Code-Diff-Analyzer' found issues on commit d737bf4.
PR Code Analyzer ❗ AI-powered 'Code-Diff-Analyzer' found issues on commit 9194a9f.
Codecov Report
❌ Patch coverage is

@@             Coverage Diff              @@
##               main   #20896      +/-   ##
============================================
+ Coverage     73.31%   73.32%   +0.01%
- Complexity    72506    72509       +3
============================================
  Files          5819     5820       +1
  Lines        331095   331192      +97
  Branches      47829    47853      +24
============================================
+ Hits         242749   242860     +111
+ Misses        68831    68812      -19
- Partials      19515    19520       +5
This change adds the Phase 2 virtual-shard extraction primitive.
It introduces IndexShard#extractVirtualShard(int, Path), backed by a Lucene filter-on-write extraction path (VirtualShardFilteredMergePolicy) that writes a standalone index containing only documents for the target virtual shard.
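To illustrate the filter-on-write idea in isolation, here is a minimal Java sketch that operates on an in-memory list of routing keys rather than on a Lucene index. The class name `VirtualShardExtractSketch`, the `extract` method, and the `vShardOf` parameter are hypothetical and do not appear in this PR; the real extraction goes through `VirtualShardFilteredMergePolicy`, whose internals are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical illustration only: not the PR's implementation.
final class VirtualShardExtractSketch {
    private VirtualShardExtractSketch() {}

    /**
     * Copies only the entries whose computed vShard id equals targetVShard.
     * The source list is left untouched, mirroring the idea of writing a
     * standalone copy instead of mutating the original shard.
     */
    static List<String> extract(
        List<String> sourceRoutingKeys,
        int targetVShard,
        int numVirtualShards,
        ToIntFunction<String> vShardOf // assumed to be the same function routing uses
    ) {
        // Validation: reject impossible targets before doing any work.
        if (numVirtualShards <= 0) {
            throw new IllegalArgumentException("numVirtualShards must be positive");
        }
        if (targetVShard < 0 || targetVShard >= numVirtualShards) {
            throw new IllegalArgumentException("targetVShard out of range: " + targetVShard);
        }
        List<String> extracted = new ArrayList<>();
        for (String key : sourceRoutingKeys) {
            // Filter on write: only matching documents reach the output.
            if (vShardOf.applyAsInt(key) == targetVShard) {
                extracted.add(key);
            }
        }
        return extracted;
    }
}
```

Passing the assignment function in as a parameter keeps the filter itself independent of how vShard ids are computed, which is one way to make sure extraction cannot drift from routing.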
Routing parity is preserved by sharing virtual-shard computation with OperationRouting through VirtualShardRoutingHelper.computeVirtualShardId(...). IndexMetadata also caches index.number_of_virtual_shards for direct access.
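The parity argument can be sketched as a single shared function that both the routing path and the extraction path call. The sketch below rests on stated assumptions: the FNV-1a hash is a stand-in for whatever hash the real helper uses, the class name `VirtualShardRoutingSketch` is hypothetical, and the signature shown for `computeVirtualShardId` is a guess; only the method name comes from the PR.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not the PR's VirtualShardRoutingHelper.
final class VirtualShardRoutingSketch {
    private VirtualShardRoutingSketch() {}

    // Stand-in hash (32-bit FNV-1a). The real helper may use a different hash.
    static int hash(String routing) {
        int h = 0x811C9DC5;
        for (byte b : routing.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Single source of truth: routing and extraction both call this method,
    // so a document is always extracted into the vShard it was routed to.
    static int computeVirtualShardId(String routing, int numVirtualShards) {
        if (numVirtualShards <= 0) {
            throw new IllegalArgumentException("numVirtualShards must be positive");
        }
        // floorMod keeps the result in [0, numVirtualShards) for negative hashes.
        return Math.floorMod(hash(routing), numVirtualShards);
    }
}
```

Using `Math.floorMod` rather than `%` matters in a sketch like this because Java's `%` can return a negative value when the hash is negative.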
What is included:
Out of scope:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.