Virtual Shards Phase 2 - Filter and Extract by atris · Pull Request #20896 · opensearch-project/OpenSearch

atris · 2026-03-17T18:06:22Z

This change adds the Phase 2 virtual-shard extraction primitive.

It introduces IndexShard#extractVirtualShard(int, Path), backed by a Lucene filter-on-write extraction path (VirtualShardFilteredMergePolicy) that writes a standalone index containing only documents for the target virtual shard.

Routing parity is preserved by sharing virtual-shard computation with OperationRouting through VirtualShardRoutingHelper.computeVirtualShardId(...). IndexMetadata also caches index.number_of_virtual_shards for direct access.

What is included:

virtual-shard extraction utility in server merge layer
IndexShard integration entrypoint with validation
shared vShard computation helper used by routing and extraction
tests for extraction correctness, empty extraction, routing-field behavior, partitioned routing parity, override-map invariants, closed-shard behavior, bounds checks, and overwrite behavior for existing output paths

Out of scope:

transport orchestration/state machine
remap workflow APIs and cluster-level move sequencing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Add the Phase 2 storage primitive for virtual shards by extracting documents for a target vShard into a standalone Lucene index. - add VirtualShardFilteredMergePolicy (filter-on-write extraction) - add IndexShard#extractVirtualShard(int, Path) with validation - add VirtualShardRoutingHelper#computeVirtualShardId and use it from OperationRouting to keep routing/extraction parity Also address post-PR review findings for phase 1 PR. Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

github-actions · 2026-03-17T18:07:37Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit d737bf4.

Path	Line	Severity	Description
server/src/main/java/org/opensearch/index/shard/IndexShard.java	1694	medium	extractVirtualShard() accepts a caller-controlled java.nio.file.Path with no path validation, sandboxing, or authorization check beyond verifyNotClosed(). If this method is wired to a REST/transport handler in a subsequent commit, it becomes an arbitrary filesystem write primitive that could copy shard data to any path accessible by the JVM process.
server/src/main/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicy.java	68	medium	isolateVirtualShard() opens FSDirectory.open(outputPath) with OpenMode.CREATE (line ~89), which silently overwrites any existing index at that path. Combined with the unchecked caller-supplied path, this is a potential data-overwrite vector should the API surface be extended; the absence of path canonicalization or a chroot-style check is anomalous for a production code path that touches the filesystem.
server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java	19	low	The @publicapi(since="3.6.0") annotation was removed from VirtualShardRoutingHelper without explanation. This reduces API-stability tracking and could be an attempt to make the class harder to audit as a public contract, though it may also reflect a legitimate decision to keep the class internal.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 2 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

github-actions · 2026-03-18T09:58:12Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 9194a9f.

Path	Line	Severity	Description
server/src/main/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicy.java	64	low	isolateVirtualShard() writes Lucene index data to a caller-supplied outputPath via FSDirectory.open() with no path validation or containment checks. If this method is surfaced through an admin or REST API without strict authorization controls, it could allow writing index data to arbitrary filesystem locations accessible by the JVM process. The concern is a design-level path traversal risk rather than intentional malice — the feature appears legitimately scoped to internal shard operations, but no call-site validation is enforced here.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-18T09:59:57Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Virtual Shard routing helper and IndexMetadata caching Relevant files: server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java server/src/test/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelperTests.java Sub-PR theme: Virtual Shard extraction primitive and IndexShard integration Relevant files: server/src/main/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicy.java server/src/main/java/org/opensearch/index/shard/IndexShard.java server/src/test/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicyTests.java server/src/test/java/org/opensearch/index/shard/IndexShardVirtualShardTests.java
⚡ Recommended focus areas for review Double liveDocs Check In `VirtualShardFilterReader`, the `liveDocs` Bits implementation re-checks `srcLive` at query time, but the membership array was already built by skipping deleted docs (lines 126-128). This double-check is redundant and could mask correctness issues if `sourceLiveDocs` changes between construction and use. The membership array should be the sole source of truth for the filtered live docs. this.liveDocs = new Bits() { @Override public boolean get(int index) { if (srcLive != null && !srcLive.get(index)) { return false; } return membershipFinal[index]; } @Override public int length() { return maxDoc; } }; Null CacheHelper `getCoreCacheHelper()` and `getReaderCacheHelper()` both return `null`. Returning `null` from these methods can cause `NullPointerException` in Lucene internals that rely on cache helpers (e.g., query caches, field caches). Consider delegating to the wrapped reader's cache helpers or documenting why null is safe here. public CacheHelper getCoreCacheHelper() { return null; } @Override public CacheHelper getReaderCacheHelper() { return null; } Skipped Segments In `isolateVirtualShard`, if a leaf reader cannot be unwrapped to a `CodecReader`, it is silently skipped with only a trace log. This means documents in that segment are silently excluded from the extraction output without any warning or error, potentially producing an incomplete virtual shard index. if (codecReader == null) { logger.trace("skipping leaf [{}]: unable to unwrap to CodecReader", leafReader.getClass()); continue; Routing Parity Risk `computeVirtualShardId` recomputes `effectiveRouting` and `partitionOffset` independently from `OperationRouting.generateShardId`. If `OperationRouting` is updated (e.g., changes to how `effectiveRouting` is derived), the two implementations can silently diverge, causing extraction to assign documents to wrong virtual shards. Consider consolidating the hash logic into a single shared method. public static int computeVirtualShardId(IndexMetadata indexMetadata, String id, String routing) { int numVirtualShards = indexMetadata.getNumberOfVirtualShards(); if (numVirtualShards <= 0) { throw new IllegalStateException("virtual shards are not enabled on index [" + indexMetadata.getIndex() + "]"); } String effectiveRouting = (routing != null) ? routing : id; int partitionOffset = indexMetadata.isRoutingPartitionedIndex() ? Math.floorMod(Murmur3HashFunction.hash(id), indexMetadata.getRoutingPartitionSize()) : 0; int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset; return Math.floorMod(hash, numVirtualShards); } Override Map Test `testExtractWithOverrideMap` builds an `overriddenMetadata` object but never uses it in an actual extraction call. The test only validates `resolvePhysicalShardId` with the override map, but does not verify that extraction with an overridden metadata produces the correct output. The test name is misleading relative to what is actually being tested. public void testExtractWithOverrideMap() throws Exception { IndexShard shard = newStartedShard(true, virtualShardSettings()); int docCount = 50; for (int i = 0; i < docCount; i++) { indexDoc(shard, "_doc", String.valueOf(i), "{\"value\":" + i + "}"); } shard.refresh("test"); flushShard(shard, true); int targetVShard = 0; int expectedCount = 0; IndexMetadata indexMetadata = shard.indexSettings.getIndexMetadata(); for (int i = 0; i < docCount; i++) { int vShardId = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, String.valueOf(i), null); if (vShardId == targetVShard) { expectedCount++; } } Path tempPath = createTempDir("vshard_override"); shard.extractVirtualShard(targetVShard, tempPath); try (Directory dir = FSDirectory.open(tempPath); DirectoryReader reader = DirectoryReader.open(dir)) { assertEquals("Override map should not change vShard membership", expectedCount, reader.numDocs()); } int defaultPhysical = VirtualShardRoutingHelper.resolvePhysicalShardId(shard.indexSettings.getIndexMetadata(), targetVShard); assertEquals(0, defaultPhysical); Map<String, String> overrides = new HashMap<>(); overrides.put(String.valueOf(targetVShard), "3"); IndexMetadata.Builder overrideBuilder = IndexMetadata.builder("test_override") .settings( Settings.builder() .put(IndexMetadata.SETTING_VERSION_CREATED, Version.CURRENT) .put(IndexMetadata.SETTING_NUMBER_OF_VIRTUAL_SHARDS, TOTAL_VIRTUAL_SHARDS) ) .numberOfShards(5) .numberOfReplicas(1); overrideBuilder.putCustom(VirtualShardRoutingHelper.VIRTUAL_SHARDS_CUSTOM_METADATA_KEY, overrides); IndexMetadata overriddenMetadata = overrideBuilder.build(); int overriddenPhysical = VirtualShardRoutingHelper.resolvePhysicalShardId(overriddenMetadata, targetVShard); assertEquals(3, overriddenPhysical); closeShards(shard); }

github-actions · 2026-03-18T10:01:27Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Pass pre-computed effective routing to avoid inconsistency The original code used `effectiveRouting` (which defaults to `id` when `routing` is null) and `partitionOffset` already computed above. The new `computeVirtualShardId` call recomputes both internally, but passes the raw `routing` parameter instead of `effectiveRouting`. While `computeVirtualShardId` handles null routing by falling back to `id`, the `partitionOffset` is now computed twice — once here and once inside the helper — which is redundant but not incorrect. However, the `effectiveRouting` variable computed just before this block is now unused, which may cause a compiler warning and indicates a subtle inconsistency in the refactoring. server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [541] -int vShardId = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, id, routing); +int vShardId = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, id, effectiveRouting); Suggestion importance[1-10]: 3 __ Why: While `computeVirtualShardId` handles null routing by falling back to `id` internally, passing `effectiveRouting` instead of `routing` would be more consistent with the surrounding code. However, the behavior is functionally equivalent since both handle the null-routing case the same way, making this a minor style/consistency improvement.	Low
	Guard test assumption about empty virtual shard The test assumes that `(occupiedVShard + 1) % TOTAL_VIRTUAL_SHARDS` is an empty virtual shard, but with only one document this is likely true. However, this is not guaranteed — if by chance the next vShard also maps to a document (not possible with one doc, but fragile as a pattern). More importantly, the test does not assert that `emptyVShard` is actually different from `occupiedVShard`, which would be a problem if `TOTAL_VIRTUAL_SHARDS == 1`. Add an assertion to guard this assumption. server/src/test/java/org/opensearch/index/shard/IndexShardVirtualShardTests.java [107-115] -public void testExtractEmptyVirtualShard() throws Exception { - IndexShard shard = newStartedShard(true, virtualShardSettings()); - indexDoc(shard, "_doc", "1", "{\"value\":1}"); - shard.refresh("test"); - flushShard(shard, true); +IndexMetadata indexMetadata = shard.indexSettings.getIndexMetadata(); +int occupiedVShard = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, "1", null); +int emptyVShard = (occupiedVShard + 1) % TOTAL_VIRTUAL_SHARDS; +assumeTrue("Need at least 2 virtual shards for this test", TOTAL_VIRTUAL_SHARDS > 1); +assertNotEquals(occupiedVShard, emptyVShard); - IndexMetadata indexMetadata = shard.indexSettings.getIndexMetadata(); - int occupiedVShard = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, "1", null); - int emptyVShard = (occupiedVShard + 1) % TOTAL_VIRTUAL_SHARDS; - Suggestion importance[1-10]: 3 __ Why: Since `TOTAL_VIRTUAL_SHARDS` is a constant set to 20, the guard `assumeTrue("Need at least 2 virtual shards", TOTAL_VIRTUAL_SHARDS > 1)` would always pass, making it a no-op. The `assertNotEquals` is also trivially true with 20 virtual shards. This is a minor defensive improvement with negligible practical impact.	Low
	Document reader lifecycle to prevent resource confusion The `wrappedReaders` are `FilterCodecReader` instances wrapping readers from the `searcher` acquired outside this try block. The `IndexWriter` is closed before the `searcher` is closed (since the searcher's try-with-resources wraps this block), so the underlying readers remain valid. However, if `wrappedReaders` is empty, `writer.commit()` still runs and creates an empty index — this is intentional per the method contract, but the `forceMerge(1)` call inside the `if` block means an empty index won't be force-merged, which is fine. The real concern is that `wrappedReaders` are never explicitly closed; `FilterCodecReader` does not own the underlying reader, so this is acceptable, but it should be documented. server/src/main/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicy.java [84-91] try (Directory outDir = FSDirectory.open(outputPath); IndexWriter writer = new IndexWriter(outDir, iwc)) { if (!wrappedReaders.isEmpty()) { writer.addIndexes(wrappedReaders.toArray(new CodecReader[0])); writer.forceMerge(1); } + // commit even if empty to produce a valid index at outputPath writer.commit(); } +// wrappedReaders wrap the searcher's readers and are not owned here; closed via searcher Suggestion importance[1-10]: 1 __ Why: This suggestion only adds comments to the existing code without changing any logic. The `improved_code` is functionally identical to the `existing_code`, making this a documentation-only change with minimal impact.	Low
Possible issue	Reset field visitor state between documents The `leftToVisit` counter is decremented every time `_id` or `_routing` fields are visited, but `needsField` can be called multiple times for the same field name across different documents in a batch. Since `VShardFieldVisitor` is reused per document in the loop, `leftToVisit` will reach 0 after the first document and cause early `STOP` for subsequent documents. A new `VShardFieldVisitor` instance should be created per document, or `leftToVisit` should be reset between documents. server/src/main/java/org/opensearch/index/merge/VirtualShardFilteredMergePolicy.java [203-212] -@Override -public Status needsField(FieldInfo fieldInfo) { - switch (fieldInfo.name) { - case IdFieldMapper.NAME: - case RoutingFieldMapper.NAME: - leftToVisit--; - return Status.YES; - default: - return leftToVisit == 0 ? Status.STOP : Status.NO; +for (int docId = 0; docId < maxDoc; docId++) { + if (sourceLiveDocs != null && !sourceLiveDocs.get(docId)) { + continue; + } + VShardFieldVisitor visitor = new VShardFieldVisitor(); // new instance per doc + storedFields.document(docId, visitor); + if (visitor.id == null) { + continue; + } + int computedVShard = VirtualShardRoutingHelper.computeVirtualShardId(indexMetadata, visitor.id, visitor.routing); + if (computedVShard == targetVShardId) { + membership[docId] = true; + count++; } } Suggestion importance[1-10]: 2 __ Why: Looking at the actual code in the PR, a new `VShardFieldVisitor` instance is already created per document in the loop (`VShardFieldVisitor visitor = new VShardFieldVisitor()` at line 129), so `leftToVisit` is reset for each document. The suggestion's concern about reuse is not applicable here, making this suggestion incorrect.	Low

github-actions · 2026-03-18T11:29:55Z

✅ Gradle check result for 9194a9f: SUCCESS

codecov · 2026-03-18T11:38:26Z

Codecov Report

❌ Patch coverage is 73.00000% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.32%. Comparing base (6562711) to head (9194a9f).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
...h/index/merge/VirtualShardFilteredMergePolicy.java	67.94%	14 Missing and 11 partials ⚠️
...in/java/org/opensearch/index/shard/IndexShard.java	80.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #20896      +/-   ##
============================================
+ Coverage     73.31%   73.32%   +0.01%     
- Complexity    72506    72509       +3     
============================================
  Files          5819     5820       +1     
  Lines        331095   331192      +97     
  Branches      47829    47853      +24     
============================================
+ Hits         242749   242860     +111     
+ Misses        68831    68812      -19     
- Partials      19515    19520       +5

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

atris added 2 commits March 17, 2026 20:13

Formatting fixes

d737bf4

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

atris requested review from a team, Bukhtawar, CEHENKLE, Rishikesh1159, anasalkouz, andrross, ashking94, cwperks, dbwiddis, gbbafna, jed326, kotwanikunal, mch2, msfroh, owaiskazi19, reta, sachinpkale, saratvemulapalli, shwetathareja and sohami as code owners March 17, 2026 18:06

atris added the skip-changelog label Mar 17, 2026

Merge origin/main into vshards_p2

9194a9f

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Virtual Shards Phase 2 - Filter and Extract#20896

Virtual Shards Phase 2 - Filter and Extract#20896
atris wants to merge 3 commits intoopensearch-project:mainfrom
atris:vshards_p2

atris commented Mar 17, 2026

Uh oh!

github-actions bot commented Mar 17, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

codecov bot commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

atris commented Mar 17, 2026

Uh oh!

github-actions bot commented Mar 17, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 18, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 18, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions bot commented Mar 18, 2026

PR Code Suggestions ✨

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

codecov bot commented Mar 18, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant