Skip to content

Add Initial Routing and Metadata Groundwork for Virtual Shards#20745

Merged
atris merged 6 commits intoopensearch-project:mainfrom
atris:vshards_p1
Mar 14, 2026
Merged

Add Initial Routing and Metadata Groundwork for Virtual Shards#20745
atris merged 6 commits intoopensearch-project:mainfrom
atris:vshards_p1

Conversation

@atris
Copy link
Contributor

@atris atris commented Feb 27, 2026

This PR adds the initial routing and metadata groundwork for virtual shards.

When enabled, routing now uses a virtual shard id (vShardId) before resolving to a physical shard. This separates hash-space size from physical shard count and prepares the path for future shard movement workflows.

What's Included

  • New static index setting: index.number_of_virtual_shards (default -1, disabled).
  • Routing update in OperationRouting.generateShardId:
    • virtual-shard path when enabled,
    • existing behavior unchanged when disabled.
  • New VirtualShardRoutingHelper for vShardId -> physical shard resolution.
  • Optional per-index override support via virtual_shards_routing custom metadata.
  • Test coverage for:
    • enabled/disabled routing behavior,
    • validation rules,
    • override and fallback behavior.

Out of scope

  • Side-car segment extraction flow.
  • Transport/state-orchestration for managing override lifecycle.

Resolves #18809

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@atris atris requested a review from a team as a code owner February 27, 2026 18:15
@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request ShardManagement:Placement labels Feb 27, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Feb 27, 2026

PR Reviewer Guide 🔍

(Review updated until commit d13a28c)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add virtual shard metadata setting and resolution helper

Relevant files:

  • server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java
  • server/src/main/java/org/opensearch/common/settings/IndexScopedSettings.java
  • server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java
  • server/src/test/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelperTests.java

Sub-PR theme: Integrate virtual shard routing into OperationRouting

Relevant files:

  • server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java
  • server/src/test/java/org/opensearch/cluster/routing/VirtualShardOperationRoutingTests.java

⚡ Recommended focus areas for review

Inefficient Setting Read

getNumberOfVirtualShards() reads from settings via getAsInt on every call instead of caching the value during construction (like numberOfShards and numberOfReplicas are cached). This could cause performance issues in hot routing paths.

public int getNumberOfVirtualShards() {
    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
}
Silent Override Failure

When an override value is invalid (out of bounds or not a number), the code silently falls back to range-based routing with only a trace-level log. This could cause unexpected routing behavior in production without any visible warning. Consider logging at WARN level or surfacing this as a validation error when overrides are set.

if (overrides != null) {
    String pShardIdStr = overrides.get(String.valueOf(vShardId));
    if (pShardIdStr != null) {
        try {
            int pShardId = Integer.parseInt(pShardIdStr);
            if (pShardId >= 0 && pShardId < numPhysicalShards) {
                return pShardId;
            }
            logger.trace("Invalid override value [{}] for vShard [{}]: out of bounds", pShardId, vShardId);
        } catch (NumberFormatException e) {
            logger.trace("Invalid override value [{}] for vShard [{}]: not a number", pShardIdStr, vShardId);
        }
    }
Partition Offset Interaction

When virtual shards are enabled with a partitioned index, the partitionOffset is added to the routing hash before computing vShardId. This means the same routing key can map to different virtual shards depending on the document ID, which may be intentional but could lead to uneven distribution across virtual shards. The interaction between virtual shards and routing partitions should be explicitly documented or tested for distribution uniformity.

int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
if (numVirtualShards != -1) {
    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
    int vShardId = Math.floorMod(hash, numVirtualShards);
    return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
}
Missing numVirtualShards == -1 Check

The validation in resolvePhysicalShardId checks numVirtualShards < numPhysicalShards, but when numVirtualShards == -1 (disabled), the condition -1 < numPhysicalShards is true and throws. However, the error message says "must be enabled and be a multiple" which is correct, but the check could be made more explicit by first checking numVirtualShards == -1 separately for clarity and to avoid relying on the side-effect of the less-than comparison.

if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
    throw new IllegalArgumentException(
        "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
    );
}

@github-actions
Copy link
Contributor

github-actions bot commented Feb 27, 2026

PR Code Suggestions ✨

Latest suggestions up to d13a28c

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard against zero or negative virtual shard count

The validation check numVirtualShards < numPhysicalShards does not correctly handle
the disabled case where numVirtualShards == -1. When virtual shards are disabled
(-1), the condition -1 < numPhysicalShards is true, so the error message "must be
enabled" is technically correct, but the check should explicitly handle the disabled
case separately for clarity and correctness. Additionally, if numVirtualShards == 0,
the modulo check would throw an ArithmeticException (division by zero) before the
validation error.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [39-43]

 int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
 int numPhysicalShards = indexMetadata.getNumberOfShards();
 
-if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
+if (numVirtualShards <= 0 || numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
     throw new IllegalArgumentException(
         "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
     );
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that numVirtualShards == 0 would cause an ArithmeticException in the modulo check before the validation error. Adding numVirtualShards <= 0 as the first condition prevents this edge case and makes the disabled check more explicit. However, the setting's minimum value is -1 and the IndexMetadata.build() validation already prevents 0 from being a valid value in practice.

Low
General
Cache virtual shard count as a field

getNumberOfVirtualShards() reads directly from settings on every call, unlike
getNumberOfShards() which returns a cached field numberOfShards. This is
inconsistent and may cause performance issues if called frequently in hot paths like
shard routing. The value should be cached as a field during construction, similar to
how numberOfShards and numberOfReplicas are handled.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [1350-1352]

+// In the class fields section, add:
+private final int numberOfVirtualShards;
+
+// In the constructor, initialize it:
+this.numberOfVirtualShards = INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING.get(settings);
+
+// Then the getter becomes:
 public int getNumberOfVirtualShards() {
-    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
+    return numberOfVirtualShards;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid — getNumberOfVirtualShards() reads from settings on every call while other similar values like numberOfShards are cached as fields. Caching it would be more consistent and performant in hot routing paths. However, the improved_code shows a structural change across multiple locations rather than a direct snippet replacement.

Low
Ensure hash arithmetic handles integer overflow safely

The hash computation here re-hashes effectiveRouting independently, but the existing
calculateScaledShardId path uses effectiveRouting with the same partitionOffset.
However, the virtual shard path computes Murmur3HashFunction.hash(effectiveRouting)
+ partitionOffset while the test in testGenerateVirtualShardId uses only
Murmur3HashFunction.hash(routing) without partitionOffset for the non-partitioned
case. This is consistent since partitionOffset is 0 for non-partitioned indices. The
real concern is that Murmur3HashFunction.hash(effectiveRouting) + partitionOffset
could overflow int range when partitionOffset is large, potentially producing
unexpected results. Use Math.floorMod on the sum to ensure safe modulo behavior.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

+int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
+if (numVirtualShards != -1) {
+    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
+    int vShardId = Math.floorMod(hash, numVirtualShards);
+    return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
+}
 
-
Suggestion importance[1-10]: 1

__

Why: The improved_code is identical to the existing_code, meaning no actual change is proposed. The suggestion discusses potential integer overflow but doesn't provide a concrete fix, making it unhelpful.

Low

Previous suggestions

Suggestions up to commit d13a28c
CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard against zero or negative virtual shard count

The validation check uses numVirtualShards < numPhysicalShards but should also
handle the case where numVirtualShards == -1 (disabled). When virtual shards are
disabled, numVirtualShards is -1, which is less than any positive numPhysicalShards,
so the error message "must be enabled and be a multiple" is correct but the
condition could be made more explicit. More critically, if numVirtualShards is 0,
the modulo check numVirtualShards % numPhysicalShards would be 0 (passing), but then
division by zero would occur later. Add an explicit check for numVirtualShards <= 0.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [39-43]

-if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
+if (numVirtualShards <= 0 || numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
     throw new IllegalArgumentException(
         "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
     );
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that numVirtualShards == 0 would pass the current check but cause a division by zero at line 63. Adding numVirtualShards <= 0 makes the guard more robust. However, the setting already enforces a minimum of -1 via Setting.intSetting(..., -1, -1, ...), so 0 is technically possible but unlikely in practice.

Low
General
Cache virtual shard count as a field

This method reads directly from settings each time it is called, while
numberOfShards and numberOfReplicas are stored as fields and read once during
construction. For consistency and performance (especially since this value is
accessed during every shard routing operation), numberOfVirtualShards should be
stored as a field and initialized in the constructor/builder, similar to
numberOfShards.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [1350-1352]

+// Add field: private final int numberOfVirtualShards;
+// Initialize in constructor: this.numberOfVirtualShards = INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING.get(settings);
+
 public int getNumberOfVirtualShards() {
-    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
+    return numberOfVirtualShards;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid — reading from settings on every routing call is less efficient than caching the value as a field, consistent with how numberOfShards and numberOfReplicas are handled. However, the improved_code uses comments rather than actual code, making it incomplete as a concrete implementation suggestion.

Low
Clarify hash combination for virtual shard routing

The hash computation uses Murmur3HashFunction.hash(effectiveRouting) +
partitionOffset, but effectiveRouting may already be hashed or transformed upstream.
More importantly, the existing calculateScaledShardId path also hashes
effectiveRouting — but the virtual shard path re-hashes it independently. This is
inconsistent with how partitionOffset is computed (using id, not effectiveRouting).
The partitionOffset is computed from id via Murmur3HashFunction.hash(id), so adding
it to hash(effectiveRouting) may produce unexpected distribution. Ensure the hash
combination logic matches the intended routing semantics and is consistent with the
non-virtual path.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

 int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
 if (numVirtualShards != -1) {
-    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
+    final int routingHash = Murmur3HashFunction.hash(effectiveRouting);
+    final int hash = routingHash + partitionOffset;
     int vShardId = Math.floorMod(hash, numVirtualShards);
     return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
 }
Suggestion importance[1-10]: 2

__

Why: The improved_code is functionally identical to the existing_code — it just extracts Murmur3HashFunction.hash(effectiveRouting) into a local variable routingHash before adding partitionOffset. This is purely a readability change with no behavioral difference, and the suggestion's concern about hash combination semantics is not actually addressed by the proposed code change.

Low
Suggestions up to commit d13a28c
CategorySuggestion                                                                                                                                    Impact
Possible issue
Explicitly handle disabled virtual shards case

The validation check uses numVirtualShards < numPhysicalShards but should also
handle the case where numVirtualShards == -1 (disabled). When virtual shards are
disabled (numVirtualShards == -1), the condition numVirtualShards <
numPhysicalShards will be true (since -1 < any positive number), so the error will
be thrown correctly. However, the check numVirtualShards % numPhysicalShards != 0
for a value of -1 could produce unexpected results depending on the JVM. It's
clearer and safer to explicitly check for the disabled case first.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [39-43]

-if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
+if (numVirtualShards == -1 || numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
     throw new IllegalArgumentException(
         "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
     );
 }
Suggestion importance[1-10]: 6

__

Why: While -1 < numPhysicalShards already handles the disabled case correctly (throwing the exception), explicitly checking numVirtualShards == -1 makes the intent clearer and avoids any ambiguity about the -1 % numPhysicalShards behavior (which in Java returns -1, not 0, so the existing check already handles it). This is a readability/safety improvement rather than a bug fix.

Low
General
Cache virtual shard count as a field

getNumberOfVirtualShards() reads directly from settings each time it is called,
unlike getNumberOfShards() which returns a cached field numberOfShards. This is
inconsistent and may cause performance issues if called frequently during routing.
The value should be cached as a field during construction, similar to how
numberOfShards and numberOfReplicas are handled.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [1350-1352]

 public int getNumberOfVirtualShards() {
-    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
+    return INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING.get(settings);
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies an inconsistency where getNumberOfVirtualShards() reads from settings on every call while other similar methods use cached fields. However, the improved_code still reads from settings (just using the typed setting instead of raw getAsInt), so it doesn't actually implement the caching improvement described in the suggestion content.

Low
Prevent integer overflow in hash computation

The hash computation Murmur3HashFunction.hash(effectiveRouting) + partitionOffset
can overflow since both values are int. This is the same pattern used in
calculateScaledShardId, but the result is passed to Math.floorMod which handles
negative values correctly. However, integer overflow could still produce unexpected
hash distributions. Consider using a long or masking to avoid overflow, consistent
with how the existing code handles it.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

 int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
 if (numVirtualShards != -1) {
-    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
+    final int hash = (int) ((Murmur3HashFunction.hash(effectiveRouting) & 0xFFFFFFFFL) + partitionOffset);
     int vShardId = Math.floorMod(hash, numVirtualShards);
     return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
 }
Suggestion importance[1-10]: 3

__

Why: The existing calculateScaledShardId method uses the same hash + partitionOffset pattern without overflow protection, so this change would create inconsistency. Additionally, the improved_code mixes long and int arithmetic in a way that doesn't cleanly prevent overflow, and Math.floorMod already handles negative values correctly.

Low
Suggestions up to commit d188b49
CategorySuggestion                                                                                                                                    Impact
General
Add explicit disabled-state check before validation

The validation check uses numVirtualShards < numPhysicalShards but doesn't handle
the case where numVirtualShards == -1 (disabled) explicitly. When virtual shards are
disabled (numVirtualShards == -1), the condition -1 < numPhysicalShards is always
true and will throw, which is correct, but the check numVirtualShards %
numPhysicalShards != 0 with -1 could produce confusing behavior. Add an explicit
check for the disabled case to make the validation intent clear and avoid relying on
implicit numeric comparisons.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [39-43]

+if (numVirtualShards == -1) {
+    throw new IllegalArgumentException(
+        "Virtual shards must be enabled to resolve routing."
+    );
+}
 if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
     throw new IllegalArgumentException(
-        "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
+        "Virtual shards must be a multiple of the number of physical shards to resolve routing."
     );
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion improves code clarity by making the disabled-state check explicit rather than relying on implicit numeric comparison. While the current code works correctly (since -1 < numPhysicalShards is always true), the explicit check makes the intent clearer and provides a more descriptive error message for each failure case.

Low
Cache virtual shard count as a field

getNumberOfVirtualShards() reads directly from settings on every call, while
numberOfShards and numberOfReplicas are stored as cached fields. This is
inconsistent and may cause performance issues if called frequently in hot paths like
shard routing. The value should be stored as a field during construction, similar to
how numberOfShards is handled.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [1314-1316]

 public int getNumberOfVirtualShards() {
-    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
+    return INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING.get(settings);
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies an inconsistency where getNumberOfVirtualShards() reads from settings on every call while other fields like numberOfShards are cached. However, the improved_code still reads from settings (just using INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING.get(settings) instead of settings.getAsInt()), so it doesn't actually implement the caching that was suggested.

Low
Avoid redundant hash recomputation in routing path

The virtual shard routing path computes Murmur3HashFunction.hash(effectiveRouting) +
partitionOffset but the non-virtual path uses calculateScaledShardId which
internally also hashes effectiveRouting. The virtual shard path re-hashes
effectiveRouting independently, which means the hash computation is duplicated and
inconsistent with the partitioned index logic. The partitionOffset is already
computed from id, so the hash of effectiveRouting should be computed once and
reused, or the existing hash computation from calculateScaledShardId should be
leveraged to avoid inconsistency.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

 int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
 if (numVirtualShards != -1) {
-    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
+    final int routingHash = Murmur3HashFunction.hash(effectiveRouting);
+    final int hash = routingHash + partitionOffset;
     int vShardId = Math.floorMod(hash, numVirtualShards);
     return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
 }
Suggestion importance[1-10]: 2

__

Why: The improved_code is functionally identical to the existing_code - it just extracts Murmur3HashFunction.hash(effectiveRouting) into a local variable routingHash, but the hash is still computed once in both versions. This is a minor readability change with no real impact on correctness or performance.

Low
Suggestions up to commit d188b49
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix missing disabled-state check in validation

The validation check uses numVirtualShards < numPhysicalShards but should also
handle the case where virtual shards are disabled (i.e., numVirtualShards == -1). A
negative value would pass the modulo check but represents a disabled state, so the
condition should explicitly check for numVirtualShards <= 0 or numVirtualShards ==
-1 as the disabled sentinel value.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [39-43]

-if (numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
+if (numVirtualShards == -1 || numVirtualShards < numPhysicalShards || numVirtualShards % numPhysicalShards != 0) {
     throw new IllegalArgumentException(
         "Virtual shards must be enabled and be a multiple of the number of physical shards to resolve routing."
     );
 }
Suggestion importance[1-10]: 7

__

Why: When numVirtualShards == -1 (disabled), the condition numVirtualShards < numPhysicalShards is true (since -1 < any positive number), so the exception is already thrown. The existing code correctly handles this case, making the suggestion technically redundant. However, adding an explicit check for -1 improves code clarity and intent.

Medium
General
Cache virtual shards value as a field

The method reads the virtual shards value directly from settings each time it is
called, unlike getNumberOfShards() which returns a cached field numberOfShards. This
is inconsistent and potentially inefficient. The value should be stored as a field
during construction (similar to numberOfShards) and returned from that field.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [1314-1316]

 public int getNumberOfVirtualShards() {
-    return settings.getAsInt(SETTING_NUMBER_OF_VIRTUAL_SHARDS, -1);
+    return numberOfVirtualShards;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid — reading from settings on every call is inconsistent with how numberOfShards is handled as a cached field. However, the improved_code references a numberOfVirtualShards field that doesn't exist in the PR diff, requiring additional changes not shown.

Low
Guard against integer overflow in hash computation

The hash computation uses Murmur3HashFunction.hash(effectiveRouting) +
partitionOffset with plain integer addition, which can overflow. The existing
calculateScaledShardId likely handles this with Math.floorMod, but the intermediate
addition here could produce unexpected negative values before floorMod is applied.
Use Math.floorMod on the combined hash to be consistent and safe.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

+int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
+if (numVirtualShards != -1) {
+    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
+    int vShardId = Math.floorMod(hash, numVirtualShards);
+    return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
+}
 
-
Suggestion importance[1-10]: 2

__

Why: The improved_code is identical to the existing_code, meaning no actual change is proposed. The suggestion identifies a potential overflow concern but doesn't provide a concrete fix, making it unhelpful.

Low
Suggestions up to commit 679535f
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix hash overflow in virtual shard routing

The virtual shard routing path computes the hash as
Murmur3HashFunction.hash(effectiveRouting) + partitionOffset, but the non-virtual
path delegates to calculateScaledShardId which uses effectiveRouting directly. This
inconsistency means the two paths may produce different results for the same input
when virtual shards are disabled vs. enabled with the same number of virtual shards
as physical shards. The partitionOffset should be applied consistently, but more
importantly, the hash computation should match the legacy path to ensure predictable
behavior during migration.

server/src/main/java/org/opensearch/cluster/routing/OperationRouting.java [525-530]

 int numVirtualShards = indexMetadata.getNumberOfVirtualShards();
 if (numVirtualShards != -1) {
-    final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
-    int vShardId = Math.floorMod(hash, numVirtualShards);
+    final int hash = Math.floorMod(Murmur3HashFunction.hash(effectiveRouting), numVirtualShards);
+    int vShardId = Math.floorMod(hash + partitionOffset, numVirtualShards);
     return VirtualShardRoutingHelper.resolvePhysicalShardId(indexMetadata, vShardId);
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion raises a valid concern about hash computation consistency, but the improved_code changes the semantics in a way that doesn't clearly match the described fix. The original code Murmur3HashFunction.hash(effectiveRouting) + partitionOffset followed by Math.floorMod is a standard pattern, and the "inconsistency" with the legacy path is intentional since virtual shards are a new routing mechanism. The suggestion's improved code also changes the order of operations in a potentially incorrect way.

Low
Prevent zero virtual shards causing division by zero

The minimum value for INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING is set to -1, which
allows any value >= -1 including 0. A value of 0 virtual shards would cause a
division-by-zero error in Math.floorMod(hash, numVirtualShards) during routing. The
minimum valid enabled value should be at least 1 (with -1 as the sentinel for
disabled), so the setting validator should reject 0.

server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java [271-277]

 public static final Setting<Integer> INDEX_NUMBER_OF_VIRTUAL_SHARDS_SETTING = Setting.intSetting(
     SETTING_NUMBER_OF_VIRTUAL_SHARDS,
     -1,
     -1,
     Property.IndexScope,
     Property.Final
 );
+// Note: value of 0 must be explicitly rejected in the build() validation below
Suggestion importance[1-10]: 3

__

Why: The concern about a value of 0 causing division-by-zero in Math.floorMod is valid, but the improved_code is identical to the existing_code (only adds a comment), making it a documentation-only suggestion rather than an actual fix. The real fix would require adding validation in the build() method.

Low
General
Clarify fallthrough behavior after invalid override

When the override value is out of bounds, the code logs a trace message and falls
through to the modulo fallback. However, there is no explicit return or break after
the log statement — the code implicitly falls through to Math.floorMod(vShardId,
...) at the end of the method. This is correct behavior, but the missing else or
explicit fallthrough comment makes it easy to accidentally insert code between the
log and the fallback. Consider adding an explicit comment or restructuring for
clarity.

server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java [48-51]

 if (pShardId >= 0 && pShardId < indexMetadata.getNumberOfShards()) {
     return pShardId;
 }
+// Override out of bounds, fall back to modulo
 logger.trace("Invalid override value [{}] for vShard [{}]: out of bounds", pShardId, vShardId);
Suggestion importance[1-10]: 2

__

Why: This is a minor readability suggestion that only adds a comment. The existing code is already clear in its control flow, and this change offers minimal practical improvement.

Low

@github-actions
Copy link
Contributor

❌ Gradle check result for 679535f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@atris atris closed this Mar 2, 2026
@atris atris reopened this Mar 2, 2026
@github-project-automation github-project-automation bot moved this from ✅ Done to 🏗 In progress in Shard Management Project Board Mar 2, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 679535f.

PathLineSeverityDescription
server/src/test/java/org/opensearch/common/settings/ConsistentSettingsServiceTests.java62lowAdding Assume.assumeNoException in @before causes the entire ConsistentSettingsServiceTests class to be silently skipped when PBKDF2WithHmacSHA512 is unavailable. While this pattern is common for algorithm-unavailability handling, applying it to a security-settings test class means all consistency checks for secure settings could be bypassed in FIPS or restricted-algorithm environments without any test failure signal.
server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java43lowresolvePhysicalShardId reads routing override mappings directly from index custom metadata (keyed by VIRTUAL_SHARDS_CUSTOM_METADATA_KEY). If an attacker can write to cluster state or index metadata (e.g., via a compromised node or API misuse), they could inject override entries to deterministically redirect specific virtual shards to chosen physical shards, influencing data locality in unexpected ways. The feature is legitimate, but the absence of any access control or integrity check on the override map is worth noting.

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 0 | Low: 2


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

Persistent review updated to latest commit 679535f

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

❌ Gradle check result for 679535f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@atris atris closed this Mar 3, 2026
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Shard Management Project Board Mar 3, 2026
@atris atris reopened this Mar 3, 2026
@github-project-automation github-project-automation bot moved this from ✅ Done to 🏗 In progress in Shard Management Project Board Mar 3, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 679535f

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

❌ Gradle check result for 679535f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Contributor

Persistent review updated to latest commit d188b49

@github-actions
Copy link
Contributor

❌ Gradle check result for d188b49: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@atris atris closed this Mar 11, 2026
@atris atris reopened this Mar 11, 2026
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Shard Management Project Board Mar 11, 2026
@github-actions
Copy link
Contributor

Persistent review updated to latest commit d13a28c

@vikasvb90
Copy link
Contributor

We also need to figure out a path for our upcoming CRs to see how virtual shards can work alongside shard split.

@github-actions
Copy link
Contributor

❌ Gradle check result for d13a28c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@atris atris closed this Mar 13, 2026
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in Shard Management Project Board Mar 13, 2026
@atris atris reopened this Mar 13, 2026
@github-project-automation github-project-automation bot moved this from ✅ Done to 🏗 In progress in Shard Management Project Board Mar 13, 2026
@github-actions
Copy link
Contributor

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit d13a28c.

PathLineSeverityDescription
server/src/main/java/org/opensearch/cluster/metadata/VirtualShardRoutingHelper.java47mediumThe routing override mechanism reads arbitrary shard remapping from index custom metadata (`virtual_shards_routing`). While it validates bounds, it allows any principal with cluster metadata write access to silently redirect specific virtual shard IDs to different physical shards, potentially targeting where specific documents are stored. This is not a typical routing design and introduces a privileged covert redirection channel. The feature purpose (shard merging) does not clearly justify per-virtual-shard overrides at runtime.
server/src/test/java/org/opensearch/common/settings/ConsistentSettingsServiceTests.java62lowThe added `Assume.assumeNoException` block silently skips all `ConsistentSettingsService` tests when PBKDF2WithHmacSHA512 is unavailable. This service handles secure/encrypted cluster settings. Skipping its test suite on affected CI environments means security regressions in settings consistency checks would go undetected. The change is plausible for JVM compatibility but could be used to suppress failing security tests.

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 1 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@atris
Copy link
Contributor Author

atris commented Mar 13, 2026

@atris We are also starting to roll out shard split and I have raised this PR to resolve child shard ids. To give you an idea of how the exact implementation of split would look like, I am linking my private repo branch https://github.com/vikasvb90/OpenSearch/tree/split_final.

This was design of shard split published earlier and implementation is going to be based on this design

Since both virtual shards and shard split would require operation routing algo changes, we will need to converge here. We can discuss more on convergence.

cc: @shwetathareja

Thanks for sharing this.

This PR has been open for ~2 weeks and is already approved, so let’s converge by stacking split changes on top of it instead of diverging OperationRouting in parallel.

Please rebase your split branch on this PR (or open it as a dependent PR) so we can review concrete deltas directly.

I’ll review your split implementation and propose a concrete convergence plan covering routing invariants, behavior, and tests.

Please also share your view of the convergence points you think are most critical so we can align quickly.

@atris atris added the skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. label Mar 13, 2026
@github-actions
Copy link
Contributor

Persistent review updated to latest commit d13a28c

@github-actions
Copy link
Contributor

❌ Gradle check result for d13a28c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Contributor

Persistent review updated to latest commit d13a28c

@github-actions
Copy link
Contributor

❌ Gradle check result for d13a28c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Contributor

✅ Gradle check result for d13a28c: SUCCESS

@atris atris merged commit a756e88 into opensearch-project:main Mar 14, 2026
109 of 132 checks passed
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Shard Management Project Board Mar 14, 2026
@vikasvb90
Copy link
Contributor

vikasvb90 commented Mar 15, 2026

@atris we can build the support of split on shard of the index consisting of virtual shards incrementally as this may require us to go through multiple components of OpenSearch architecture. For now, I think we can keep them exclusive to each other. An index having virtual shards will not allow a split and vice-versa. Once both virtual shard and shard split features are rolled out and we move them out of experimental phase, we can plan for converging them. Thoughts?

shayush622 pushed a commit to shayush622/OpenSearch that referenced this pull request Mar 16, 2026
…earch-project#20745)

* Add Virtual Shards routing and mapping overrides

Implements the initial routing foundation for Virtual Shards.

Adds settings validation for virtual shards and routing overrides via IndexMetadata custom data.

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

* Add Virtual Shards routing and mapping overrides

This PR adds the initial routing and metadata groundwork for virtual shards.

When enabled, routing now uses a virtual shard id (vShardId) before resolving to a physical shard. This separates hash-space size from physical shard count and prepares the path for future shard movement workflows.

What’s included

- New static index setting: index.number_of_virtual_shards (default -1, disabled).
- Routing update in OperationRouting.generateShardId:
    - virtual-shard path when enabled,
    - existing behavior unchanged when disabled.
- New VirtualShardRoutingHelper for vShardId -> physical shard resolution.
- Optional per-index override support via virtual_shards_routing custom metadata.
- Test coverage for:
    - enabled/disabled routing behavior,
    - validation rules,
    - override and fallback behavior.

Out of scope

- Side-car segment extraction flow.
- Transport/state-orchestration for managing override lifecycle.

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

* Add Virtual Shards routing and mapping overrides

This PR adds the initial routing and metadata groundwork for virtual shards.

When enabled, routing now uses a virtual shard id (vShardId) before resolving to a physical shard. This separates hash-space size from physical shard count and prepares the path for future shard movement workflows.

What’s included

- New static index setting: index.number_of_virtual_shards (default -1, disabled).
- Routing update in OperationRouting.generateShardId:
    - virtual-shard path when enabled,
    - existing behavior unchanged when disabled.
- New VirtualShardRoutingHelper for vShardId -> physical shard resolution.
- Optional per-index override support via virtual_shards_routing custom metadata.
- Test coverage for:
    - enabled/disabled routing behavior,
    - validation rules,
    - override and fallback behavior.

Out of scope

- Side-car segment extraction flow.
- Transport/state-orchestration for managing override lifecycle.

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

* Refactor Virtual Shards to use range-based routing

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

* Add fail-safe to VirtualShardRoutingHelper and valid configuration tracking test

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

* Fixed review comments and move to range based sharding

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

---------

Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
}
logger.trace("Invalid override value [{}] for vShard [{}]: out of bounds", pShardId, vShardId);
} catch (NumberFormatException e) {
logger.trace("Invalid override value [{}] for vShard [{}]: not a number", pShardIdStr, vShardId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we swallowing the exception?

/**
* Resolves the physical shard for a virtual shard id.
*/
public static int resolvePhysicalShardId(IndexMetadata indexMetadata, int vShardId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the access pattern, would it be called during the context of every request or cached?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Enhancement or improvement to existing feature or request ShardManagement:Placement skip-changelog skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis.

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

# RFC: Virtual Shards for Elastic Index Scaling in OpenSearch

5 participants