
# RFC: Virtual Shards for Elastic Index Scaling in OpenSearch #18809

@atris

Description

Is your feature request related to a problem? Please describe


Summary

This RFC proposes adding Virtual Shards to OpenSearch — a lightweight logical partitioning layer between documents and physical shards. Instead of routing directly to a fixed physical shard, documents are first assigned to one of many virtual shards, which are then mapped to physical shards at runtime.

This enables live rebalancing, elastic scaling, and temporal locality handling, without reindexing or changes to clients.


Motivation and Value

Virtual shards improve OpenSearch’s flexibility and long-term scalability by decoupling routing from physical shard allocation. This allows more granular routing and makes it easier to isolate hotspots, scale incrementally, and optimize storage tiering without operational disruption.

They offer a clean way to deliver the following capabilities:

| Capability | How Virtual Shards Help |
|---|---|
| Elastic Growth | Add physical shards and remap vShards without reindexing |
| Hotspot Isolation | Move only hot vShards (e.g. a recent time range or tenant) |
| Temporal Locality | Co-locate recent vShards on fast nodes and archive older ones |
| Operational Simplicity | Start with many vShards; scale physical shards only when needed |
| Autoscaling Ready | Enable future rebalancing based on traffic/load |
| Index Flexibility | Avoid over- and under-sharding; future-proof long-lived indices |

This design complements existing shard functionality and prepares OpenSearch for large-scale, dynamic, multi-tenant workloads.


Describe the solution you'd like

Design Overview

Core Idea

```
document → virtual_shard → physical_shard
```

  • Each index declares `number_of_virtual_shards` (e.g. 1024)
  • Routing uses `hash(routing_key) % num_vshards` to pick a vShard
  • A vShard-to-physical-shard map is stored in cluster state
  • Documents are routed internally to the mapped physical shard
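The two-level routing described above can be sketched in a few lines of Python. This is an illustrative model only: the hash function (MD5 here), the map layout, and all names are assumptions, not OpenSearch's actual Murmur3-based routing code.

```python
import hashlib

NUM_VSHARDS = 1024

# vShard -> physical shard map; in the RFC this lives in cluster state.
# Here: 1024 vShards spread round-robin over 4 physical shards.
vshard_map = {v: v % 4 for v in range(NUM_VSHARDS)}

def vshard_for(routing_key: str) -> int:
    """hash(routing_key) % num_vshards picks the virtual shard."""
    digest = hashlib.md5(routing_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VSHARDS

def physical_shard_for(routing_key: str) -> int:
    """document -> virtual_shard -> physical_shard"""
    return vshard_map[vshard_for(routing_key)]

# Remapping a vShard changes only the map, never the hash of any document,
# which is why no reindexing is required.
hot = vshard_for("tenant-42")
vshard_map[hot] = 3  # relocate; routing stays stable for every other key
```

The key property shown here is that a document's vShard assignment is permanent (pure function of its routing key), while the vShard-to-shard map is the only mutable state.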

Index Configuration Example

```json
{
  "settings": {
    "number_of_shards": 4,
    "number_of_virtual_shards": 1024
  }
}
```
  • The index stores data in 4 physical shards
  • Routing logic uses 1024 logical vShards
  • Internal mapping allows reassignment without index rebuild

What This Enables

| Use Case | Benefit |
|---|---|
| Live Scale-Out | Add a physical shard and remap only the impacted vShards; no reindexing |
| Tenant Isolation | Route specific tenants to their own vShards and scale them independently |
| Temporal Querying | Route by timestamp, isolate hot vShards, optimize performance |
| Cold Tiering | Move older vShards to cold storage without splitting the index |
| Long-Running Index | Evolve the layout over time without changing the index name or structure |

Moving a vShard

To relocate a vShard:

  1. Identify and copy documents that hash to that vShard
  2. Replay translog updates for the vShard
  3. Update the vShard-to-shard mapping in cluster state (atomic)
  4. Remove old segments via merge or compaction

The process uses existing replication and fencing mechanisms to ensure safety.
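The four relocation steps above can be sketched as an in-memory simulation. Dicts stand in for shards and a list of tuples stands in for the translog; the real implementation would operate on segments and use the replication machinery, so every name here is illustrative.

```python
def move_vshard(vshard_id, src, dst, shards, translog, vshard_map):
    """Relocate one vShard: copy, replay, atomically remap, clean up.

    shards: {shard_id: {doc_id: doc}}, each doc tagged with its vShard
    translog: list of (vshard_id, doc_id, doc) updates that arrived mid-copy
    """
    # 1. Identify and copy documents that hash to this vShard
    moved = {d: doc for d, doc in shards[src].items() if doc["vshard"] == vshard_id}
    shards[dst].update(moved)

    # 2. Replay translog updates for the vShard
    for v, doc_id, doc in translog:
        if v == vshard_id:
            shards[dst][doc_id] = doc

    # 3. Update the vShard -> shard mapping (a single, atomic map write)
    vshard_map[vshard_id] = dst

    # 4. Remove old copies from the source (merge/compaction in practice)
    for d in moved:
        shards[src].pop(d, None)

# Hypothetical demo: vShard 7 lives on shard 0 and moves to shard 1,
# while one update ("b") lands in the translog during the copy.
shards = {0: {"a": {"vshard": 7, "body": "..."}}, 1: {}}
translog = [(7, "b", {"vshard": 7, "body": "..."})]
vmap = {7: 0}
move_vshard(7, src=0, dst=1, shards=shards, translog=translog, vshard_map=vmap)
```

The ordering matters: the mapping flip in step 3 happens only after the copy and replay, so readers always see a complete vShard on whichever shard the map points to.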


Admin APIs

| Endpoint | Description |
|---|---|
| `GET /_vshard/state` | View the current vShard → shard mapping |
| `POST /_vshard/move` | Manually trigger a vShard relocation |
| `GET /_vshard/stats` | View metrics such as document count, size, and QPS |

These APIs are internal/admin only — not exposed to client applications.


Compatibility

| Feature | Status |
|---|---|
| ILM | Compatible; vShards can follow tier policies |
| Replication | Works with existing shard replication |
| Routing Key | Fully supported (affects vShard assignment) |
| Search APIs | No client-side changes |
| Indexing | Adds a routing layer before shard resolution |

Use Case: Temporal Data (e.g. Box-like Ingestion)

A system indexing user folders and files over time benefits from this model:

  • Routing key includes folder ID and timestamp (e.g. folder:2024-07)
  • Recent folders map to hot vShards
  • Hot vShards are routed to fast SSD-backed physical shards
  • Older vShards are moved to cold nodes
  • No rollovers or reindexing required
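The temporal tiering in this use case can be sketched as a small remapping pass. The `folder:YYYY-MM` key format comes from the example above; the hash function, the hot/cold shard split, and the `retier` helper are all illustrative assumptions.

```python
import hashlib

NUM_VSHARDS = 1024

def vshard_for(routing_key: str) -> int:
    digest = hashlib.md5(routing_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VSHARDS

def retier(vshard_map, monthly_keys, current_month, hot_shards, cold_shards):
    """Map vShards for the current month to hot shards, older ones to cold."""
    for month, keys in monthly_keys.items():
        pool = hot_shards if month == current_month else cold_shards
        for key in keys:
            v = vshard_for(key)
            vshard_map[v] = pool[v % len(pool)]
    return vshard_map

vmap = retier(
    vshard_map={},
    monthly_keys={"2024-07": ["folder:2024-07"], "2024-01": ["folder:2024-01"]},
    current_month="2024-07",
    hot_shards=[0, 1],   # assumed SSD-backed physical shards
    cold_shards=[2, 3],  # assumed cold-tier physical shards
)
```

Because only the map changes, "aging out" a month is a metadata update plus selective relocation, not a rollover or a reindex.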

Implementation Plan

Phase 1 – Core Support

  • Add support for number_of_virtual_shards as an index setting
  • Compute vShard ID per document using routing hash
  • Maintain vShard → physical shard mapping in cluster state
  • Update indexing and search paths to route through vShards

Phase 2 – Manual Movement

  • Add admin API to move vShards
  • Support atomic map update + translog replay per vShard
  • Enable parallel movement for disjoint vShards

Phase 3 – Optional Rebalancer

  • Track per-vShard stats: size, document count, query rate
  • Add load-aware background balancer (pluggable)
  • Future: autoscaler hooks for vShard-based elasticity
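A load-aware balancer of the kind Phase 3 describes might, at its simplest, greedily move the hottest vShard off the most loaded shard. This is a sketch only: the per-vShard QPS stats and the imbalance threshold are assumed inputs, and a real balancer would also weigh size and document count.

```python
def rebalance_step(vshard_map, vshard_qps, threshold=1.5):
    """If the busiest shard carries more than `threshold` x the load of the
    idlest one, move its single hottest vShard there.
    Returns (vshard, src, dst) on a move, or None if balanced enough."""
    load = {shard: 0.0 for shard in set(vshard_map.values())}
    for v, shard in vshard_map.items():
        load[shard] += vshard_qps.get(v, 0.0)
    hottest = max(load, key=load.get)
    coldest = min(load, key=load.get)
    if load[hottest] <= threshold * load[coldest]:
        return None  # within tolerance, nothing to do
    candidates = [v for v, s in vshard_map.items() if s == hottest]
    victim = max(candidates, key=lambda v: vshard_qps.get(v, 0.0))
    vshard_map[victim] = coldest
    return (victim, hottest, coldest)

# Hypothetical stats: shard "A" is overloaded almost entirely by vShard 0.
vmap = {0: "A", 1: "A", 2: "B"}
move = rebalance_step(vmap, {0: 100.0, 1: 10.0, 2: 5.0})
```

Each returned move would then be executed through the Phase 2 relocation path; running the balancer as a background loop makes it pluggable, as the RFC suggests.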

Fault Tolerance

  • vShard mapping is stored and versioned in cluster metadata
  • Routing is atomic and consistent across replicas
  • Movement reuses fencing, recovery, and replication infrastructure
  • System prevents routing inconsistency during transitions
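The "versioned in cluster metadata" property above can be sketched as a compare-and-swap map: a mover may only commit against the version it read, so two concurrent relocations cannot silently overwrite each other. The class and versioning scheme are illustrative, not OpenSearch's cluster-state API.

```python
class VShardMapping:
    """Versioned vShard -> shard map; updates apply only against the
    version the caller read, mimicking cluster-state CAS semantics."""

    def __init__(self, mapping):
        self.version = 1
        self.mapping = dict(mapping)

    def read(self):
        """Snapshot the map together with its version."""
        return self.version, dict(self.mapping)

    def cas_update(self, expected_version, vshard_id, new_shard):
        """Atomically remap one vShard; stale writers are rejected."""
        if expected_version != self.version:
            return False  # another mover won; caller must re-read and retry
        self.mapping[vshard_id] = new_shard
        self.version += 1
        return True

m = VShardMapping({7: 0})
ver, _ = m.read()
ok = m.cas_update(ver, 7, 1)     # succeeds and bumps the version
stale = m.cas_update(ver, 7, 2)  # same version reused -> rejected
```

Replicas that observe the version number alongside the map can detect and fence off routing decisions made against a stale mapping during a transition.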

Defaults and Compatibility

| Item | Behavior |
|---|---|
| Default value | `number_of_virtual_shards = null` |
| Existing indices | No change |
| Backward compatibility | Full |
| Client-facing API changes | None |

Observability

  • Optional: Add vShard ID to slow logs and request tracing
  • Expose `/_vshard/stats` for introspection and tuning
  • Enable future integrations with load balancers and allocators

Clarifying Relationship to Dynamic Shard Splitting and Shrinking

This proposal for Virtual Shards is not competing with dynamic shard splitting; it operates at a different layer and addresses a different set of challenges. The two proposals are complementary and can co-exist to enhance OpenSearch’s flexibility.

| Feature | Dynamic Shard Split/Shrink | Virtual Shards (This RFC) |
|---|---|---|
| Unit of Change | Physical shard | Logical virtual shard (vShard) |
| Purpose | Change physical shard count post-creation | Introduce a logical indirection layer for routing flexibility |
| Operation Cost | Involves data copying, merges, or reindexing | Lightweight remapping plus selective relocation |
| Target Use Cases | Undersharded or oversharded static indices | Hotspot isolation, temporal skew, multi-tenant scaling |
| Reindexing Required? | Yes (implicitly or explicitly) | No |
| Long-Running Index Support | Limited (splits are disruptive) | Yes (vShards remap live) |
| Routing Granularity | Physical shard | Logical vShard (e.g. 1024 vShards → 4 shards) |

Summary

  • Shard split/shrink is ideal for permanent, structural changes to physical shard layout post-index-creation.
  • Virtual Shards provide fine-grained routing flexibility and live rebalancing for dynamic or long-lived workloads — without requiring index restarts or client changes.

Integration Potential

Virtual Shards can complement dynamic shard splitting:

  • You could split an overloaded shard and then rebalance only its affected virtual shards across new shards.
  • Or use virtual shards to delay or avoid physical splits by remapping hotspots to different nodes.

We’re happy to align these mechanisms so they serve orthogonal purposes:

  • Shard splitting: structural, coarse-grained change
  • Virtual shards: logical, fine-grained elastic routing

Both together enable OpenSearch to scale cleanly across a broader set of real-world indexing patterns.

Final Notes

Virtual shards bring elastic scaling, dynamic isolation, and long-term adaptability to OpenSearch. They are designed to solve real-world challenges like tenant hotspots, temporal skew, and uneven document growth — while keeping the system stable, simple, and compatible.

They integrate seamlessly with existing indexing and search infrastructure, and open up new capabilities without introducing client complexity.

This RFC proposes making virtual sharding a core abstraction within OpenSearch’s routing and indexing path.

Related component

ShardManagement:Placement

Describe alternatives you've considered

No response

Additional context

No response
