### Is your feature request related to a problem? Please describe
## Summary
This RFC proposes adding Virtual Shards to OpenSearch — a lightweight logical partitioning layer between documents and physical shards. Instead of routing directly to a fixed physical shard, documents are first assigned to one of many virtual shards, which are then mapped to physical shards at runtime.
This enables live rebalancing, elastic scaling, and temporal locality handling, without reindexing or changes to clients.
## Motivation and Value
Virtual shards improve OpenSearch’s flexibility and long-term scalability by decoupling routing from physical shard allocation. This allows more granular routing and makes it easier to isolate hotspots, scale incrementally, and optimize storage tiering without operational disruption.
They offer a clean way to:
| Capability | How Virtual Shards Help |
|---|---|
| Elastic Growth | Add physical shards and remap vShards without reindexing |
| Hotspot Isolation | Move only hot vShards (e.g. recent time range or tenant) |
| Temporal Locality | Co-locate recent vShards on fast nodes and archive older ones |
| Operational Simplicity | Start with many vShards, scale physical shards only when needed |
| Autoscaling Ready | Enable future rebalancing based on traffic/load |
| Index Flexibility | Avoid over/undersharding, future-proof long-lived indices |
This design complements existing shard functionality and prepares OpenSearch for large-scale, dynamic, multi-tenant workloads.
### Describe the solution you'd like

## Design Overview
### Core Idea

```
document → virtual_shard → physical_shard
```

- Each index declares `number_of_virtual_shards` (e.g. 1024)
- Routing uses `hash(routing_key) % num_vshards` to pick a vShard
- A vShard-to-physical-shard map is stored in cluster state
- Documents are routed to the mapped physical shard internally
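The routing path above can be sketched in a few lines. This is an illustrative Python sketch, not OpenSearch code — the function names, the round-robin initial mapping, and the use of CRC32 as a stand-in for OpenSearch's routing hash are all assumptions:

```python
# Illustrative sketch of vShard routing (not actual OpenSearch code).
# zlib.crc32 stands in for OpenSearch's routing hash; any stable hash works.
import zlib

NUM_VSHARDS = 1024  # index setting: number_of_virtual_shards

def vshard_for(routing_key: str) -> int:
    """Map a document's routing key to one of NUM_VSHARDS virtual shards."""
    return zlib.crc32(routing_key.encode("utf-8")) % NUM_VSHARDS

def physical_shard_for(routing_key: str, vshard_map: dict) -> int:
    """Resolve the vShard to its current physical shard via the cluster-state map."""
    return vshard_map[vshard_for(routing_key)]

# A possible initial layout: 1024 vShards spread round-robin over 4 shards.
vshard_map = {v: v % 4 for v in range(NUM_VSHARDS)}
shard = physical_shard_for("tenant-42", vshard_map)
assert 0 <= shard < 4
```

Because only the map changes when a vShard moves, clients computing `hash(routing_key)` observe no difference before and after a relocation.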
### Index Configuration Example

```json
{
  "settings": {
    "number_of_shards": 4,
    "number_of_virtual_shards": 1024
  }
}
```

- The index stores data in 4 physical shards
- Routing logic uses 1024 logical vShards
- The internal mapping allows reassignment without rebuilding the index
### What This Enables
| Use Case | Benefit |
|---|---|
| Live Scale-Out | Add a physical shard, remap only impacted vShards, no reindexing |
| Tenant Isolation | Route specific tenants to their own vShards and scale independently |
| Temporal Querying | Route by timestamp, isolate hot vShards, optimize performance |
| Cold Tiering | Move older vShards to cold storage without splitting the index |
| Long-Running Index | Evolve layout over time without changing index name or structure |
### Moving a vShard
To relocate a vShard:
- Identify and copy documents that hash to that vShard
- Replay translog updates for the vShard
- Update the vShard-to-shard mapping in cluster state (atomic)
- Remove old segments via merge or compaction
The process uses existing replication and fencing mechanisms to ensure safety.
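The four steps above can be mirrored with an in-memory sketch. Everything here — plain dicts standing in for shards, a list standing in for the translog, and the `move_vshard` helper — is hypothetical and only illustrates the sequencing, not OpenSearch's actual replication machinery:

```python
# Hypothetical, in-memory sketch of the vShard relocation steps.
# Real OpenSearch would reuse replication, translog replay, and fencing.

def move_vshard(vshard, target, shards, vshard_map, translog):
    source = vshard_map[vshard]
    # 1. Copy documents belonging to this vShard to the target shard.
    docs = {doc_id: d for doc_id, d in shards[source].items()
            if d["vshard"] == vshard}
    shards[target].update(docs)
    # 2. Replay operations buffered while the copy was in flight.
    for doc_id, d in translog:
        if d["vshard"] == vshard:
            shards[target][doc_id] = d
    # 3. Atomic map update: new writes now route to the target shard.
    vshard_map[vshard] = target
    # 4. Remove the moved documents from the source (merge/compaction).
    for doc_id in docs:
        del shards[source][doc_id]

shards = {0: {"a": {"vshard": 7}}, 1: {}}
vshard_map = {7: 0}
move_vshard(7, 1, shards, vshard_map, translog=[("b", {"vshard": 7})])
assert vshard_map[7] == 1 and "a" in shards[1] and "b" in shards[1]
```

The key ordering property is that the map flip in step 3 happens only after the copy and replay, so no reader ever resolves the vShard to a shard that lacks its documents.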
### Admin APIs

| Endpoint | Description |
|---|---|
| `GET /_vshard/state` | View the current vShard → shard mapping |
| `POST /_vshard/move` | Manually trigger a vShard relocation |
| `GET /_vshard/stats` | View metrics like document count, size, and QPS |
These APIs are internal/admin only — not exposed to client applications.
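For illustration only, a `GET /_vshard/state` response might look like the following. This RFC does not fix a response schema; every field shown here is a hypothetical example:

```json
{
  "index": "logs",
  "number_of_virtual_shards": 1024,
  "mapping_version": 17,
  "vshards": {
    "0": { "shard": 0 },
    "1": { "shard": 1 }
  }
}
```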
### Compatibility
| Feature | Status |
|---|---|
| ILM | Compatible; vShards can follow tier policies |
| Replication | Works with existing shard replication |
| Routing Key | Fully supported (impacts vShard assignment) |
| Search APIs | No client-side changes |
| Indexing | Adds routing layer before shard resolution |
### Use Case: Temporal Data (e.g. Box-like Ingestion)
A system indexing user folders and files over time benefits from this model:
- Routing key includes folder ID and timestamp (e.g. `folder:2024-07`)
- Recent folders map to hot vShards
- Hot vShards are routed to fast SSD-backed physical shards
- Older vShards are moved to cold nodes
- No rollovers or reindexing required
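The temporal-locality property falls out of the routing key construction. The sketch below is an illustrative assumption following the `folder:2024-07` example above — the key format and hash are not OpenSearch internals:

```python
# Illustrative sketch: time-bucketed routing keys for temporal locality.
# The "folder:YYYY-MM" key format follows the example above; CRC32 is a
# stand-in for the routing hash.
import zlib

NUM_VSHARDS = 1024

def routing_key(folder_id: str, month: str) -> str:
    return f"{folder_id}:{month}"  # e.g. "folder123:2024-07"

def vshard_for(key: str) -> int:
    return zlib.crc32(key.encode("utf-8")) % NUM_VSHARDS

# All documents for the same folder and month land on the same vShard,
# so that vShard can sit on a hot SSD-backed shard while recent and be
# remapped to a cold node later — with no rollover or reindex.
assert vshard_for(routing_key("folder123", "2024-07")) == \
       vshard_for(routing_key("folder123", "2024-07"))
```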
## Implementation Plan

### Phase 1 – Core Support
- Add support for `number_of_virtual_shards` as an index setting
- Compute a vShard ID per document using the routing hash
- Maintain vShard → physical shard mapping in cluster state
- Update indexing and search paths to route through vShards
### Phase 2 – Manual Movement
- Add admin API to move vShards
- Support atomic map update + translog replay per vShard
- Enable parallel movement for disjoint vShards
### Phase 3 – Optional Rebalancer
- Track per-vShard stats: size, document count, query rate
- Add load-aware background balancer (pluggable)
- Future: autoscaler hooks for vShard-based elasticity
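A minimal load-aware balancer of the kind Phase 3 describes could look like the sketch below. The `plan_moves` helper and the single numeric load score (standing in for a blend of size, document count, and query rate) are illustrative assumptions:

```python
# Hypothetical sketch of a load-aware vShard balancer (Phase 3).
# vshard_load maps vShard ID -> a load score; vshard_map maps
# vShard ID -> physical shard.

def plan_moves(vshard_load, vshard_map, num_shards, max_moves=1):
    """Propose vShard moves from the most- to the least-loaded shard."""
    shard_load = [0.0] * num_shards
    for v, load in vshard_load.items():
        shard_load[vshard_map[v]] += load
    hot = max(range(num_shards), key=lambda s: shard_load[s])
    cold = min(range(num_shards), key=lambda s: shard_load[s])
    # Move the hottest vShards off the hottest shard first.
    candidates = sorted(
        (v for v in vshard_load if vshard_map[v] == hot),
        key=lambda v: vshard_load[v], reverse=True)
    return [(v, hot, cold) for v in candidates[:max_moves]]

# Shard 0 carries vShards 0 and 1 (load 10.0); shard 1 carries vShard 2.
moves = plan_moves({0: 9.0, 1: 1.0, 2: 1.0}, {0: 0, 1: 0, 2: 1}, num_shards=2)
assert moves == [(0, 0, 1)]  # move the hottest vShard to the coldest shard
```

Because each proposed move is just a vShard remap, the balancer can run continuously in the background without the cost of a physical shard split.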
## Fault Tolerance
- vShard mapping is stored and versioned in cluster metadata
- Routing is atomic and consistent across replicas
- Movement reuses fencing, recovery, and replication infrastructure
- System prevents routing inconsistency during transitions
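The versioned, atomic map update can be approximated with a compare-and-set sketch. The `VShardMap` class and its method names are hypothetical, standing in for how versioned cluster metadata would fence off a stale mover:

```python
# Illustrative sketch of a versioned vShard map with compare-and-set
# updates, approximating how versioned cluster-state metadata prevents
# routing inconsistency during a move.

class VShardMap:
    def __init__(self, mapping):
        self.version = 0
        self.mapping = dict(mapping)

    def cas_update(self, expected_version, vshard, target_shard):
        """Apply the remap only if no concurrent update won first."""
        if expected_version != self.version:
            return False  # stale mover must re-read the map and retry
        self.mapping[vshard] = target_shard
        self.version += 1
        return True

m = VShardMap({7: 0})
v = m.version
assert m.cas_update(v, 7, 1) is True   # first mover wins
assert m.cas_update(v, 7, 2) is False  # stale version is fenced off
assert m.mapping[7] == 1
```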
## Defaults and Compatibility
| Item | Behavior |
|---|---|
| Default value | `number_of_virtual_shards = null` (vShard routing not enabled) |
| Existing indices | No change |
| Backward compatibility | Full |
| Client-facing API changes | None |
## Observability

- Optional: add the vShard ID to slow logs and request tracing
- Expose `/_vshard/stats` for introspection and tuning
- Enable future integrations with load balancers and allocators
## Clarifying Relationship to Dynamic Shard Splitting and Shrinking
This proposal for Virtual Shards is not competing with dynamic shard splitting; it operates at a different layer and addresses a different set of challenges. The two proposals are complementary and can co-exist to enhance OpenSearch’s flexibility.
| Feature | Dynamic Shard Split/Shrink | Virtual Shards (This RFC) |
|---|---|---|
| Unit of Change | Physical Shard | Logical Virtual Shard (vShard) |
| Purpose | Change physical shard count post-creation | Introduce logical indirection layer for routing flexibility |
| Operation Cost | Involves data copying / merges / reindexing | Lightweight remapping + selective relocation |
| Target Use Cases | Undersharded or oversharded static indices | Hotspot isolation, temporal skew, multi-tenant scaling |
| Reindexing Required? | Yes (implicitly or explicitly) | No |
| Long-Running Index Support | Limited (splits are disruptive) | Yes (vShards remap live) |
| Routing Granularity | Physical shard | Logical vShard (e.g., 1024 vShards → 4 shards) |
### Summary
- Shard split/shrink is ideal for permanent, structural changes to physical shard layout post-index-creation.
- Virtual Shards provide fine-grained routing flexibility and live rebalancing for dynamic or long-lived workloads — without requiring index restarts or client changes.
### Integration Potential
Virtual Shards can complement dynamic shard splitting:
- You could split an overloaded shard and then rebalance only its affected virtual shards across new shards.
- Or use virtual shards to delay or avoid physical splits by remapping hotspots to different nodes.
We’re happy to align these mechanisms so they serve orthogonal purposes:
- Shard splitting: structural, coarse-grained change
- Virtual shards: logical, fine-grained elastic routing
Both together enable OpenSearch to scale cleanly across a broader set of real-world indexing patterns.
## Final Notes
Virtual shards bring elastic scaling, dynamic isolation, and long-term adaptability to OpenSearch. They are designed to solve real-world challenges like tenant hotspots, temporal skew, and uneven document growth — while keeping the system stable, simple, and compatible.
They integrate seamlessly with existing indexing and search infrastructure, and open up new capabilities without introducing client complexity.
This RFC proposes making virtual sharding a core abstraction within OpenSearch’s routing and indexing path.
### Related component
ShardManagement:Placement
### Describe alternatives you've considered
No response
### Additional context
No response