Describe the bug
When using Searchable Snapshots on warm nodes in OpenSearch 3.x, the cluster enforces
a hard limit of 1000 REMOTE_CAPABLE shards per warm node, regardless of the value
configured in cluster.max_shards_per_node.
This results in the following error when attempting to mount additional snapshots:
Validation Failed: 1: this action would add [5] total REMOTE_CAPABLE shards,
but this cluster currently has [8000]/[8000] maximum REMOTE_CAPABLE shards open
Environment
- OpenSearch version: 3.x
- Cluster setup: 8 dedicated warm nodes with node.roles=warm
- cluster.max_shards_per_node: 7000 (applies to hot/data nodes only)
- REMOTE_CAPABLE limit: 8000 (8 nodes × 1000 default, not configurable)
Root Cause
cluster.max_shards_per_node only applies to non-frozen/non-warm data nodes.
The REMOTE_CAPABLE shard pool used by warm nodes for Searchable Snapshots has
a separate hardcoded default of 1000 per node with no public setting to override it.
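For illustration, the enforced check behaves roughly like the sketch below. The function and constant names are hypothetical and mirror the observed error message, not the actual OpenSearch source:

```python
# Rough sketch of the reported warm-node limit check (hypothetical names,
# modeled on the observed validation error, not the real implementation).
REMOTE_CAPABLE_SHARDS_PER_NODE = 1000  # hardcoded default, no public setting

def validate_remote_capable_mount(warm_node_count: int,
                                  open_remote_shards: int,
                                  shards_to_add: int) -> None:
    """Raise if mounting would exceed warm_node_count * 1000 REMOTE_CAPABLE shards."""
    limit = warm_node_count * REMOTE_CAPABLE_SHARDS_PER_NODE
    if open_remote_shards + shards_to_add > limit:
        raise ValueError(
            f"Validation Failed: 1: this action would add [{shards_to_add}] "
            f"total REMOTE_CAPABLE shards, but this cluster currently has "
            f"[{open_remote_shards}]/[{limit}] maximum REMOTE_CAPABLE shards open"
        )

# With 8 warm nodes already at the 8000-shard cap, adding 5 more shards fails.
```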
The Elasticsearch equivalent (cluster.max_shards_per_node.frozen) does not exist
in OpenSearch and returns a settings_exception if attempted:
{
"error": {
"type": "settings_exception",
"reason": "persistent setting [cluster.max_shards_per_node.frozen], not recognized"
}
}

Expected Behavior
A dedicated, configurable cluster setting should exist to control the maximum number
of REMOTE_CAPABLE shards per warm node, for example:
PUT _cluster/settings
{
"persistent": {
"cluster.max_shards_per_node.warm": 3000
}
}

Workarounds (insufficient for production use)
- Add more warm nodes (each adds 1000 to the limit)
- Close unused searchable snapshot indices
- Reduce shard count per index when mounting snapshots
Impact
Clusters with a large number of searchable snapshot indices on warm nodes hit this
limit without any way to increase it via configuration, forcing costly infrastructure
scaling as the only option.
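The scaling cost can be sketched as follows. With the per-node limit hardcoded at 1000, node count is the only lever; the figures below are illustrative:

```python
import math

# With the per-node REMOTE_CAPABLE limit hardcoded at 1000, the only way to
# raise the cluster-wide cap is to add warm nodes. Illustrative numbers only.
HARDCODED_LIMIT_PER_NODE = 1000

def warm_nodes_required(total_remote_capable_shards: int) -> int:
    """Minimum warm nodes needed to hold the given number of REMOTE_CAPABLE shards."""
    return math.ceil(total_remote_capable_shards / HARDCODED_LIMIT_PER_NODE)

# An 8-node tier at its 8000-shard cap needs a whole extra node
# just to mount 5 more shards:
print(warm_nodes_required(8005))  # 9
```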
Related component
Storage:Snapshots
To Reproduce
- Set up an OpenSearch 3.x cluster with dedicated warm nodes (node.roles=warm)
- Configure cluster.max_shards_per_node: 7000 in cluster settings
- Mount searchable snapshots on warm nodes until reaching 1000 shards × number of warm nodes:
POST /_snapshot/my-repository/my-snapshot/_restore
{
"storage_type": "remote_snapshot",
"indices": "my-index"
}
- Attempt to mount one additional searchable snapshot
- Observe the following error:
Validation Failed: 1: this action would add [5] total REMOTE_CAPABLE shards,
but this cluster currently has [8000]/[8000] maximum REMOTE_CAPABLE shards open
- Attempt to raise the limit using cluster.max_shards_per_node.frozen (the Elasticsearch equivalent):
PUT _cluster/settings
{
"persistent": {
"cluster.max_shards_per_node.frozen": 3000
}
}

- Observe that the setting is not recognized:
{
"error": {
"type": "settings_exception",
"reason": "persistent setting [cluster.max_shards_per_node.frozen], not recognized"
},
"status": 400
}

Expected Behavior
A dedicated cluster setting should exist to control the maximum number of REMOTE_CAPABLE
shards per warm node, similar to how cluster.max_shards_per_node works for hot/data nodes
and how cluster.max_shards_per_node.frozen works in Elasticsearch for frozen nodes.
For example:
PUT _cluster/settings
{
"persistent": {
"cluster.max_shards_per_node.warm": 3000
}
}

This would result in:
- 8 warm nodes × 3000 = 24,000 maximum REMOTE_CAPABLE shards
- Operators can tune the limit based on their hardware capacity (heap, disk cache)
- No forced infrastructure scaling just to increase a hardcoded limit
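The capacity math above can be checked directly (the per-node values are the proposed setting and today's hardcoded default):

```python
# Capacity math for the proposed setting versus the current hardcoded default.
warm_nodes = 8
proposed_limit_per_node = 3000   # hypothetical cluster.max_shards_per_node.warm
current_default_per_node = 1000  # hardcoded today

proposed_cap = warm_nodes * proposed_limit_per_node
current_cap = warm_nodes * current_default_per_node
print(proposed_cap, current_cap)  # 24000 8000
```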
Current Behavior
The REMOTE_CAPABLE shard limit for warm nodes is hardcoded at 1000 per node and
cannot be changed via any cluster setting. The only workarounds are:
- Adding more warm nodes (each adds 1000 to the limit)
- Closing unused searchable snapshot indices
- Reducing shard count when mounting snapshots
None of these are acceptable as a long-term solution for production clusters with
large amounts of searchable snapshot data.
Additional Details
No response