
Conversation

ywangd (Member) commented Oct 6, 2025

In stateless, index and search shards are distinct and must be allocated to nodes of the corresponding types. Shard limit validation should therefore be performed separately for each shard type, so that one type cannot take more quota than expected, similar to the existing separation between regular and frozen shards.

Resolves: ES-12884
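For context, a minimal sketch of the per-group check this change moves toward. Every name below (Group, checkShardLimits, maxShardsPerNode, newShards) is an illustrative stand-in, not the actual ShardLimitValidator API:

```java
import java.util.List;
import java.util.Optional;
import java.util.function.ToIntFunction;

// Illustrative only: each group is validated against its own quota, so one
// shard type (e.g. search) cannot consume another type's (e.g. index) budget.
final class PerGroupLimitSketch {
    record Group(String name, int nodeCount, int currentShards) {}

    static Optional<String> checkShardLimits(List<Group> groups, int maxShardsPerNode, ToIntFunction<Group> newShards) {
        for (Group group : groups) {
            int limit = maxShardsPerNode * group.nodeCount();
            if (group.currentShards() + newShards.applyAsInt(group) > limit) {
                // Fail for the group that is out of room, independently of the others.
                return Optional.of("this action would add [" + newShards.applyAsInt(group)
                    + "] shards, but the [" + group.name() + "] group can hold at most [" + limit + "]");
            }
        }
        return Optional.empty(); // all groups have room
    }
}
```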

Comment on lines 255 to 259
public enum ResultGroup {
NORMAL(NORMAL_GROUP),
FROZEN(FROZEN_GROUP),
INDEX("index"),
SEARCH("search");
ywangd (Member Author):
The PR is bigger and more involved than I initially expected because the current shard limit validation has a hard-coded two-member group for "normal" and "frozen" indices, and the group is decided by an index-level setting. Neither of these makes sense in Stateless.

The PR introduces ResultGroup so that the actual groups can be picked based on the setup. It also helps detach the grouping from the index-level setting. Overall it promotes the Group concept (previously a String), which in turn helps reuse the existing logic (much of which is based on the group in use).

Please let me know if this makes sense. Happy to provide more clarification.
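Roughly, the setup-based selection described here could look like the following sketch. The PR does add an applicableResultGroups(isStateless) helper (it appears in a later snippet), but this body is an illustrative guess, not the actual implementation:

```java
import java.util.List;

// Sketch of the setup-driven grouping described above.
final class GroupSelectionSketch {
    enum ResultGroup { NORMAL, FROZEN, INDEX, SEARCH }

    static List<ResultGroup> applicableResultGroups(boolean isStateless) {
        return isStateless
            ? List.of(ResultGroup.INDEX, ResultGroup.SEARCH)   // stateless: index vs search shards
            : List.of(ResultGroup.NORMAL, ResultGroup.FROZEN); // stateful: normal vs frozen indices
    }
}
```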

Contributor:

Is it worth using SPI to provide the ResultGroups, so serverless can override it and we avoid putting knowledge of those things in the core product?

It may be pedantic and not worth the effort, but it looks generalised enough that it could be done, and we do do it for some other things.

I guess ResultGroup would have to be an interface then.
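For illustration, the SPI shape being suggested might look something like this. This is entirely hypothetical; LimitGroupsProvider and its loading mechanism are not part of the PR:

```java
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical sketch of the SPI idea: core ships a stateful default and the
// serverless plugin registers its own provider. None of these types exist in the PR.
interface LimitGroupsProvider {
    List<String> limitGroups();

    static LimitGroupsProvider load() {
        return ServiceLoader.load(LimitGroupsProvider.class)
            .findFirst()                                  // serverless override, if registered
            .orElse(() -> List.of("normal", "frozen"));   // stateful default
    }
}
```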

ywangd (Member Author):

This is a good question and I did briefly consider it initially. I didn't go with it because (1) it's a relatively larger effort, (2) there is a bit of untangling needed for ShardsCapacityHealthIndicatorService, and (3) it's a bit easier to test everything together with the existing testing code.

Another reason is that the focus of this PR is a bug fix on the stateless side, and SPI feels more like an optimization. So I think pursuing SPI separately might be a better option. I can raise a separate ticket for it. Let me know if this makes sense or if you'd prefer we pursue it as part of this PR.

Contributor:

Yeah, I wouldn't block this on account of the SPI; I'll leave it up to you whether you think it's worth a follow-up.

ywangd (Member Author):

I'll leave this PR open until tomorrow. If there is no objection by then, I'll log a ticket for future optimization with SPI and merge this PR so that we can get the bug-fix part shipped.

ywangd added the >non-issue and :Distributed Coordination/Allocation labels on Oct 7, 2025
ywangd marked this pull request as ready for review on Oct 7, 2025 01:37
elasticsearchmachine added the Team:Distributed Coordination label on Oct 7, 2025
elasticsearchmachine (Collaborator): Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

nicktindall (Contributor) left a comment:

LGTM, just some nits and questions about whether it's worth putting the serverless stuff in the serverless codebase.

final var resultGroups = applicableResultGroups(isStateless);
final Map<ResultGroup, Integer> shardsToCreatePerGroup = new HashMap<>();

// TODO: we can short circuit when indindicesToOpenices is empty
Contributor:

Nit: typo in indindicesToOpenices; also, did you mean to act on this TODO before merging?

ywangd (Member Author):

The TODO is for the future, since I intend to keep the current behaviour as is. Fixed the typo in 241bf1b.

* - otherwise -> returns the Result of checking the limits for _frozen_ nodes
* - Check limits for _normal_ nodes
* - If there's no room -> return the Result for _normal_ nodes (fail-fast)
* - otherwise -> returns the Result of checking the limits for _frozen_ nodes
Contributor:

Nit: this javadoc probably needs to be generalised.

ywangd (Member Author):

Yep, generalized in 241bf1b.
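One possible generalised shape of that javadoc (hypothetical; the actual wording is in 241bf1b):

```java
/*
 * For each applicable limit group:
 * - Check the limits for nodes in that group
 * - If there's no room -> return the Result for that group (fail-fast)
 * - otherwise -> continue with the next group
 */
```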

case FROZEN -> nodeCount(discoveryNodes, ShardLimitValidator::hasFrozen);
case INDEX -> nodeCount(discoveryNodes, node -> node.hasRole(DiscoveryNodeRole.INDEX_ROLE.roleName()));
case SEARCH -> nodeCount(discoveryNodes, node -> node.hasRole(DiscoveryNodeRole.SEARCH_ROLE.roleName()));
};
Contributor:

Nit: could we make these abstract in the enum class and put the implementations on the individual declarations? That would avoid the need for the switch. Same as below.
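For reference, the per-constant pattern being suggested looks roughly like this standalone sketch (Node is a stand-in type, not the real DiscoveryNode, and these bodies are illustrative):

```java
import java.util.function.Predicate;

// Each constant carries its own node matcher, so callers dispatch without a switch.
final class EnumDispatchSketch {
    record Node(boolean frozen, boolean index, boolean search) {}

    enum LimitGroup {
        NORMAL { Predicate<Node> matcher() { return n -> !n.frozen(); } },
        FROZEN { Predicate<Node> matcher() { return Node::frozen; } },
        INDEX  { Predicate<Node> matcher() { return Node::index; } },
        SEARCH { Predicate<Node> matcher() { return Node::search; } };

        abstract Predicate<Node> matcher();
    }
}
```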

ywangd (Member Author):

Good point. Done in 035f2ef

* @return The total number of new shards to be created for this group.
*/
public int newShardsTotal(Settings indexSettings) {
final boolean frozen = FROZEN_GROUP.equals(INDEX_SETTING_SHARD_LIMIT_GROUP.get(indexSettings));
Contributor:

Nit: this is a bit tricky to read, perhaps inFrozenLimitGroup instead of frozen or something?

ywangd (Member Author):

I renamed it to isFrozenIndex. I didn't include "group" in the name since the enum (this) is itself the group in this context.

+ ReferenceDocs.MAX_SHARDS_PER_NODE;
}

public enum ResultGroup {
Contributor:

Nit: perhaps LimitGroup? I'm not entirely clear why "result" is in the name; perhaps I'm missing something.

ywangd (Member Author):

Changed it to LimitGroup, which seems overall better; see 0dcd553. The Result part was taken from the name of the validation Result inner class, which has a group field of the enum type (previously a String).

ywangd (Member Author) commented Oct 8, 2025

@elasticmachine update branch

Labels: >non-issue, :Distributed Coordination/Allocation, Team:Distributed Coordination, v9.3.0