
Conversation

@ywangd ywangd commented Jun 30, 2025

If the allocation of a shard is undesired, its snapshot now waits for the desired allocation.
@ywangd ywangd added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jun 30, 2025
@ywangd ywangd requested a review from DaveCTurner June 30, 2025 01:51
@elasticsearchmachine (Collaborator)

Hi @ywangd, I've created a changelog YAML for you.

@DaveCTurner DaveCTurner (Contributor) left a comment


I think this won't do what we want. The problematic situation is where a snapshot starts and then the cluster scales up, but the ongoing snapshot prevents shards from moving to the scaled-up nodes. In that case, at the time the snapshot starts all the shards are in their desired locations so they'll go straight through to INIT.

Instead we need to bound the number of shards in state INIT (on each node) so that, should a scale-up occur, most of the shards are free to move already.
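The alternative suggested here, bounding the number of INIT shards per node, could look roughly like the following. This is a minimal Python sketch, not Elasticsearch code; the cap `MAX_INIT_PER_NODE`, the function name, and the `QUEUED` state are all illustrative assumptions.

```python
MAX_INIT_PER_NODE = 2  # hypothetical per-node cap on concurrent shard snapshots

def assign_snapshot_states(shards_by_node):
    """Mark at most MAX_INIT_PER_NODE shards per node as INIT; the rest
    stay QUEUED and therefore remain free to relocate if the cluster
    scales up while the snapshot is running."""
    states = {}
    for node, shards in shards_by_node.items():
        for i, shard in enumerate(shards):
            states[shard] = "INIT" if i < MAX_INIT_PER_NODE else "QUEUED"
    return states

states = assign_snapshot_states({"node-1": ["s0", "s1", "s2", "s3"]})
# s0 and s1 go to INIT; s2 and s3 stay QUEUED and may still move
```

The point of the cap is that even mid-snapshot, most shards on each node are never pinned, so a scale-up can rebalance them immediately.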


ywangd commented Jun 30, 2025

> The problematic situation is where a snapshot starts and then the cluster scales up, but the ongoing snapshot prevents shards from moving to the scaled-up nodes. In that case, at the time the snapshot starts all the shards are in their desired locations so they'll go straight through to INIT.

Yeah, it is intentional that this PR does not fix that problem. It tries to fix the other problem, where continuous shard snapshots leave no quiet time for the shard to relocate. That is, it is an alternative to limiting the max number of concurrent snapshots. It helps to delay a shard snapshot when the cluster is unbalanced and the shard is on an undesired allocation, so the shard snapshot waits for the relocation to happen first instead of locking the shard down regardless, as we do today.

I think this case is somewhat orthogonal, because even with bounded INIT shards it is still theoretically possible for continuous snapshots to lock those in the INIT state for too long. I feel this could be an improvement over the current situation, though it does not fix it entirely.
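The PR's approach as described above can be sketched as a simple gate: a shard snapshot enters INIT only once the shard sits on its desired node. This is a hypothetical Python illustration, not the actual Elasticsearch implementation; the `WAITING` state name and function signature are assumptions.

```python
def shard_snapshot_state(current_node, desired_node):
    """Start the shard snapshot (INIT) only when the shard is already on
    its desired node; otherwise wait, so the relocation can happen first
    instead of the snapshot pinning the shard to an undesired node."""
    return "INIT" if current_node == desired_node else "WAITING"

# A shard on an undesired node waits; one on its desired node starts.
assert shard_snapshot_state("node-1", "node-2") == "WAITING"
assert shard_snapshot_state("node-2", "node-2") == "INIT"
```

As the review notes, this gate only helps when the allocation is already undesired at snapshot time; if the cluster scales up after the snapshot starts, every shard has already passed the check and entered INIT.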


DaveCTurner commented Jun 30, 2025

That's true, but still we have seen cases where even a single snapshot takes several hours to complete. The later snapshots aren't really the problem here; it's the first one that we need to address. Indeed, the reason this first snapshot takes so long appears to be the cluster scaling down after a spike in indexing, leaving it with severely restricted snapshot capacity. We need to find a way to allow the cluster to scale back up again in that case, even if only one snapshot is running.

If we limited the number of shards in state INIT then we wouldn't need this extra complexity.


ywangd commented Jun 30, 2025

> a single snapshot takes several hours to complete

My thinking is that an individual shard snapshot may not take all that time to complete. As shard snapshots complete, they become free to move with this change, whereas today they would be locked down again by a second snapshot. It does mean the single shard snapshot ran for more than 30 minutes, which is an issue this PR does not fix. In other words, it does not fix the initial slowness, but it can prevent it from deteriorating.

@ywangd ywangd closed this Jun 30, 2025

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement v9.2.0


3 participants