Avoid hoarding cluster state references during rollover #124107

nielsbauman · 2025-03-05T15:16:33Z

By keeping a list of all the rollover results in a rollover request batch, we were keeping references to all the intermediate cluster states that we built. We've seen this list take up ~1.4 GB with 600 rollover requests in one batch.

We only kept the list of results to compute the "reason" for the allocation reroute, so we can easily drop the cluster state reference from the list and only keep what we need.

Fixes #123893
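To illustrate the idea, here is a minimal sketch of a batch loop that advances through intermediate states but retains only a short string per rollover for the reroute reason. Every name in it (`FakeClusterState`, `FakeRolloverResult`, `executeBatch`, `rerouteReason`) and the format of the reason string are hypothetical stand-ins, not the actual Elasticsearch classes or the code changed in this PR.

```java
import java.util.ArrayList;
import java.util.List;

final class RolloverBatchSketch {

    /** Hypothetical stand-in for a large, immutable cluster state. */
    record FakeClusterState(long version, byte[] payload) {}

    /** Hypothetical result of one rollover: the index names plus the newly built state. */
    record FakeRolloverResult(String sourceIndex, String rolloverIndex, FakeClusterState state) {}

    /** What the batch hands back: the final state and one short description per rollover. */
    record BatchOutcome(FakeClusterState finalState, List<String> descriptions) {}

    /** Applies a single rollover and builds a new state, as an immutable-state design would. */
    static FakeRolloverResult rollover(FakeClusterState current, String alias) {
        long next = current.version() + 1;
        FakeClusterState updated = new FakeClusterState(next, new byte[current.payload().length]);
        return new FakeRolloverResult(alias + "-" + current.version(), alias + "-" + next, updated);
    }

    /**
     * Runs the whole batch. Only the latest state stays reachable; each intermediate
     * state becomes garbage as soon as the next one replaces it, because the list
     * holds strings rather than result objects that reference a state.
     */
    static BatchOutcome executeBatch(FakeClusterState initial, List<String> aliases) {
        FakeClusterState state = initial;
        List<String> descriptions = new ArrayList<>(aliases.size());
        for (String alias : aliases) {
            FakeRolloverResult result = rollover(state, alias);
            state = result.state();
            // Keep only what the reroute reason needs, not the result itself.
            descriptions.add(result.sourceIndex() + "->" + result.rolloverIndex());
        }
        return new BatchOutcome(state, descriptions);
    }

    /** Builds a reroute reason from the retained descriptions (the format is made up). */
    static String rerouteReason(List<String> descriptions) {
        return "bulk rollover [" + String.join(",", descriptions) + "]";
    }
}
```

Had `descriptions` been a `List<FakeRolloverResult>` instead, every intermediate `FakeClusterState` would stay reachable until the batch finished, which is the pattern the description above identifies as the cause of the memory growth.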

elasticsearchmachine · 2025-03-05T15:17:11Z

Pinging @elastic/es-data-management (Team:Data Management)

elasticsearchmachine · 2025-03-05T15:17:11Z

Hi @nielsbauman, I've created a changelog YAML for you.

dakrone

LGTM, thanks for fixing this Niels!

DaveCTurner

I wonder if we could come up with a variant of Strings.collectionToDelimitedStringWithLimit that builds the string incrementally. There's no need to keep all the result strings here when we're only keeping the first 1000 characters.

nielsbauman · 2025-03-06T17:34:35Z

> I wonder if we could come up with a variant of Strings.collectionToDelimitedStringWithLimit that builds the string incrementally. There's no need to keep all the result strings here when we're only keeping the first 1000 characters.

I'm sure there is something we can come up with. I don't think it'd be necessary for this PR/bug as the cost is pretty limited now. But ideally, it indeed wouldn't hurt to have something like that.
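As a rough sketch of what such an incremental variant could look like: the class below appends items as they arrive and only counts them once a character limit has been reached, so the individual strings never need to be collected. `LimitedReasonBuilder` is a hypothetical name, and both its truncation rule and its suffix format are assumptions made for illustration; it is not an existing Elasticsearch utility and does not claim to reproduce the behaviour of `Strings.collectionToDelimitedStringWithLimit`.

```java
final class LimitedReasonBuilder {

    private final String delimiter;
    private final int limit;
    private final StringBuilder sb = new StringBuilder();
    private int appended = 0; // items whose text was written to the buffer
    private int omitted = 0;  // items that arrived after the limit was reached

    LimitedReasonBuilder(String delimiter, int limit) {
        this.delimiter = delimiter;
        this.limit = limit;
    }

    /**
     * Appends the item if the buffer is still below the limit, otherwise only counts it.
     * The caller does not need to hold on to the item afterwards.
     */
    void add(String item) {
        if (sb.length() >= limit) {
            omitted++;
            return;
        }
        if (appended > 0) {
            sb.append(delimiter);
        }
        sb.append(item);
        appended++;
    }

    /** Returns the joined string, with a summary suffix if anything was left out. */
    String build() {
        if (omitted == 0) {
            return sb.toString();
        }
        int total = appended + omitted;
        return sb + "... (" + total + " in total, " + omitted + " omitted)";
    }
}
```

In this sketch the last appended item is written in full, so the result can exceed the limit by up to one item plus the suffix. A stricter variant could cut the final item at the limit instead.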

elasticsearchmachine · 2025-03-06T17:37:37Z

💔 Backport failed

Status	Branch	Result
❌	8.18	Commit could not be cherrypicked due to conflicts
❌	8.x	Commit could not be cherrypicked due to conflicts
✅	9.0

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 124107`


nielsbauman · 2025-03-06T18:59:00Z

💚 All backports created successfully

Status	Branch	Result
✅	8.x
✅	8.18
✅	8.17

Questions ?

Please refer to the Backport tool documentation

Avoid hoarding cluster state references during rollover (cherry picked from commit ff6465b)

Conflicts:
server/src/main/java/org/elasticsearch/action/admin/indices/rollover/LazyRolloverAction.java

… (#124265)

# Backport

This will backport the following commits from `main` to `8.x`:
- [Avoid hoarding cluster state references during rollover (#124107)](#124107)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

…) (#124266)

# Backport

This will backport the following commits from `main` to `8.18`:
- [Avoid hoarding cluster state references during rollover (#124107)](#124107)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

…) (#124267)

# Backport

This will backport the following commits from `main` to `8.17`:
- [Avoid hoarding cluster state references during rollover (#124107)](#124107)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

nielsbauman added >bug :Data Management/Indices APIs APIs to create and manage indices and templates Team:Data Management Meta label for data/management team auto-backport Automatically create backport pull requests when merged v8.18.1 v8.19.0 v9.0.1 v9.1.0 labels Mar 5, 2025

Update docs/changelog/124107.yaml

eb939e9

dakrone approved these changes Mar 5, 2025

DaveCTurner reviewed Mar 5, 2025

nielsbauman merged commit ff6465b into elastic:main Mar 6, 2025
17 checks passed

nielsbauman deleted the fix-rollover-oome branch March 6, 2025 17:34

nielsbauman mentioned this pull request Mar 6, 2025

[9.0] Avoid hoarding cluster state references during rollover (#124107) #124257

Merged

elasticsearchmachine added the backport pending label Mar 6, 2025

nielsbauman added the v8.17.4 label Mar 6, 2025

This was referenced Mar 6, 2025

[8.x] Avoid hoarding cluster state references during rollover (#124107) #124265

Merged

[8.18] Avoid hoarding cluster state references during rollover (#124107) #124266

Merged

[8.17] Avoid hoarding cluster state references during rollover (#124107) #124267

Merged

nielsbauman removed the backport pending label Nov 6, 2025
