Conversation

@Tim-Brooks
Contributor

With the rise of larger CPU-count nodes, our current write queue size might be too conservative. Indexing pressure will still provide protection against out-of-memory errors.
@Tim-Brooks Tim-Brooks requested a review from a team as a code owner June 26, 2025 03:13
@Tim-Brooks Tim-Brooks added :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v9.1.0 labels Jun 26, 2025
@elasticsearchmachine elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team v9.2.0 labels Jun 26, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@github-actions
Contributor

github-actions bot commented Jun 26, 2025

🔍 Preview links for changed docs:


@Tim-Brooks Tim-Brooks merged commit 1d3bd46 into elastic:main Jun 26, 2025
32 checks passed
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Jun 27, 2025
@nicpenning
Contributor

Curtis, how does this apply to current clusters that are heavily CPU-allocated?

For example, 7 nodes at 16 CPUs each. Is 10k still the best or would this new setting open up that 10k limit and be a larger value?

Just trying to understand the logic here, because we have been told over and over again that increasing this 10k write queue is not advisable, so I'm wondering what has changed here.

mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025
@Tim-Brooks
Contributor Author

For example, 7 nodes at 16 CPUs each. Is 10k still the best or would this new setting open up that 10k limit and be a larger value?

This setting would make the queue size 12,000 for a 16-core CPU: max(10,000, 750 * cores).
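That sizing rule can be sketched as follows (an illustrative sketch only; the class and constant names here are assumed, not the actual Elasticsearch identifiers):

```java
// Sketch of the write queue sizing described above: the queue capacity
// scales with the number of allocated processors, floored at the previous
// fixed default of 10,000.
public class WriteQueueSize {
    static final int MIN_QUEUE_SIZE = 10_000; // previous fixed default
    static final int PER_CORE = 750;          // per-core scaling factor

    static int queueSize(int allocatedProcessors) {
        return Math.max(MIN_QUEUE_SIZE, PER_CORE * allocatedProcessors);
    }

    public static void main(String[] args) {
        System.out.println(queueSize(16)); // 12000 for a 16-core node
        System.out.println(queueSize(8));  // 10000: the floor still applies
        System.out.println(queueSize(32)); // 24000 for a 32-core node
    }
}
```

Note that for nodes with 13 or fewer cores (750 * 13 = 9,750 < 10,000) the floor dominates, so smaller nodes see no change from the previous default.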

Just trying to understand the logic here, because we have been told over and over again that increasing this 10k write queue is not advisable, so I'm wondering what has changed here.

Some internal benchmarks showed 32-core machines exhausting the thread pool queue while still maintaining manageable latency, which led us to make this tweak for larger machines.

We would only advise changing it if a user is advanced and has a clear understanding of their load. There are surely workloads out there with many, many small tasks that exhaust the queue without the node actually being backed up. If a benchmark shows that scenario, then increasing the queue might be a way to address it.

But in the standard scenario, a consistently exhausted queue just means under-provisioning, and increasing the queue will only lead to high write latencies, timeouts, retries, etc.

@nicpenning
Contributor

Sounds great, thank you! This does help and I appreciate that breakdown!

Labels

:Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >non-issue Team:Distributed Indexing Meta label for Distributed Indexing team v9.2.0
