Contributor

@mosche mosche commented Mar 16, 2025

Backports the following commits to 9.0:

…h core pool size = 0 (elastic#124732)

When `ExecutorScalingQueue` rejects work to make the worker pool scale up while the pool is already at max pool size (so a new worker cannot be added), available workers might time out at almost the same moment the task is force queued by `ForceQueuePolicy`. The queued task is then left without a worker to run it. This has caused starvation of work, as observed for `masterService#updateTask` in elastic#124667, where a max pool size of 1 is used. That configuration is the most likely to expose the bug.
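
To make the mechanism concrete, here is a minimal sketch of the "reject to scale, then force queue" pattern using only `java.util.concurrent`. It is not the Elasticsearch code: `RejectingQueue`, `FORCE_QUEUE`, and `runTasks` are hypothetical stand-ins, and the real `ExecutorScalingQueue` and `ForceQueuePolicy` do more than this. The sketch shows the normal path only. The starvation described above needs the idle worker to time out in the window between the rejection and the force queue, which the long keep-alive here makes very unlikely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ScalingQueueSketch {

    /** Hypothetical stand-in for a scaling queue: hand work to an idle worker or report "full". */
    static final class RejectingQueue extends LinkedTransferQueue<Runnable> {
        @Override
        public boolean offer(Runnable task) {
            // Succeeds only if a worker is already blocked waiting for work.
            // Returning false makes ThreadPoolExecutor try to add a new worker.
            return tryTransfer(task);
        }

        void forceQueue(Runnable task) {
            // Bypasses the override above; the underlying queue is unbounded.
            super.offer(task);
        }
    }

    /** Hypothetical stand-in for a force-queue policy: invoked when no worker could be added. */
    static final RejectedExecutionHandler FORCE_QUEUE = (task, executor) ->
        ((RejectingQueue) executor.getQueue()).forceQueue(task);

    /** Runs taskCount trivial tasks on a pool with core size 0 and max size 1; returns how many completed. */
    static int runTasks(int taskCount) {
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(0, 1, 30, TimeUnit.SECONDS, new RejectingQueue(), FORCE_QUEUE);
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    completed.incrementAndGet();
                    done.countDown();
                });
            }
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runTasks(5));
    }
}
```

In this sketch the first task starts the single worker, and later tasks are either handed directly to the idle worker or rejected and force queued for it to pick up.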

This PR changes `EsExecutors.newScaling` to not use `ExecutorScalingQueue` if max pool size is 1 (and core pool size is 0). A regular `LinkedTransferQueue` works perfectly fine in this case.
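
As an illustration of why a plain queue is enough for this configuration, the sketch below builds a pool with core size 0 and max size 1 on a regular `LinkedTransferQueue`. The class and method names are hypothetical and this is not what `EsExecutors.newScaling` looks like. It relies only on standard JDK behavior: when `ThreadPoolExecutor.execute` queues a task and finds no live worker, it starts one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PlainQueueSketch {

    /**
     * Runs taskCount trivial tasks on a pool with core size 0 and max size 1
     * backed by an unbounded LinkedTransferQueue.
     * Returns {completed task count, largest pool size observed by the executor}.
     */
    static int[] runTasks(int taskCount) {
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(0, 1, 30, TimeUnit.SECONDS, new LinkedTransferQueue<>());
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                // offer() on the unbounded queue always succeeds, so nothing is rejected;
                // the executor itself starts the single worker when the pool is empty.
                executor.execute(() -> {
                    completed.incrementAndGet();
                    done.countDown();
                });
            }
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return new int[] { completed.get(), executor.getLargestPoolSize() };
    }

    public static void main(String[] args) {
        int[] result = runTasks(5);
        System.out.println("completed=" + result[0] + " largestPoolSize=" + result[1]);
    }
}
```

With a single worker there is nothing to scale up to, so the queue never needs to reject work and the force-queue path is never taken.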

If max pool size > 1, a probing approach is used to ensure the worker pool is adequately scaled to at least 1 worker after force queueing work in `ForceQueuePolicy`.
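
One way such a probe could work is sketched below. This is an assumption-laden illustration, not the code from this PR: `WORKER_PROBE`, `RejectingQueue`, and `FORCE_QUEUE_WITH_PROBE` are hypothetical names, and the head-of-queue check is just one possible heuristic for "the queue was empty, so workers may have timed out". After force queueing, the handler submits a no-op task. If the pool has room, the executor starts a worker that runs the probe and then drains the queue. If the pool is already at max size, the probe is rejected and dropped.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ProbingSketch {

    /** Hypothetical no-op task whose only purpose is to make the executor start a worker if needed. */
    static final Runnable WORKER_PROBE = () -> {};

    /** Hypothetical scaling queue: hand work to an idle worker, otherwise reject to trigger scale-up. */
    static final class RejectingQueue extends LinkedTransferQueue<Runnable> {
        @Override
        public boolean offer(Runnable task) {
            return tryTransfer(task);
        }

        void forceQueue(Runnable task) {
            super.offer(task); // unbounded, always enqueues
        }
    }

    /** Hypothetical force-queue policy that probes the pool after queueing. */
    static final RejectedExecutionHandler FORCE_QUEUE_WITH_PROBE = (task, executor) -> {
        if (task == WORKER_PROBE) {
            // A rejected probe means no worker could be added, i.e. the pool is at max size.
            return;
        }
        RejectingQueue queue = (RejectingQueue) executor.getQueue();
        queue.forceQueue(task);
        if (queue.peek() == task) {
            // The task sits at the head, so the queue was probably empty and idle
            // workers may have timed out in the meantime: ask for a worker.
            executor.execute(WORKER_PROBE);
        }
    };

    /** Runs taskCount trivial tasks; returns {completed task count, largest pool size}. */
    static int[] runTasks(int taskCount, int maxPoolSize) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            0, maxPoolSize, 30, TimeUnit.SECONDS, new RejectingQueue(), FORCE_QUEUE_WITH_PROBE);
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    completed.incrementAndGet();
                    done.countDown();
                });
            }
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return new int[] { completed.get(), executor.getLargestPoolSize() };
    }

    public static void main(String[] args) {
        int[] result = runTasks(20, 2);
        System.out.println("completed=" + result[0] + " largestPoolSize=" + result[1]);
    }
}
```

The probe itself does no work, so an extra probe costs little, while a missing worker would leave the queued task stranded.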

Fixes elastic#124667
Relates to elastic#18613
@mosche mosche added the :Core/Infra/Core (Core issues without another label) label Mar 16, 2025
@mosche mosche requested a review from a team as a code owner March 16, 2025 16:44
@mosche mosche added the >bug, auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!), backport, and Team:Core/Infra (Meta label for core/infra team) labels Mar 16, 2025
@elasticsearchmachine elasticsearchmachine merged commit 542f83a into elastic:9.0 Mar 16, 2025
16 checks passed
@mosche mosche deleted the backport/9.0/pr-124732 branch March 16, 2025 17:49
Labels

auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!), backport, >bug, :Core/Infra/Core (Core issues without another label), Team:Core/Infra (Meta label for core/infra team), v9.0.1

2 participants