Adjust the formula for "adaptive replica selection" #144562
Merged
benchaplin merged 11 commits into elastic:main on Mar 27, 2026
Collaborator
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Collaborator
Hi @benchaplin, I've created a changelog YAML for you.
jimczi
approved these changes
Mar 19, 2026
Contributor
jimczi
left a comment
I left one comment regarding the usage of a setting, LGTM otherwise
server/src/main/java/org/elasticsearch/node/ResponseCollectorService.java
jimczi
approved these changes
Mar 26, 2026
Makes a change to the "C3 ranking formula" used for adaptive replica selection. This change is gated behind a dynamic setting and feature flag. My intention is to test it gradually, first in nightly benchmarks (where it will be enabled immediately due to the feature flag), then in Serverless by overriding the setting. Eventually, the feature flag can be removed and the setting default set to true for stateful release.
This change is motivated by a couple of inconsistencies between the intention of the formula and the current state of search.
ARS consists of 3 stats: response time (R), service time (S), and queue size (q).
The first term of the C3 ranking formula is: (R - S), a term meant to isolate network overhead and queue time.
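To make the (R - S) term concrete, here is a minimal Java sketch. It is not the code in ResponseCollectorService: the class and method names are hypothetical, and the smoothing factor 0.3 is only an illustrative value. It assumes R and S are each tracked as an exponentially-weighted moving average and that the first term is their plain difference.

```java
class FirstTermSketch {
    /** One EWMA update step: alpha is the weight given to the newest sample. */
    static double ewma(double previous, double sample, double alpha) {
        return alpha * sample + (1 - alpha) * previous;
    }

    /** First term of the ranking formula as described above: R - S. */
    static double firstTerm(double responseTimeMs, double serviceTimeMs) {
        return responseTimeMs - serviceTimeMs;
    }

    public static void main(String[] args) {
        double alpha = 0.3;          // illustrative smoothing factor (assumption)
        double r = 15.0;             // EWMA of response time, ms
        double s = 10.0;             // EWMA of service time, ms

        // Hypothetical samples: {response time, service time} in ms.
        // Service time stays flat while response time grows, so R - S grows too.
        double[][] samples = { {15, 10}, {17, 10}, {25, 10} };
        for (double[] sample : samples) {
            r = ewma(r, sample[0], alpha);
            s = ewma(s, sample[1], alpha);
            System.out.println("R=" + r + " S=" + s + " R-S=" + firstTerm(r, s));
        }
    }
}
```

When S covers the whole time the node spent on the request, the difference is what is left over for network and queueing. The inconsistencies below are both cases where S no longer covers the whole request.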
Inconsistency 1: Since inter-segment search concurrency was introduced to the query phase (QUERY_PHASE_PARALLEL_COLLECTION_ENABLED), search threads may now only process a portion of the shard query. That means the value of S may be significantly smaller (it's an exponentially-weighted moving average), and the (R - S) term far overestimates network latency and queueing.
Inconsistency 2: Batched query execution (BATCHED_QUERY_PHASE) complicates this further. If multiple shards are batched into the same transport request, the "total response time" R will be for the entire batch, making (R - S) further overestimate network latency and queueing.
In the future, it might be best to separate "service time" into two values:
Then adjust the formula to: (R - S_total) + f(S_q, q). This would bring network overhead costs back into the equation. But for now, this PR is a quicker way forward.
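The two inconsistencies and the possible future formula can be illustrated with a toy model in Java. Everything here is an assumption made for illustration: the class and method names are hypothetical, S_total is taken to mean the wall-clock time the node spent on the whole request, S_q is taken to mean the per-task service time, and f is a placeholder (S_q * q), since the description does not pin any of these down.

```java
import java.util.function.DoubleBinaryOperator;

class AdjustedRankSketch {
    /** Current first term: R - S. */
    static double currentFirstTerm(double r, double s) {
        return r - s;
    }

    /** Toy R for one shard query split into slices that run on a limited number of threads. */
    static double responseTimeWithSlices(double overheadMs, double sliceMs, int slices, int threads) {
        double waves = Math.ceil((double) slices / threads);
        return overheadMs + sliceMs * waves;
    }

    /** Toy R for a batch of shards executed one after another in a single transport request. */
    static double responseTimeForBatch(double overheadMs, double shardMs, int shards) {
        return overheadMs + shardMs * shards;
    }

    /** Possible future shape: (R - S_total) + f(S_q, q), with f supplied by the caller. */
    static double proposedRank(double r, double sTotal, double sQ, double q, DoubleBinaryOperator f) {
        return (r - sTotal) + f.applyAsDouble(sQ, q);
    }

    public static void main(String[] args) {
        double overheadMs = 5.0; // true network + queue time in this toy model

        // Inconsistency 1: 4 slices of 10 ms on 2 threads, S recorded per slice.
        double r1 = responseTimeWithSlices(overheadMs, 10.0, 4, 2);
        System.out.println("slices: R-S=" + currentFirstTerm(r1, 10.0) + " true overhead=" + overheadMs);

        // Inconsistency 2: 3 shards of 10 ms in one batch, S recorded per shard.
        double r2 = responseTimeForBatch(overheadMs, 10.0, 3);
        System.out.println("batch:  R-S=" + currentFirstTerm(r2, 10.0) + " true overhead=" + overheadMs);

        // Future formula with a placeholder f; (R - S_total) recovers the overhead.
        DoubleBinaryOperator f = (sQ, q) -> sQ * q; // placeholder, not from the PR
        double sTotal = r1 - overheadMs;            // wall-clock service time for the request
        System.out.println("proposed=" + proposedRank(r1, sTotal, 10.0, 2.0, f));
    }
}
```

In both scenarios the current first term comes out larger than the 5 ms of overhead the model put in, while (R - S_total) returns exactly that overhead. The queue term is left to f, whose real shape would need to be decided separately.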