Adjust the formula for "adaptive replica selection" #144562

Merged
benchaplin merged 11 commits into elastic:main from benchaplin:adjust_ars_formula
Mar 27, 2026

@benchaplin
Contributor

Makes a change to the "C3 ranking formula" used for adaptive replica selection. This change is gated behind a dynamic setting and feature flag. My intention is to test it gradually, first in nightly benchmarks (where it will be enabled immediately due to the feature flag), then in Serverless by overriding the setting. Eventually, the feature flag can be removed and the setting default set to true for stateful release.

This change is motivated by a couple of inconsistencies between the intention of the formula and the current state of search.

ARS consists of 3 stats:

  • R: total response time - from the coordinator's perspective, including network latency, queueing etc.
  • S: service time - the time it takes for the search thread to do the job
  • q: the size of the queue

The first term of the C3 ranking formula is: (R - S), a term meant to isolate network overhead and queue time.
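
As a rough illustration of where the (R - S) term sits, here is a minimal sketch of a C3-style rank. Only the first term comes from this description; the cubic queue term follows the commonly cited form of the C3 formula and is an assumption here, as are the function and parameter names. It is not the Elasticsearch implementation.

```python
def c3_rank(response_time_ms, service_time_ms, queue_estimate, exponent=3):
    """Illustrative C3-style rank; a lower value means a more attractive replica.

    response_time_ms: R, as seen by the coordinator
    service_time_ms:  S, time the search thread spent on the work
    queue_estimate:   q, estimated queue size on the node
    """
    # First term: meant to isolate network overhead and queue time.
    overhead = response_time_ms - service_time_ms
    # Assumed queue term: queue estimate raised to a power, scaled by S.
    queue_cost = (queue_estimate ** exponent) * service_time_ms
    return overhead + queue_cost


rank_a = c3_rank(20.0, 15.0, 1.0)  # (20 - 15) + 1 * 15
rank_b = c3_rank(30.0, 15.0, 2.0)  # (30 - 15) + 8 * 15
```

With these made-up inputs the first replica ranks lower and would be preferred.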

Inconsistency 1: Since inter-segment search concurrency was introduced to the query phase (QUERY_PHASE_PARALLEL_COLLECTION_ENABLED), a search thread may now process only a portion of the shard query. That means the value of S, which is an exponentially weighted moving average, may be significantly smaller, so the (R - S) term far overestimates network latency and queueing.
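
To make Inconsistency 1 concrete, here is a small worked sketch with hypothetical numbers. It assumes the slices do not run perfectly in parallel, so the shard query's wall-clock time (25 ms) is larger than the time a single thread reports for its slice (10 ms). The smoothing factor and all names are illustrative assumptions.

```python
def ewma(previous, sample, alpha=0.25):
    """One step of an exponentially weighted moving average."""
    return alpha * sample + (1.0 - alpha) * previous


# Hypothetical numbers, chosen only for illustration.
TRUE_OVERHEAD_MS = 5.0      # real network latency + queue time
SHARD_WALL_TIME_MS = 25.0   # wall-clock time to run the whole shard query
SLICE_TASK_TIME_MS = 10.0   # time one search thread spends on its slice

response_time_ms = TRUE_OVERHEAD_MS + SHARD_WALL_TIME_MS  # R = 30


def settled_service_time(sample_ms, start_ms=SHARD_WALL_TIME_MS, steps=50):
    """Feed the same sample repeatedly until the EWMA has settled."""
    s = start_ms
    for _ in range(steps):
        s = ewma(s, sample_ms)
    return s


# S when a thread reports the whole shard query vs. only its slice.
s_whole_shard = settled_service_time(SHARD_WALL_TIME_MS)
s_per_slice = settled_service_time(SLICE_TASK_TIME_MS)

# (R - S) under each way of measuring S.
overhead_whole_shard = response_time_ms - s_whole_shard
overhead_per_slice = response_time_ms - s_per_slice
```

Under these assumptions, (R - S) matches the true 5 ms overhead when S covers the whole shard query, but settles near 20 ms once S only reflects a slice.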

Inconsistency 2: Batched query execution (BATCHED_QUERY_PHASE) complicates this further. If multiple shards are batched into the same transport request, the "total response time" R will be for the entire batch, making (R - S) further overestimate network latency and queueing.
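
The batching effect can be sketched the same way. This example assumes, purely for illustration, that the shards in a batch are processed one after another and that S reflects the average per-shard time; the description above only states that R covers the entire batch. All names and numbers are hypothetical.

```python
def apparent_overhead_ms(shard_service_times_ms, network_and_queue_ms):
    """Return (R - S) when R covers a whole batch but S is per shard.

    Assumes sequential processing of the batched shards (an assumption,
    not something stated in the PR).
    """
    # R: one response time for the entire batched transport request.
    batch_response_time_ms = network_and_queue_ms + sum(shard_service_times_ms)
    # S: average time to serve a single shard.
    mean_service_time_ms = sum(shard_service_times_ms) / len(shard_service_times_ms)
    return batch_response_time_ms - mean_service_time_ms


single = apparent_overhead_ms([10.0], 5.0)               # unbatched request
batched = apparent_overhead_ms([10.0, 10.0, 10.0], 5.0)  # three shards per request
```

With a true overhead of 5 ms in both cases, the unbatched request yields 5 ms while the three-shard batch yields 25 ms.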


In the future, it might be best to separate "service time" into two values:

  • S_total: the time it takes the data node to process the query (batched or single-shard) including queue time
  • S_q: the time it takes for a thread to complete a search task (entire shard or slice)

Then adjust the formula to: (R - S_total) + f(S_q, q). This would bring network overhead costs back into the equation. But for now, this PR is a quicker way forward.
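
A minimal sketch of that possible future formula follows. The shape of f is not specified above; the cubic form used here is an assumption borrowed from the commonly cited C3 queue term, and every name is hypothetical.

```python
def cubic_queue_cost(task_service_time_ms, queue_size):
    """Assumed form of f(S_q, q): cubic in the queue size, scaled by S_q."""
    return (queue_size ** 3) * task_service_time_ms


def proposed_rank(response_time_ms, total_service_time_ms,
                  task_service_time_ms, queue_size, f=cubic_queue_cost):
    """(R - S_total) + f(S_q, q); a lower value means a more attractive replica."""
    # S_total covers the whole request on the data node, including queue
    # time, so this difference is left with network overhead.
    network_overhead = response_time_ms - total_service_time_ms
    return network_overhead + f(task_service_time_ms, queue_size)


# Hypothetical values: R = 45 ms, S_total = 40 ms, S_q = 10 ms, q = 2.
rank = proposed_rank(45.0, 40.0, 10.0, 2.0)
```

Passing f as a parameter keeps the sketch neutral about the queue term, since only the overall structure is given here.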

benchaplin requested a review from a team as a code owner on March 19, 2026 13:06
benchaplin added the labels >bug, Team:Search Foundations, :Search Foundations/Search on Mar 19, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Hi @benchaplin, I've created a changelog YAML for you.

Contributor

@jimczi jimczi left a comment

I left one comment regarding the usage of a setting; LGTM otherwise.

@benchaplin benchaplin merged commit ef51c81 into elastic:main Mar 27, 2026
36 checks passed

Labels

>bug, :Search Foundations/Search, Team:Search Foundations, v9.4.0

3 participants