@ioanatia ioanatia commented Oct 1, 2025

Initially we passed scorer.leafReaderContext().reader().maxDoc() to the bulk scorer, which showed great improvements.
However, this also keeps the current thread busy until it has scored all the docs in the current reader.
There are ways to improve this, but they would require more changes.

For now I wanted to push a quick fix to mitigate some of the regressions we see now that maxPageSize has become smaller after #108412.
We should still see a significant performance boost just by scoring docs in batches of 4096 docs instead of maxPageSize.
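
To illustrate the trade-off described above, here is a minimal sketch of how a fixed interval splits a segment into bounded ranges, so that each scoring call finishes after at most 4096 docs and the thread can hand control back in between. It does not use Lucene or the real operator: the class name, the `ranges` helper, and the 10,000-doc segment are hypothetical, and the value of `NUM_DOCS_INTERVAL` is assumed to be the 4096 mentioned in the description.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedIntervalSketch {
    // Assumption: mirrors the 4096-doc batch size mentioned in the description.
    static final int NUM_DOCS_INTERVAL = 4096;

    /** Splits [0, maxDoc) into the [from, to) ranges that successive scoring calls would cover. */
    static List<int[]> ranges(int maxDoc, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        List<int[]> out = new ArrayList<>();
        int position = 0;
        while (position < maxDoc) {
            // Widen to long so position + interval cannot overflow an int.
            int to = (int) Math.min((long) maxDoc, (long) position + interval);
            out.add(new int[] { position, to });
            position = to;
        }
        return out;
    }

    public static void main(String[] args) {
        int maxDoc = 10_000; // hypothetical segment size
        for (int[] r : ranges(maxDoc, NUM_DOCS_INTERVAL)) {
            System.out.println("score [" + r[0] + ", " + r[1] + "), then return control");
        }
    }
}
```

Passing maxDoc as the interval collapses this to a single range covering the whole segment, which is the case that keeps the thread busy until the reader is exhausted.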

@ioanatia ioanatia added >non-issue Team:Search - Relevance The Search organization Search Relevance team v9.2.0 :Search Relevance/ES|QL Search functionality in ES|QL labels Oct 1, 2025
@ioanatia ioanatia changed the title ES|QL: Pass max doc to scorer instead of maxPageSize ES|QL: Pass fix size instead of maxPageSize to LuceneTopNOperator scorer Oct 2, 2025
@ioanatia ioanatia added auto-backport Automatically create backport pull requests when merged v9.2.0 labels Oct 2, 2025
@ioanatia ioanatia marked this pull request as ready for review October 2, 2025 10:51
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Oct 2, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine removed the Team:Search - Relevance The Search organization Search Relevance team label Oct 2, 2025
  }
  var leafCollector = perShardCollector.getLeafCollector(scorer.leafReaderContext());
- scorer.scoreNextRange(leafCollector, scorer.leafReaderContext().reader().getLiveDocs(), maxPageSize);
+ scorer.scoreNextRange(leafCollector, scorer.leafReaderContext().reader().getLiveDocs(), NUM_DOCS_INTERVAL);
Member

Do we need to lock this to the max of the range?

It looks like CancellableBulkScorer makes this bigger and bigger with time. But I think this is good and we can get it in and iterate.

Contributor Author

We do something similar here, at least we try to avoid overflows:

void scoreNextRange(LeafCollector collector, Bits acceptDocs, int numDocs) throws IOException {
    assert isDone() == false : "scorer is exhausted";
    // avoid overflow and limit the range
    numDocs = Math.min(maxPosition - position, numDocs);
    assert numDocs > 0 : "scorer was exhausted";
    position = bulkScorer.score(collector, acceptDocs, position, Math.min(maxPosition, position + numDocs));
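
The clamping in the snippet above can be tried in isolation with a small stand-in. This is only a sketch: `ClampSketch` is a hypothetical class, and the call to `bulkScorer.score(...)` is replaced by simply advancing `position` to the upper bound, which is an assumption and not how the real bulk scorer reports progress.

```java
public class ClampSketch {
    private int position;
    private final int maxPosition;

    ClampSketch(int position, int maxPosition) {
        this.position = position;
        this.maxPosition = maxPosition;
    }

    boolean isDone() {
        return position >= maxPosition;
    }

    /** Clamps the requested range to what is left and returns the upper bound that was used. */
    int scoreNextRange(int numDocs) {
        if (isDone()) {
            throw new IllegalStateException("scorer is exhausted");
        }
        // Clamp first: after this, position + numDocs can never exceed maxPosition,
        // so the addition below cannot overflow even for a huge requested numDocs.
        numDocs = Math.min(maxPosition - position, numDocs);
        int upper = Math.min(maxPosition, position + numDocs);
        position = upper; // stand-in for bulkScorer.score(collector, acceptDocs, position, upper)
        return upper;
    }

    public static void main(String[] args) {
        ClampSketch scorer = new ClampSketch(5_000, 10_000);
        // Without the clamp, 5_000 + Integer.MAX_VALUE would wrap to a negative int.
        System.out.println("upper bound used: " + scorer.scoreNextRange(Integer.MAX_VALUE));
        System.out.println("done: " + scorer.isDone());
    }
}
```

With the clamp in place, a caller can pass any positive interval, including one larger than the number of docs left in the range.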

Member

Ah. got it.

@ioanatia ioanatia requested a review from nik9000 October 2, 2025 13:46
Member

@kderusso kderusso left a comment

LGTM, but consider labeling with >bug and creating a release note entry as this would be impacting serverless performance?

Member

@carlosdelest carlosdelest left a comment

Awesome finding! 🚤

@ioanatia ioanatia added >bug and removed >non-issue labels Oct 2, 2025
@elasticsearchmachine
Collaborator

Hi @ioanatia, I've created a changelog YAML for you.

@ioanatia ioanatia merged commit 1521291 into elastic:main Oct 2, 2025
34 of 35 checks passed
@ioanatia ioanatia deleted the bulk_scorer_fix branch October 2, 2025 18:36
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
9.2

Labels

auto-backport Automatically create backport pull requests when merged >bug :Search Relevance/ES|QL Search functionality in ES|QL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.2.0 v9.3.0
