
Conversation

ioanatia
Contributor

@ioanatia ioanatia commented Oct 8, 2025

The LuceneTopNOperator cannot output pages until it has collected the top N docs, which requires scoring all potential matches.

LuceneTopNOperator scores documents in batches: each call to runSingleLoopIteration from Driver scores a single batch of documents in LuceneTopNOperator with scorer.scoreNextRange(....

In runSingleLoopIteration we iterate over all the operators and check whether each one has a page to output, so we can feed it to the next operator. This method is called in a loop in Driver#run. As a result, the driver does a lot of extra work before LuceneTopNOperator starts emitting pages.

What we want is for LuceneTopNOperator#getOutput to emit a page the first time it is called, rather than requiring multiple calls before any page comes out.
However, we still need to make sure we don't block execution in a way that prevents the query from being cancelled while LuceneTopNOperator#getOutput runs.
To mitigate this, we pass the DriverContext to the LuceneTopNOperator so that we can check whether the query has been cancelled with driverContext.checkForEarlyTermination().
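
A minimal sketch of the idea, not the actual Elasticsearch code: the operator keeps scoring batches inside a single getOutput call and runs a cancellation check before each batch. The class `TopNCollectSketch`, its fields, and the string "page" are hypothetical stand-ins; `earlyTerminationCheck` plays the role of driverContext.checkForEarlyTermination(), which is assumed here to signal cancellation by throwing.

```java
/** Hypothetical stand-in for an operator that must score every batch before it can emit a page. */
final class TopNCollectSketch {
    private final int totalBatches;
    // Stand-in for driverContext.checkForEarlyTermination(); assumed to throw when the query is cancelled.
    private final Runnable earlyTerminationCheck;
    private int scoredBatches;

    TopNCollectSketch(int totalBatches, Runnable earlyTerminationCheck) {
        this.totalBatches = totalBatches;
        this.earlyTerminationCheck = earlyTerminationCheck;
    }

    int scoredBatches() {
        return scoredBatches;
    }

    /** Scores all remaining batches in one call, checking for cancellation before each batch. */
    String getOutput() {
        while (scoredBatches < totalBatches) {
            earlyTerminationCheck.run(); // propagates an exception if the query was cancelled
            scoredBatches++;             // stand-in for scoring one batch of documents
        }
        return "page built from " + scoredBatches + " batches";
    }
}
```

With a no-op check, `new TopNCollectSketch(5, () -> {}).getOutput()` produces its page on the first call; with a check that throws, the loop stops at the batch boundary where cancellation was observed.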

We can do a similar optimization for LuceneCountOperator if we are happy with this one.

@ioanatia ioanatia added the >non-issue, Team:Search - Relevance, :Search Relevance/ES|QL, and v9.3.0 labels Oct 8, 2025
@ioanatia ioanatia requested review from dnhatn and nik9000 October 8, 2025 12:13
@ioanatia ioanatia marked this pull request as ready for review October 8, 2025 12:13
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label and removed the Team:Search - Relevance label Oct 8, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Member

@nik9000 nik9000 left a comment


Code looks good to me. One question before merging: could this run for, like, minutes? I imagine in some rare cases with runtime fields and no timestamp query it's possible, though unlikely. But what about other cases? Like, in anything approximating a "normal" case?

I ask because we don't have a thing that updates the driver status outside of the main loop. So if this runs "forever" then we won't see it in the status. I don't think this is a problem here, but it might be worth a comment.

Mostly I'm having flashbacks to that time when I made an agg run for a week.


// If we stayed longer than 1 second to execute getOutput, we should return back to the driver, so we can update its status.
// Even if this should almost never happen, we want to update the driver status even when a query runs "forever".
if (TimeUnit.SECONDS.convert(System.nanoTime() - start, TimeUnit.NANOSECONDS) >= 1) {
Member


I'd yank this into a constant - maybe even one in nanos. Just a little more expected.
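
A minimal sketch of what that suggestion could look like, assuming a hypothetical constant name and helper; the names used in the merged change are not shown in this thread.

```java
import java.util.concurrent.TimeUnit;

final class TimeBudgetSketch {
    /** Hypothetical name: the longest we stay inside one getOutput call, in nanoseconds. */
    static final long MAX_TIME_IN_GET_OUTPUT_NANOS = TimeUnit.SECONDS.toNanos(1);

    /**
     * Returns true once the time elapsed since startNanos reaches the budget.
     * Subtracting first keeps the comparison valid even if System.nanoTime() wraps around.
     */
    static boolean budgetExceeded(long startNanos, long nowNanos) {
        return nowNanos - startNanos >= MAX_TIME_IN_GET_OUTPUT_NANOS;
    }
}
```

The check would then read `if (TimeBudgetSketch.budgetExceeded(start, System.nanoTime()))`, which avoids converting units on every call.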

Contributor Author


yep - makes sense - guess I was in too much of a hurry - addressed.

Member


Not too much of a hurry, just a little thing.

Member

@dnhatn dnhatn left a comment


LGTM. Thanks @ioanatia

Member

@nik9000 nik9000 left a comment


Looks great.

@ioanatia ioanatia merged commit 22a7491 into elastic:main Oct 8, 2025
34 checks passed
@ioanatia ioanatia deleted the pass_driver_context_to_operator branch October 8, 2025 17:34
Labels
>non-issue, :Search Relevance/ES|QL, Team:Search Relevance, v9.3.0

4 participants