Fix early termination in LuceneSourceOperator #123197
Pinging @elastic/es-analytical-engine (Team:Analytics)
Hi @dnhatn, I've created a changelog YAML for you.
public void testEarlyTermination() {
    int size = between(1_000, 20_000);
    int limit = between(10, size);
Just to confirm, max in between is exclusive and limit is always strictly less than size?
Limit can be larger than the size.
@idegtiarenko Thanks for reviewing.
You can use sqren/backport to manually backport by running
The LuceneSourceOperator is supposed to terminate when it reaches the limit; unfortunately, we don't have a test to cover this. Due to this bug, we continue scanning all segments, even though we discard the results as the limit was reached. This can cause performance issues for simple queries like FROM .. | LIMIT 10, when Lucene indices are on the warm or cold tier. I will submit a follow-up PR to ensure we only collect up to the limit across multiple drivers.
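To illustrate the intended behavior described above, here is a minimal sketch, not the actual LuceneSourceOperator code. It assumes segments can be modeled as plain arrays of doc IDs; the names `EarlyTerminationSketch`, `scan`, and `ScanResult` are hypothetical and exist only for this example. The point is that once the limit is reached, the remaining segments are never opened, rather than being scanned and having their results discarded.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyTerminationSketch {

    /** Outcome of a scan: the doc IDs collected and how many segments were opened. */
    static final class ScanResult {
        final List<Integer> collected;
        final int segmentsVisited;

        ScanResult(List<Integer> collected, int segmentsVisited) {
            this.collected = collected;
            this.segmentsVisited = segmentsVisited;
        }
    }

    /**
     * Collects at most {@code limit} doc IDs. Each int[] is a hypothetical
     * stand-in for one Lucene segment (an assumption for illustration).
     */
    static ScanResult scan(List<int[]> segments, int limit) {
        List<Integer> collected = new ArrayList<>();
        int segmentsVisited = 0;
        for (int[] segment : segments) {
            // Early termination: stop before opening another segment
            // once the limit has been reached.
            if (collected.size() >= limit) {
                break;
            }
            segmentsVisited++;
            for (int docId : segment) {
                if (collected.size() >= limit) {
                    break; // limit reached in the middle of a segment
                }
                collected.add(docId);
            }
        }
        return new ScanResult(collected, segmentsVisited);
    }

    public static void main(String[] args) {
        // Three segments of eight docs each.
        List<int[]> segments = new ArrayList<>();
        for (int s = 0; s < 3; s++) {
            int[] docs = new int[8];
            for (int i = 0; i < docs.length; i++) {
                docs[i] = s * 8 + i;
            }
            segments.add(docs);
        }
        ScanResult r = scan(segments, 10);
        System.out.println("collected=" + r.collected.size()
            + " segmentsVisited=" + r.segmentsVisited);
    }
}
```

With a limit of 10 and three segments of eight docs, the sketch opens only the first two segments. On warm or cold tiers, where opening a segment is expensive, skipping the rest is where the saving would come from.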
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
The LuceneSourceOperator is supposed to terminate when it reaches the limit; unfortunately, we don't have a test to cover this. Due to this bug, we continue scanning all segments, even though we discard the results as the limit was reached. This can cause performance issues for simple queries like FROM .. | LIMIT 10, when Lucene indices are on the warm or cold tier. I will submit a follow-up PR to ensure we only collect up to the limit across multiple drivers. (cherry picked from commit 4d2b8dc)
nik9000 left a comment
Thanks @dnhatn !