ESQL: Block loader for pushing LENGTH #137217
Creates a `BlockLoader` for pushing the `LENGTH` function down into the loader for `keyword` fields. It takes advantage of the terms dictionary so we only need to calculate the code point count once per unique term loaded.

This `BlockLoader` implementation isn't plugged into the infrastructure for emitting it yet because we're waiting on the infrastructure we've started in elastic#137002. We'll make a follow-up PR to plug this in.

We're doing this mostly to demonstrate another function that we can push into field loading, in addition to the vector similarity functions we're building in elastic#137002. We don't expect `LENGTH` to be a super hot function. If it happens to be, then this'll help.

Before we plug this in we'll have to figure out how to emit warnings from functions that we've fused to field loading, because `LENGTH` can emit a warning, specifically when it hits a multivalued field.
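To illustrate the "once per unique term" idea, here is a minimal, self-contained sketch. It is not the `BlockLoader` from this PR and does not use the Lucene or Elasticsearch APIs. The class `LengthByOrdinalSketch`, its fields, and its methods are hypothetical names invented for the example. It assumes single-valued documents that each point at one ordinal in a terms dictionary of UTF-8 bytes, and it ignores the multivalue warning case entirely.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a terms dictionary; NOT the Lucene/Elasticsearch API.
final class LengthByOrdinalSketch {
    private final byte[][] termsByOrd; // unique UTF-8 terms, indexed by ordinal
    private final int[] cache;         // code point count per ordinal, -1 = not computed yet
    int computations = 0;              // how many times code points were actually counted

    LengthByOrdinalSketch(String... uniqueTerms) {
        termsByOrd = new byte[uniqueTerms.length][];
        for (int i = 0; i < uniqueTerms.length; i++) {
            termsByOrd[i] = uniqueTerms[i].getBytes(StandardCharsets.UTF_8);
        }
        cache = new int[uniqueTerms.length];
        Arrays.fill(cache, -1);
    }

    // Counts code points in valid UTF-8 by skipping continuation bytes (10xxxxxx).
    static int codePointCount(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    // Computes the length for an ordinal at most once, then serves it from the cache.
    int lengthForOrd(int ord) {
        if (cache[ord] < 0) {
            cache[ord] = codePointCount(termsByOrd[ord]);
            computations++;
        }
        return cache[ord];
    }

    // Maps each document's ordinal to the cached length of its term.
    int[] load(int[] docOrds) {
        int[] out = new int[docOrds.length];
        for (int i = 0; i < docOrds.length; i++) {
            out[i] = lengthForOrd(docOrds[i]);
        }
        return out;
    }
}
```

With many documents sharing few distinct terms, `computations` stays at the number of distinct ordinals touched, no matter how many documents are loaded.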
Pinging @elastic/es-analytical-engine (Team:Analytics)
dnhatn left a comment:
In cases with large indices and many ordinals - such as an index with 10M documents and 10K ordinals - it might be more efficient to look up ordinals in order. However, this isn't a big concern. This looks great. Thank you, Nik!
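One way to read "look up ordinals in order" is sketched below: collect the distinct ordinals in a batch, resolve them in ascending order, then map the results back to document order. This is an interpretation of the comment, not code from the PR. `SortedOrdLookupSketch` and the `lengthOfOrd` callback are hypothetical names, and the callback stands in for whatever per-ordinal terms dictionary lookup the real loader performs.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch; the callback is an assumed stand-in for a terms dictionary lookup.
final class SortedOrdLookupSketch {
    static int[] load(int[] docOrds, IntUnaryOperator lengthOfOrd) {
        // Sort a copy of the ordinals, then deduplicate in place.
        int[] sorted = docOrds.clone();
        Arrays.sort(sorted);
        int unique = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                sorted[unique++] = sorted[i];
            }
        }

        // Resolve each distinct ordinal exactly once, in ascending order.
        int[] lengths = new int[unique];
        for (int i = 0; i < unique; i++) {
            lengths[i] = lengthOfOrd.applyAsInt(sorted[i]);
        }

        // Map results back to the original document order.
        int[] out = new int[docOrds.length];
        for (int i = 0; i < docOrds.length; i++) {
            int pos = Arrays.binarySearch(sorted, 0, unique, docOrds[i]);
            out[i] = lengths[pos];
        }
        return out;
    }
}
```

The trade-off is an extra sort and a binary search per document in exchange for sequential dictionary access, which only pays off if in-order lookups are meaningfully cheaper than random ones.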
I think what you are saying is "this is fine, but the |