-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Add syntax for FIRST and LAST #132474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
``` | STATS FIRST(v BY @timestamp) BY hostname ``` Related to elastic#108385
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
| functionExpression | ||
| : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP | ||
| : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP #functionStandard | ||
| | (FIRST | LAST) LP value=booleanExpression BY by=booleanExpression RP #functionFirstLast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why value and by are boolean expressions? I would expect something about column identifier.
Also I am a bit worried about making a custom syntax for each function.
I understand the workaround for FIRST and LAST, however I am not sure if we should also have BY in it opposed to a simpler coma separated list of parameters.
I am imagining something like:
functionExpression
: name=(functionName | FIRST | LAST) LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP
I feel like #132469 could be a safer option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach, except for the BY inside the function - see below.
| functionExpression | ||
| : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP | ||
| : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP #functionStandard | ||
| | (FIRST | LAST) LP value=booleanExpression BY by=booleanExpression RP #functionFirstLast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should work that we specifically allow (FIRST | LAST) in place of the function name.
I think we shouldn't introduce the BY part as it would change how our functions work; we normally don't have special keywords inside function arguments - normally that is reserved for operators.
I'd start with FIRST(v, @timestamp) to implement the function, and would put syntactic sugar (FIRST(v BY @timestamp)) into a separate PR, as we'd need to see how this fits into the language as a whole.
I think a straight-forward way would be to adjust functionName below and keep everything else the same:
functionName
: identifierOrParameter
| keywordsAllowedAsFunctionNames
;
keywordsAllowedAsFunctionNames:
FIRST
| LAST
;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once FIRST and LAST get enabled (in SNAPSHOT), we also best test that they can be properly used as a query parameter (although keywords do not clash with parameters):
curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '
{
"query": "row x = 1 | stats y = ?foo(x, @timestamp)", "params": [{"foo": {"identifier": "first"}}]
}'
Related to #108385