@nik9000 (Member) commented Aug 5, 2025

```
| STATS FIRST(v BY @timestamp) BY hostname
```

Related to elastic#108385

@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine added the Team:Analytics label Aug 5, 2025

 functionExpression
-    : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP
+    : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP #functionStandard
+    | (FIRST | LAST) LP value=booleanExpression BY by=booleanExpression RP #functionFirstLast

@idegtiarenko (Contributor) commented Aug 6, 2025

Why are value and by boolean expressions? I would expect something like a column identifier.

Also, I am a bit worried about introducing custom syntax for each function.
I understand the workaround for FIRST and LAST; however, I am not sure we should also have BY inside it, as opposed to a simpler comma-separated list of parameters.
I am imagining something like:

functionExpression
    : name=(functionName | FIRST | LAST) LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP

I feel like #132469 could be a safer option
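
For reference, here is a minimal Python sketch of what either call shape, FIRST(v BY @timestamp) or FIRST(v, @timestamp), would presumably compute when grouped BY hostname. It is only an illustration, not the ES|QL implementation. It assumes that FIRST means "the value of v on the row with the smallest @timestamp in each group", that ties keep the first row seen, and that there are no nulls. The helper name first_by and the sample rows are hypothetical.

```python
def first_by(rows, value_key, order_key, group_key):
    """For each group, keep the value from the row with the smallest order_key.

    Assumed semantics only: ties keep the first row seen, nulls are not handled.
    """
    best = {}  # group -> (order value, selected value)
    for row in rows:
        group = row[group_key]
        candidate = (row[order_key], row[value_key])
        if group not in best or candidate[0] < best[group][0]:
            best[group] = candidate
    return {group: value for group, (_, value) in best.items()}


# Hypothetical sample data; timestamps are plain integers for brevity.
rows = [
    {"hostname": "a", "@timestamp": 3, "v": 30},
    {"hostname": "a", "@timestamp": 1, "v": 10},
    {"hostname": "b", "@timestamp": 2, "v": 20},
    {"hostname": "b", "@timestamp": 5, "v": 50},
]

result = first_by(rows, "v", "@timestamp", "hostname")
print(result)  # {'a': 10, 'b': 20}
```

Under these assumptions the ordering column is just a second argument, so the result does not depend on whether it is introduced with BY or with a comma.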

@alex-spies (Contributor) left a comment

I like this approach, except for the BY inside the function - see below.

 functionExpression
-    : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP
+    : functionName LP (ASTERISK | (booleanExpression (COMMA booleanExpression)* (COMMA mapExpression)?))? RP #functionStandard
+    | (FIRST | LAST) LP value=booleanExpression BY by=booleanExpression RP #functionFirstLast

I think it should work to specifically allow (FIRST | LAST) in place of the function name.

I don't think we should introduce the BY part, as it would change how our functions work: we normally don't have special keywords inside function arguments; that is normally reserved for operators.

I'd start with FIRST(v, @timestamp) to implement the function, and put the syntactic sugar (FIRST(v BY @timestamp)) into a separate PR, as we'd need to see how this fits into the language as a whole.

I think a straightforward way would be to adjust functionName below and keep everything else the same:

functionName
    : identifierOrParameter
    | keywordsAllowedAsFunctionNames
    ;

keywordsAllowedAsFunctionNames
    : FIRST
    | LAST
    ;
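
To make the idea concrete, here is a toy hand-written parser in Python that mirrors the rule above: the lexer still reserves FIRST, LAST and BY as keywords, but the function-name position accepts the allow-listed ones. This is a sketch and not the ANTLR-generated ES|QL parser. The keyword subset, the token shapes, and the names tokenize and parse_call are all assumptions made for illustration, and arguments are restricted to bare identifiers.

```python
import re

KEYWORDS = {"FIRST", "LAST", "BY"}             # assumed subset of reserved words
ALLOWED_AS_FUNCTION_NAMES = {"FIRST", "LAST"}  # mirrors keywordsAllowedAsFunctionNames

TOKEN_RE = re.compile(r"\s*(?:(@?[A-Za-z_][A-Za-z0-9_]*)|([(),]))")


def tokenize(text):
    """Split text into (kind, value) pairs, ending with an EOF marker."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        word, punct = match.groups()
        if punct:
            tokens.append((punct, punct))
        elif word.upper() in KEYWORDS:
            tokens.append(("KEYWORD", word.upper()))
        else:
            tokens.append(("IDENT", word))
        pos = match.end()
    tokens.append(("EOF", ""))
    return tokens


def parse_call(text):
    """Parse `name(arg, arg, ...)` and return (name, [args])."""
    tokens = tokenize(text)
    kind, name = tokens[0]
    # functionName: an identifier, or a keyword from the allow-list.
    if kind == "KEYWORD" and name not in ALLOWED_AS_FUNCTION_NAMES:
        raise SyntaxError(f"keyword {name!r} cannot be used as a function name")
    if kind not in ("IDENT", "KEYWORD"):
        raise SyntaxError("expected a function name")
    if tokens[1][0] != "(":
        raise SyntaxError("expected '('")
    args, i = [], 2
    if tokens[i][0] != ")":
        while True:
            arg_kind, value = tokens[i]
            if arg_kind != "IDENT":
                raise SyntaxError(f"expected an argument, got {value!r}")
            args.append(value)
            i += 1
            if tokens[i][0] == ",":
                i += 1
                continue
            break
    if tokens[i][0] != ")":
        raise SyntaxError("expected ')'")
    if tokens[i + 1][0] != "EOF":
        raise SyntaxError("unexpected input after ')'")
    return name, args


print(parse_call("first(v, @timestamp)"))  # ('FIRST', ['v', '@timestamp'])
print(parse_call("avg(v)"))                # ('avg', ['v'])
```

In this sketch, by(v) is rejected because BY is not on the allow-list, and first(v BY @timestamp) is rejected because only comma-separated arguments are accepted, which matches the suggestion to leave the BY sugar for a separate PR.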

Once FIRST and LAST are enabled (in SNAPSHOT), we should also test that they can be used as a query parameter (although keywords do not clash with parameters):

curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '
{
  "query": "row x = 1 | stats y = ?foo(x, @timestamp)", "params": [{"foo": {"identifier": "first"}}]
}'
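
A small Python sketch of the same check for both keywords could build the request bodies like this. It only constructs and prints the JSON and does not contact a cluster. The body shape is copied from the curl command above; the helper name build_query_body and the loop over "first" and "last" are assumptions added for illustration.

```python
import json


def build_query_body(function_name):
    """Build the _query body that passes a function name as an identifier parameter."""
    return {
        "query": "row x = 1 | stats y = ?foo(x, @timestamp)",
        "params": [{"foo": {"identifier": function_name}}],
    }


# One serialized body per keyword that should be usable as a function name.
bodies = {name: json.dumps(build_query_body(name)) for name in ("first", "last")}

for name, body in bodies.items():
    print(name, body)
```

Each printed body could then be sent with the curl invocation above in place of the inline JSON.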

@nik9000 closed this Aug 6, 2025
Labels

:Analytics/ES|QL (AKA ESQL), >refactoring, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v9.2.0
