-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Support index pattern selector syntax in ES|QL #120660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick drive by comments - I'll let @fang-xing-esql drive this. Thanks!
...ugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlActionIT.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/IdentifierBuilder.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @jbaiera, thank you for adding this to ES|QL.
I've left a minor comment regarding exception wrapping.
Also, does testing ::
with a Security-enabled cluster have any particularities, or the selector should just work?
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/IdentifierBuilder.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for enabling selector in ES|QL
@jbaiera! I added something that I can think of so far, they are mainly around where do we expect the selector to work.
The index pattern is allowed in multiple commands/places in the ES|QL
statements, such as from
, metrics
, join
etc. and plus CCS, -
(exclusion), *
(wildcard), there is a good amount of combinations of these factors. It will be nice to have tests to cover some more combinations of different scenarios where selector is expected to work or not work, so that we have a better idea of the scope of the change.
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
Outdated
Show resolved
Hide resolved
|
||
indexPattern | ||
: (clusterString COLON)? indexString | ||
| {this.isDevVersion()}? (clusterString COLON)? indexString (CAST_OP selectorString)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indexPattern
is used in a few other places like joinCommand
and metricsCommand
etc. Do we expect the selector work in the other commands besides fromCommand
? If we don't want to support it under the other command, we can put the restrictions in LogicalPlanBuilder
, and it will be great to have negative/positive tests to cover the them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to say that this should work in the joinCommand
, but I don't know if there are any functional restrictions on that command that might bar the pattern's usage. I'm not sure I know what the metricsCommand
actually is and what the difference is between it and the fromCommand
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both joinCommand
and metricsCommand
are under snapshot today.
Here are something that I can think of that might affect ::
, that we need to validate, the best way to validate them is to test them.
*
is not supported on the right hand side of the join, however it is a valid::
option
var rightPattern = visitIndexPattern(List.of(target.index));
if (rightPattern.contains(WILDCARD)) {
throw new ParsingException(source(target), "invalid index pattern [{}], * is not allowed in LOOKUP JOIN", rightPattern);
}
- metrics is mainly for tsdb, I'm not quite sure if
::
is relevant to tsdb honestly, if it is not relevant we can error out early in parsing, otherwise we may also want to have tests to show how it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the reason that wildcards are banned from the join command because they resolve to multiple indices? I can say for sure that a selector of ::*
has a good chance of resolving to multiple indices. That said, putting any data stream name in the right hand side of the join operation will likely resolve to multiple backing indices, much like an alias pointing to multiple indices would. How do we handle aliases and data streams on the right hand side of the join today?
As for metrics and TSDB - the failure store doesn't surface any time series features like a time series data stream might. A time series data stream can have a failure store associated with it, but its failure store doesn't exhibit any of the TSDB functionality. If the expectation is that the metrics command is only ever used with TSDB, I don't know if it makes sense to allow selectors in it for the time being, since the only valid selectors are likely ::data
in that case.
...ugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlActionIT.java
Show resolved
Hide resolved
; | ||
|
||
selectorString | ||
: UNQUOTED_SOURCE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we allow quoting selector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to simply follow how the definition for the clusterString are laid out. I don't have too much of an opinion on quoting the selector or not. The selector contents can only ever be data
, failures
, or *
. I don't know what quoting them separately from the rest of the index pattern might buy us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jbaiera! I added some comments that I can think of so far.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/IdentifierBuilder.java
Outdated
Show resolved
Hide resolved
var joinPattern = randomIndexPattern(without(CROSS_CLUSTER), without(WILDCARD_PATTERN), without(INDEX_SELECTOR)); | ||
joinPattern = maybeQuote(joinPattern + "::garbage"); | ||
expectError( | ||
"FROM " + randomIndexPatterns() + " | JOIN " + joinPattern + " ON " + randomIdentifier(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message looks a bit unclear. I wonder if it is possible to get a message says that ""selector is not supported on the right hand side of a lookup join"? And why using without(INDEX_SELECTOR)
with maybeQuote(joinPattern + "::garbage")
, instead of just calling randomIndexPattern(without(CROSS_CLUSTER), without(WILDCARD_PATTERN), INDEX_SELECTOR)
? And it seems the left hand side allows both clusterString
and selectorString
, which may add some variations of the error message? If it does, perhaps test the left hand side and right hand side of a lookup join separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message looks a bit unclear. I wonder if it is possible to get a message says that ""selector is not supported on the right hand side of a lookup join"?
This seems to be something that is generated as part of the antlr parsing - I'm not sure how to override the error returned from antlr in this case with something more helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And why using without(INDEX_SELECTOR) with maybeQuote(joinPattern + "::garbage"), instead of just calling randomIndexPattern(without(CROSS_CLUSTER), without(WILDCARD_PATTERN), INDEX_SELECTOR)?
I'll fix that up - I think that's left behind from when I was validating that incorrect selectors are rejected from join operations. Now all selectors are rejected from join operations.
And it seems the left hand side allows both clusterString and selectorString, which may add some variations of the error message? If it does, perhaps test the left hand side and right hand side of a lookup join separate.
I'll try and capture a couple of different scenarios for the left side of the query to make sure it's failing with expected messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added additional cases and clean up with 6d5b049
if (EsqlCapabilities.Cap.INDEX_COMPONENT_SELECTORS.isEnabled()) { | ||
assertStringAsIndexPattern("foo::data", command + " foo::data"); | ||
assertStringAsIndexPattern("foo::failures", command + " foo::failures"); | ||
assertStringAsIndexPattern("cluster:foo::failures", command + " cluster:\"foo::failures\""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might miss something... should this fail, if selector does not work with remote cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the problem with the validation logic as laid out by #122497 and #122651 - Because the cluster is outside of the quotation marks we parse it as normal, but the content inside of the quotation marks is never parsed or checked because remote cluster resources are not validated locally.
validateClusterString(clusterString, c); | ||
// Do not allow selectors on remote cluster expressions until they are supported | ||
if (selectorString != null) { | ||
InvalidIndexNameException ie = new InvalidIndexNameException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like this code path will be exercised, EsqlBaseParser.g4
does not allow clusterString
coexist with selectorString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean it should be removed?
clusterAndIndexAsIndexPattern(command, "cluster*:*"); | ||
clusterAndIndexAsIndexPattern(command, "*:index*"); | ||
clusterAndIndexAsIndexPattern(command, "*:*"); | ||
if (EsqlCapabilities.Cap.INDEX_COMPONENT_SELECTORS.isEnabled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have tests with *
as selector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*
is no longer a supported selector value as of #121900
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
Show resolved
Hide resolved
...ugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlActionIT.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jbaiera, LGTM!
pattern = maybeQuote(cluster + ":" + pattern); | ||
} else if (EsqlCapabilities.Cap.INDEX_COMPONENT_SELECTORS.isEnabled() && canAdd(Features.INDEX_SELECTOR, features)) { | ||
var selector = ESTestCase.randomFrom(IndexComponentSelector.values()); | ||
pattern = maybeQuote(pattern + "::" + selector.getKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above in the thread, cross cluster is independent from index selectors.
Could you please convert if else
to if
?
This PR updates the ES|QL grammar to include the selector portion of an index pattern. Patterns are recombined before being sent to field caps. Field caps already supports this functionality, so this is primarily wiring it up where needed.
FROM_OPENING_BRACKET : OPENING_BRACKET -> type(OPENING_BRACKET); | ||
FROM_CLOSING_BRACKET : CLOSING_BRACKET -> type(CLOSING_BRACKET); | ||
FROM_COLON : COLON -> type(COLON); | ||
FROM_SELECTOR : {this.isDevVersion()}? CAST_OP -> type(CAST_OP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused. indexPattern
appears to be relying on CAST_OP directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser does use the cast op directly, but without this entry in the lexicon the parsing logic rejects the ::
character with an error like line 1:10: extraneous input ':' expecting {QUOTED_STRING, UNQUOTED_SOURCE}
* Support index pattern selector syntax in ES|QL (#120660) This PR updates the ES|QL grammar to include the selector portion of an index pattern. Patterns are recombined before being sent to field caps. Field caps already supports this functionality, so this is primarily wiring it up where needed. * Add validation step to quoted join query construction (#125731) * Add validation step to quoted join query construction * Fix double validation * Add lexer entry for METRICS commands * Simplifying test case * regen
This PR updates the ES|QL grammar to include the selector portion of an index pattern. Patterns are recombined before being sent to field caps. Field caps already supports this functionality, so this is primarily wiring it up where needed.
Selector syntax (#118614) was introduced as part of the work to support implementing failure stores in data streams. This is a new feature of index patterns which allows a user to specify which indices inside of a data stream should be used in an action. Selectors are denoted by using the
::
separator between a data stream name and the component of the data stream the user wants to target for an operation.To search a data stream's backing indices only, the
::data
selector is used:To search a data stream's failure indices only, the
::failures
selector is used:These selectors support being mixed with other index pattern syntax as well
By default, when a data stream has no selector specified, the
::data
selector is implied to maintain backwards compatibility with current search functionality. The::data
selector primarily exists as a way to explicitly select the backing indices, but is not required for normal usage.This PR updates the ES|QL grammar to include the selector portion of an index pattern. Patterns are recombined before being sent to field caps. Field caps already supports this functionality, so this is primarily wiring it up where needed.