- 
                Notifications
    
You must be signed in to change notification settings  - Fork 25.6k
 
Disallow CCS with lookup join #120277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disallow CCS with lookup join #120277
Conversation
It is not possible to perform lookup joins with CC at the moment.
| 
           Pinging @elastic/es-analytical-engine (Team:Analytics)  | 
    
| if (join.left() instanceof EsRelation esr) { | ||
| for (var index : esr.index().concreteIndices()) { | ||
| if (index.indexOf(RemoteClusterAware.REMOTE_CLUSTER_INDEX_SEPARATOR) != -1) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd check both left() and right(), just to avoid the grammar being "fixed" and opening this possibility.
Also, there's StringUtils#splitQualifiedIndex()StringUtils#isQualified() that helps with this check, but arguably heavier and maybe not absolutely required.
Both notes as FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the right part of the join can be checked in LogicalPlanBuilder where the lookup is built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current state, I think I'd prefer adding a check right here like Bogdan suggests. But, we could have both for maximum paranoia!
- Check in LogicalPlanBuilder: currently redundant, as the grammar doesn't allow for 
:at all. I'll likely change my mind once we update the grammar. Because if we generalize the latter, a check in LogicalPlanBuilder would require only the parsed query; as such, it can prevent even sending a field caps request in case a user asks for... | LOOKUP JOIN remote:lu_idx .... - Check here, in the Analyzer: using the concrete indices is a good source of definitive truth, as the concrete indices cannot contain aliases.
 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not about paranoia, but about performance: fail fast without even checking the field_caps call(s) or resolve the clusters/indices and then fail in the Analyzer.
This needs a bit of research... there may be situations (at the Verifier level) where the behavior might change if we check things too early. For example, lookup join some_inexistent_remote_cluster:foo - do we complain that a remote lookup index is not supported or that some_inexistent_remote_cluster  doesn't actually exist and the query is incorrect because maybe the user had a typo?
| 
           Also, I would wait for @quux00 take on this because  In the Analyzer, where you are checking this, this information might already tell you if you can reject the query or not, but I am not sure.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @idegtiarenko , I think the PR's approach has a flaw - but one that is easily fixed. I left a comment below.
Additionally, I realize a lot may change once we update the grammar. Maybe it makes sense to come back to this once we use the proper grammar for the lookup index, and keep the PR open until then?
| 
               | 
          ||
| public void testInvalidRemoteLookupJoin() { | ||
| // TODO ES-10559 this should be replaced with a proper error message once grammar allows indexPattern as joinTarget | ||
| expectError("FROM my-index | LOOKUP JOIN remote:languages_lookup ON language_code", "line 1:35: token recognition error at: ':'"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to say that we should already start adding cases with quoting here, but realized that's kinda silly due to the grammar just not being there yet.
But maybe it makes sense to consider the associated issue blocked by ES-10559, just because we'll need to add more tests once the grammar is generalized. (In a follow-up PR)
I can think of
LOOKUP JOIN remote:"languages_lookup"LOOKUP JOIN "remote:languages_lookup"LOOKUP JOIN """remote:languages_lookup"""LOOKUP JOIN remote:"""languages_lookup"""
and I'm unsure if all of these actually make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am likely going to merge this after #120494 (that is changing the same classes). I will add above variations once resolving the conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #120494 adds nice options for randomization, we could also use randomization here.
Other test cases we could consider:
- multiple colons
 - colons and wildcards
 - colons and date math - especially because we want to enable date math later, and valid date math expressions themselves can contain colons
 - quoting (
"and""") of the remote name + 1 or multiple colons - just so we can later see how the error message changes from a token recognition error to a proper parsing exception once Ensure cluster string could be quoted #120355 lands. 
| if (join.left() instanceof EsRelation esr) { | ||
| for (var index : esr.index().concreteIndices()) { | ||
| if (index.indexOf(RemoteClusterAware.REMOTE_CLUSTER_INDEX_SEPARATOR) != -1) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current state, I think I'd prefer adding a check right here like Bogdan suggests. But, we could have both for maximum paranoia!
- Check in LogicalPlanBuilder: currently redundant, as the grammar doesn't allow for 
:at all. I'll likely change my mind once we update the grammar. Because if we generalize the latter, a check in LogicalPlanBuilder would require only the parsed query; as such, it can prevent even sending a field caps request in case a user asks for... | LOOKUP JOIN remote:lu_idx .... - Check here, in the Analyzer: using the concrete indices is a good source of definitive truth, as the concrete indices cannot contain aliases.
 
| } | ||
| 
               | 
          ||
| // joining with remote cluster is not supported yet | ||
| if (join.left() instanceof EsRelation esr) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too specific. I think it will not work in case of queries like
FROM remote:idx | EVAL x = 2*some_field | LOOKUP JOIN lu_idx ON key
Because the join's left child will not be an EsRelation. The EsRelation can be many levels down in the tree.
I'd approach this differently:
- We only need to check the plans leaves, and only those with class 
EsRelation. - For all such relations, if any of them has 
IndexMode.LOOKUP, a lookup join happens. - Then, all 
EsRelationsmustn't be remote. We can just iterate over allEsRelations and check their concrete indices. 
Since my suggestion is a more global approach, I'd place this in the Verifier as a separate verification step, rather than here in the Analyzer. Reason: there can be multiple LOOKUP JOINs and we'll be re-checking the EsRelation in the FROM clause every time.
We have tools to easily traverse query plans, collect nodes that are instanceof a certain class, and I think also getting only the plan's leaves. Take a look at the transformDown methods and similar methods in the same class. You can also take a peek at ReplaceMissingFieldWithNull which needs to scan the plan for lookups as well.
| VerificationException e = expectThrows( | ||
| VerificationException.class, | ||
| () -> analyze( | ||
| "FROM remote:my-index | LOOKUP JOIN languages_lookup ON language_code", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add test cases with multiple lookups, and test cases where more stuff happens between the FROM and the LOOKUP JOIN.
| 
           In response to @astefan 's comment: 
 We are now limiting the scope of  In terms of limiting JOINs in the face of a cross-cluster search, I think you have two high level options: 
  | 
    
This reverts commit 5f55608.
# Conflicts: # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
| 
           Hi @idegtiarenko, I've created a changelog YAML for you.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @idegtiarenko ! The production code LGTM: since you confirmed that aliases cannot point to remote clusters, we know that CCQ happens if and only if any of the index patterns contains a colon outside of a date math expression. And that's what RemoteClusterAware.isRemoteIndexName does.
I do have some more suggestions for testing, though, that I think should be addressed before merging.
        
          
                docs/changelog/120277.yaml
              
                Outdated
          
        
      | pr: 120277 | ||
| summary: Disallow CCS with lookup join | ||
| area: ES|QL | ||
| type: enhancement | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd declare this as non-issue and would remove the changelog. It's not really an enhancement, anyway :)
| @@ -264 +265 @@ | |||
| if (MetadataAttribute.isSupported(id) == false) { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: isn't this redundant, because below the p.forEachUp(UnresolvedRelation.class will walk over the join, too, and inspect the right hand side (which is an unresolved relation), anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below p only references the left side. Possibly it should be renamed to make it a bit more obvious.
We could create a join and execute .forEachUp on it instead.
We might use current approach if we believe distinct messages make sense for left and right side.
| 
               | 
          ||
| public void testInvalidRemoteLookupJoin() { | ||
| // TODO ES-10559 this should be replaced with a proper error message once grammar allows indexPattern as joinTarget | ||
| expectError("FROM my-index | LOOKUP JOIN remote:languages_lookup ON language_code", "line 1:35: token recognition error at: ':'"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #120494 adds nice options for randomization, we could also use randomization here.
Other test cases we could consider:
- multiple colons
 - colons and wildcards
 - colons and date math - especially because we want to enable date math later, and valid date math expressions themselves can contain colons
 - quoting (
"and""") of the remote name + 1 or multiple colons - just so we can later see how the error message changes from a token recognition error to a proper parsing exception once Ensure cluster string could be quoted #120355 lands. 
| } | ||
| 
               | 
          ||
| public void testRemoteLookupJoin() { | ||
| expectError("FROM remote:left | LOOKUP JOIN right ON field", "LOOKUP JOIN does not support remote cluster indices [remote:left]"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should try and randomize the remote:left  as much as possible. In that position, we absolutely allow all kinds of stuff, including date math and wildcard patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more test case suggestion: comma-separated list of index patterns, some of which are remote. E.g. FROM local,remote:pattern*,... with and without quoting.
| return p -> new LookupJoin(source, p, right, joinFields); | ||
| return p -> { | ||
| p.forEachUp(UnresolvedRelation.class, r -> { | ||
| if (RemoteClusterAware.isRemoteIndexName(r.table().index())) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think there is a bug here: in case of multiple indices separated by commas, isRemoteIndexName is not appropriate because it treats the string as a single index name.
I think commas cannot be used inside date math expressions, and should thus be a correct separator to use to split r.table().index() back into parts. (Failing this, we'd have to go back to the step where the individual index patterns are parsed - but I don't think it should be necessary, given that we also use r.table().index() for the field caps calls.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, can we have : in date math expression? Otherwise presents of : in concatenated string should indicate remote was used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can have : in date math. And our findings also were that commas are not used in date math, so you need to split by commas and then call isRemoteIndexName.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a buggy edge case:
FROM idx,"<idx-{now/d{yyyy.MM.dd|+18:00}}>" | LOOKUP JOIN right ON field
This does not use CCQ and should thus not cause a parsing exception. It fails due to us not splitting the individual index patterns before checking for CCQ patterns.
# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/LogicalPlanBuilder.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Ievgen! I think the randomIndexPattern helper starts to really shine in this one.
I have only optional, minor remarks that I think you should just address at your own discretion. Thanks for the iterations!
| var basePattern = randomIndexPatterns(without(CROSS_CLUSTER)); | ||
| var joinPattern = randomIndexPattern(without(WILDCARD_PATTERN), without(CROSS_CLUSTER)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: this looks nice.
| // remote cluster on the right | ||
| var joinPattern = randomIndexPattern(CROSS_CLUSTER, without(WILDCARD_PATTERN)); | ||
| expectError( | ||
| "FROM " + randomIndexPatterns() + " | LOOKUP JOIN " + joinPattern + " ON " + randomIdentifier(), | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: due to not excluding CCS in the FROM here and above, the test is actually slightly overconstrained, as an error message based on the usage of CCS in the FROM would also be correct.
That is to say, this test case passes because our implementation performs checks in a certain order, but ordered the other way around would also be correct.
| } | ||
| { | ||
| // remote cluster on the left | ||
| var fromPatterns = randomIndexPatterns(CROSS_CLUSTER, without(WILDCARD_PATTERN)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this would be more complete if the fromPatterns wouldn't exclude wildcard patterns. They should not affect the validation.
          💚 Backport successful
  | 
    
It is not possible to perform lookup joins with CC at the moment.
Closes: ES-10552