Skip to content
Merged
5 changes: 5 additions & 0 deletions docs/changelog/120355.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 120355
summary: Ensure cluster string could be quoted
area: ES|QL
type: enhancement
issues: []
1 change: 1 addition & 0 deletions x-pack/plugin/esql/src/main/antlr/EsqlBaseParser.g4
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ indexPattern

clusterString
: UNQUOTED_SOURCE
| QUOTED_STRING
;
Comment on lines 144 to 147
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they are the same, point clusterString to indexString:

clusterString
  : indexString
;

We could fully remove it but it's worth keeping the element in for future changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to avoid that. clusterString should not be equivalent to indexString.
For example clusterString should not allow : and :: from upcoming selector changes.
I imagine some day we might need to reflect that in grammar.


indexString
Expand Down

Large diffs are not rendered by default.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
import org.elasticsearch.cluster.metadata.MetadataCreateIndexService;
import org.elasticsearch.common.Strings;
import org.elasticsearch.indices.InvalidIndexNameException;
import org.elasticsearch.transport.RemoteClusterService;
import org.elasticsearch.xpack.esql.core.util.Holder;
import org.elasticsearch.xpack.esql.parser.EsqlBaseParser.IdentifierContext;
import org.elasticsearch.xpack.esql.parser.EsqlBaseParser.IndexStringContext;
Expand Down Expand Up @@ -51,29 +52,51 @@ protected static String quoteIdString(String unquotedString) {
return "`" + unquotedString.replace("`", "``") + "`";
}

@Override
public String visitClusterString(EsqlBaseParser.ClusterStringContext ctx) {
if (ctx == null) {
return null;
} else if (ctx.UNQUOTED_SOURCE() != null) {
return ctx.UNQUOTED_SOURCE().getText();
} else {
return unquote(ctx.QUOTED_STRING().getText());
}
Comment on lines +57 to +63
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment in the grammar - this method can then be either remove or delegate to visitIndexString.

}

@Override
public String visitIndexString(IndexStringContext ctx) {
TerminalNode n = ctx.UNQUOTED_SOURCE();
return n != null ? n.getText() : unquote(ctx.QUOTED_STRING().getText());
if (ctx.UNQUOTED_SOURCE() != null) {
return ctx.UNQUOTED_SOURCE().getText();
} else {
return unquote(ctx.QUOTED_STRING().getText());
}
}

public String visitIndexPattern(List<EsqlBaseParser.IndexPatternContext> ctx) {
List<String> patterns = new ArrayList<>(ctx.size());
Holder<Boolean> hasSeenStar = new Holder<>(false);
ctx.forEach(c -> {
String indexPattern = visitIndexString(c.indexString());
String clusterString = c.clusterString() != null ? c.clusterString().getText() : null;
String clusterString = visitClusterString(c.clusterString());
// skip validating index on remote cluster, because the behavior of remote cluster is not consistent with local cluster
// For example, invalid#index is an invalid index name, however FROM *:invalid#index does not return an error
if (clusterString == null) {
hasSeenStar.set(indexPattern.contains(WILDCARD) || hasSeenStar.get());
validateIndexPattern(indexPattern, c, hasSeenStar.get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, that's slightly out of scope but I just realized that the validation for the index pattern is lacking. For instance, you can use FROM "foo:bar:baz" and you'll not even get an error message.

What is a bit more in scope: at least when the cluster string is not null, we should probably validate that the index pattern is not a remote pattern. This applies to cases like FROM "remote":"index:pattern".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried locally, with your branch:

$ curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '
{
  "query": "from cluster_one:\"remote:index\""
}'
  <no-fields>  
---------------

$ curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '
{
  "query": "from cluster_one:remote:index"   
}'
{"error":{"root_cause":[{"type":"parsing_exception","reason":"line 1:24: mismatched input ':' expecting {<EOF>, '|', ',', 'metadata'}"}],"type":"parsing_exception","reason":"line 1:24: mismatched input ':' expecting {<EOF>, '|', ',', 'metadata'}","caused_by":{"type":"input_mismatch_exception","reason":null}},"status":400}%

$ curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '
{
  "query": "from \"cluster_one:remote:index\""
}'
  <no-fields>  
---------------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@idegtiarenko if you feel like it, I think we could add some validation for this as we're just touching this anyway. Otherwise, let's put what we found into an issue because that'll need to be fixed one day, anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this a bit with @quux00 and @idegtiarenko . This problem pre-existed:

  • non-null clusterString with invalid indexString in remote:"invalid:index"
  • generally invalid indexString in "invalid:remote:index"

More generally, CCQ validation has larger scope as it also needs to take into account licensing and security.

Thus, this needs a follow-up that does this properly. I opened this issue: #121901

} else {
validateClusterString(clusterString, c);
}
patterns.add(clusterString != null ? clusterString + REMOTE_CLUSTER_INDEX_SEPARATOR + indexPattern : indexPattern);
});
return Strings.collectionToDelimitedString(patterns, ",");
}

protected static void validateClusterString(String clusterString, EsqlBaseParser.IndexPatternContext ctx) {
if (clusterString.indexOf(RemoteClusterService.REMOTE_CLUSTER_INDEX_SEPARATOR) != -1) {
throw new ParsingException(source(ctx), "cluster string [{}] must not contain ':'", clusterString);
}
}

private static void validateIndexPattern(String indexPattern, EsqlBaseParser.IndexPatternContext ctx, boolean hasSeenStar) {
// multiple index names can be in the same double quote, e.g. indexPattern = "idx1, *, -idx2"
String[] indices = indexPattern.split(",");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -300,18 +300,18 @@ public void testStatsWithoutGroups() {
);
}

public void testStatsWithoutAggs() throws Exception {
public void testStatsWithoutAggs() {
assertEquals(
new Aggregate(EMPTY, PROCESSING_CMD_INPUT, Aggregate.AggregateType.STANDARD, List.of(attribute("a")), List.of(attribute("a"))),
processingCommand("stats by a")
);
}

public void testStatsWithoutAggsOrGroup() throws Exception {
public void testStatsWithoutAggsOrGroup() {
expectError("from text | stats", "At least one aggregation or grouping expression required in [stats]");
}

public void testAggsWithGroupKeyAsAgg() throws Exception {
public void testAggsWithGroupKeyAsAgg() {
var queries = new String[] { """
row a = 1, b = 2
| stats a by a
Expand All @@ -332,7 +332,7 @@ public void testAggsWithGroupKeyAsAgg() throws Exception {
}
}

public void testStatsWithGroupKeyAndAggFilter() throws Exception {
public void testStatsWithGroupKeyAndAggFilter() {
var a = attribute("a");
var f = new UnresolvedFunction(EMPTY, "min", DEFAULT, List.of(a));
var filter = new Alias(EMPTY, "min(a) where a > 1", new FilteredExpression(EMPTY, f, new GreaterThan(EMPTY, a, integer(1))));
Expand All @@ -342,7 +342,7 @@ public void testStatsWithGroupKeyAndAggFilter() throws Exception {
);
}

public void testStatsWithGroupKeyAndMixedAggAndFilter() throws Exception {
public void testStatsWithGroupKeyAndMixedAggAndFilter() {
var a = attribute("a");
var min = new UnresolvedFunction(EMPTY, "min", DEFAULT, List.of(a));
var max = new UnresolvedFunction(EMPTY, "max", DEFAULT, List.of(a));
Expand Down Expand Up @@ -377,7 +377,7 @@ public void testStatsWithGroupKeyAndMixedAggAndFilter() throws Exception {
);
}

public void testStatsWithoutGroupKeyMixedAggAndFilter() throws Exception {
public void testStatsWithoutGroupKeyMixedAggAndFilter() {
var a = attribute("a");
var f = new UnresolvedFunction(EMPTY, "min", DEFAULT, List.of(a));
var filter = new Alias(EMPTY, "min(a) where a > 1", new FilteredExpression(EMPTY, f, new GreaterThan(EMPTY, a, integer(1))));
Expand Down Expand Up @@ -615,6 +615,60 @@ private void clustersAndIndices(String command, String indexString1, String inde
);
}

public void testValidFromIndexPattern() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has some overlap with testStringAsIndexPattern in the same file.

I think this test has the following drawbacks:

  • Only one kind of quoting (", but there's also still """).
  • None of the index patterns actually require quoting. Looking at testStringAsIndexPattern, there's e.g. date math stuff that can make quoting required.

var patterns = randomList(1, 5, () -> {
String pattern = randomIndexIdentifier();// index or alias
if (randomBoolean()) {// pattern
pattern += "*";
}
if (randomBoolean()) {// quoted
pattern = "\"" + pattern + "\"";
}
if (randomBoolean()) {// remote cluster
var cluster = randomIdentifier();
if (randomBoolean()) {// quoted
cluster = "\"" + cluster + "\"";
}
pattern = cluster + ":" + pattern;
}
if (pattern.contains(":") && pattern.contains("\"") == false) {// quote entire "cluster:index"
pattern = "\"" + pattern + "\"";
}
return pattern;
});

var plan = statement("FROM " + String.join(",", patterns));
var expected = String.join(",", patterns).replace("\"", "");

assertThat(plan, instanceOf(UnresolvedRelation.class));
assertThat(((UnresolvedRelation) plan).table().index(), equalTo(expected));
}

private static String randomIndexIdentifier() {
// https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params
var validFirstCharacters = "abcdefghijklmnopqrstuvwxyz0123456789!'$^&";
var validCharacters = validFirstCharacters + "+-_.";

var index = new StringBuilder();
if (randomInt(9) == 0) {// hidden index
index.append('.');
}
index.append(randomCharacterFrom(validFirstCharacters));
for (int i = 0; i < randomIntBetween(1, 100); i++) {
index.append(randomCharacterFrom(validCharacters));
}
return index.toString();
}

private static char randomCharacterFrom(String str) {
return str.charAt(randomInt(str.length() - 1));
}

public void testInvalidFromIndexPattern() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to testInvalidCharacterInIndexPattern in the same file.

expectError("FROM \"remote:\":index", "line 1:6: cluster string [remote:] must not contain ':'");
expectError("FROM \"remote:invalid\":index", "line 1:6: cluster string [remote:invalid] must not contain ':'");
}

public void testInvalidQuotingAsFromIndexPattern() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should also add tests for invalid quoting of the remote name itself for good measure.

expectError("FROM \"foo", ": token recognition error at: '\"foo'");
expectError("FROM \"foo | LIMIT 1", ": token recognition error at: '\"foo | LIMIT 1'");
Expand Down Expand Up @@ -2060,41 +2114,41 @@ private void assertStringAsLookupIndexPattern(String string, String statement) {
assertThat(tableName.fold(FoldContext.small()), equalTo(string));
}

public void testIdPatternUnquoted() throws Exception {
public void testIdPatternUnquoted() {
var string = "regularString";
assertThat(breakIntoFragments(string), contains(string));
}

public void testIdPatternQuoted() throws Exception {
public void testIdPatternQuoted() {
var string = "`escaped string`";
assertThat(breakIntoFragments(string), contains(string));
}

public void testIdPatternQuotedWithDoubleBackticks() throws Exception {
public void testIdPatternQuotedWithDoubleBackticks() {
var string = "`escaped``string`";
assertThat(breakIntoFragments(string), contains(string));
}

public void testIdPatternUnquotedAndQuoted() throws Exception {
public void testIdPatternUnquotedAndQuoted() {
var string = "this`is`a`mix`of`ids`";
assertThat(breakIntoFragments(string), contains("this", "`is`", "a", "`mix`", "of", "`ids`"));
}

public void testIdPatternQuotedTraling() throws Exception {
public void testIdPatternQuotedTrailing() {
var string = "`foo`*";
assertThat(breakIntoFragments(string), contains("`foo`", "*"));
}

public void testIdPatternWithDoubleQuotedStrings() throws Exception {
public void testIdPatternWithDoubleQuotedStrings() {
var string = "`this``is`a`quoted `` string``with`backticks";
assertThat(breakIntoFragments(string), contains("`this``is`", "a", "`quoted `` string``with`", "backticks"));
}

public void testSpaceNotAllowedInIdPattern() throws Exception {
public void testSpaceNotAllowedInIdPattern() {
expectError("ROW a = 1| RENAME a AS this is `not okay`", "mismatched input 'is' expecting {<EOF>, '|', ',', '.'}");
}

public void testSpaceNotAllowedInIdPatternKeep() throws Exception {
public void testSpaceNotAllowedInIdPatternKeep() {
expectError("ROW a = 1, b = 1| KEEP a b", "extraneous input 'b'");
}

Expand Down