Skip to content

LOOKUP JOIN fails with cryptic error when we DROP the fields looked up #118778

@craigtaverner

Description

@craigtaverner

A query that uses KEEP (or DROP) to drop the fields that are looked up will fail with a runtime error during LOOKUP.

ROW left = "left", client_ip = "172.21.0.5", right = "right"
| LOOKUP JOIN clientips_lookup ON client_ip
| KEEP left, client_ip, right

Failed with:

path: /_query, params: {format=txt, error_trace=true}, status: 500 java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 0
        at org.elasticsearch.compute.operator.lookup.MergePositionsOperator.<init>(MergePositionsOperator.java:76)
        at org.elasticsearch.xpack.esql.enrich.AbstractLookupService.doLookup(AbstractLookupService.java:329)
...

The problem is that the LOOKUP JOIN should return the env field, and dropping it causes the error, since all fields left after the KEEP actually come from the left index.

The current behaviour differs between main and the latest code in #118429.

  • main uses fieldcaps * for lookup indices, and so will resolve to lookup env even if it is later dropped by the KEEP. This is wrong, since we do a lot of work which we then drop
  • https://github.com/elastic/elasticsearch/pull/118429 instead does fieldcaps on what it sees the query as needing, which does not include any fields on the right. But instead of generating a validation error or erasing the LOOKUP JOIN entirely, it passes the lookup no fields down to the runtime, and that chokes with the cryptic error.

Both are incorrect.

Choices of behaviour:

  • Throw a validation error for LOOKUP JOIN that has output only from the left index.
  • Rewrite away LOOKUP JOIN if the JOIN does not actually achieve anything.
  • Allow LOOKUP JOIN on no output fields, but return as many rows as match the join key, but return no extra columns

This last option appears to be favoured as we move in the direction of returning increased cardinality instead of multivalue results. See #118781

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions