julian-elastic
Contributor

@julian-elastic julian-elastic commented Jul 18, 2025

Add support for Lookup Join on Multiple Fields

FROM index1
| LOOKUP JOIN lookup_index on field1, field2

Removed some checks to allow lookup join on multiple fields.
Added a new interface, LookupEnrichQueryGenerator, which exposes only the total number of queries and the query at a given position. AbstractLookupService does not need the remaining methods of QueryGenerator.

That allowed the creation of a new class, ExpressionQueryList, which implements LookupEnrichQueryGenerator and is responsible for creating the AND query across the different fields. We will likely need to enhance it in the future to support expressions that include OR and NOT as well.
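To illustrate the idea, here is a minimal, self-contained sketch of how per-field queries could be AND-ed together position by position. It is not the code from this PR: a query is modelled as a plain `Predicate` over a lookup row rather than a Lucene query, the method names `getPositionCount` and `getQuery` are assumed, and `FieldMatchList` is a hypothetical helper invented for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class LookupJoinSketch {

    /** Simplified stand-in: exposes only the number of queries and the query at a position. */
    interface LookupEnrichQueryGenerator {
        int getPositionCount();

        Predicate<Map<String, Object>> getQuery(int position);
    }

    /** Hypothetical helper: exact-match queries on one lookup field, one per input position. */
    record FieldMatchList(String field, List<Object> values) implements LookupEnrichQueryGenerator {
        @Override
        public int getPositionCount() {
            return values.size();
        }

        @Override
        public Predicate<Map<String, Object>> getQuery(int position) {
            Object wanted = values.get(position);
            return row -> wanted.equals(row.get(field));
        }
    }

    /** Combines the per-field generators so that every field must match (logical AND). */
    record ExpressionQueryList(List<LookupEnrichQueryGenerator> perField) implements LookupEnrichQueryGenerator {
        ExpressionQueryList {
            if (perField.isEmpty()) {
                throw new IllegalArgumentException("at least one match field is required");
            }
            int expected = perField.get(0).getPositionCount();
            for (LookupEnrichQueryGenerator generator : perField) {
                if (generator.getPositionCount() != expected) {
                    throw new IllegalArgumentException("all fields must have the same position count");
                }
            }
        }

        @Override
        public int getPositionCount() {
            return perField.get(0).getPositionCount();
        }

        @Override
        public Predicate<Map<String, Object>> getQuery(int position) {
            Predicate<Map<String, Object>> all = row -> true;
            for (LookupEnrichQueryGenerator generator : perField) {
                all = all.and(generator.getQuery(position));
            }
            return all;
        }
    }

    public static void main(String[] args) {
        LookupEnrichQueryGenerator queries = new ExpressionQueryList(List.of(
            new FieldMatchList("field1", List.of("a", "b")),
            new FieldMatchList("field2", List.of(1, 2))));
        Map<String, Object> row = Map.of("field1", "a", "field2", 1);
        System.out.println(queries.getQuery(0).test(row)); // true: both fields match
        System.out.println(queries.getQuery(1).test(row)); // false: field1 is "a", not "b"
    }
}
```

Keeping the combined list behind the same two-method interface means a caller only ever asks for a count and a query by position, so supporting OR and NOT later would only change how `getQuery` composes the parts.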

TransportRequest now carries List<MatchConfig> matchFields instead of String matchField. This is how we pass the match fields around now. If we are communicating with a cluster that does not support LookupOnMultipleFields and the query needs it, we fail the query. This can happen during a rolling upgrade or with CCS.
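The compatibility rule can be sketched roughly as follows. This is only an illustration under stated assumptions: the integer `MULTI_FIELD_LOOKUP_VERSION`, the one-component `MatchConfig` record, and the `fieldsForRemote` helper are all hypothetical and stand in for the real transport-version check and request serialization.

```java
import java.util.List;

final class MatchFieldsWireSketch {

    /** Hypothetical version from which a remote node understands a list of match fields. */
    static final int MULTI_FIELD_LOOKUP_VERSION = 2;

    /** Simplified stand-in for MatchConfig; only the field name is modelled here. */
    record MatchConfig(String fieldName) {}

    /**
     * Returns the field names to send to a remote node. A single-field join still works
     * against an older node; a multi-field join fails the query instead.
     */
    static List<String> fieldsForRemote(List<MatchConfig> matchFields, int remoteVersion) {
        if (remoteVersion >= MULTI_FIELD_LOOKUP_VERSION || matchFields.size() == 1) {
            return matchFields.stream().map(MatchConfig::fieldName).toList();
        }
        throw new IllegalStateException(
            "remote node does not support LOOKUP JOIN on multiple fields, got " + matchFields.size());
    }

    public static void main(String[] args) {
        List<MatchConfig> one = List.of(new MatchConfig("field1"));
        List<MatchConfig> two = List.of(new MatchConfig("field1"), new MatchConfig("field2"));
        System.out.println(fieldsForRemote(one, 1)); // [field1]
        System.out.println(fieldsForRemote(two, 2)); // [field1, field2]
        try {
            fieldsForRemote(two, 1);
        } catch (IllegalStateException e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```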

@julian-elastic julian-elastic added :Analytics/ES|QL AKA ESQL Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >enhancement labels Jul 18, 2025
@elasticsearchmachine
Collaborator

Hi @julian-elastic, I've created a changelog YAML for you.

@julian-elastic julian-elastic self-assigned this Jul 21, 2025
@julian-elastic julian-elastic changed the title Lookup Join on Multiple Columns POC WIP Lookup Join on Multiple Fields Jul 23, 2025
Contributor

github-actions bot commented Jul 23, 2025

🔍 Preview links for changed docs

@julian-elastic julian-elastic marked this pull request as ready for review July 23, 2025 20:22
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Contributor

@alex-spies alex-spies left a comment

Ok, I've now gone through everything except some of the compute-engine-only code. Looks good, but I think we can clean up the changed (transport) request a little more, as it now sends redundant data.

Contributor

We may want to declare this change as notable. I'm not sure what the bar is for that @leemthompo ?

Contributor

@leemthompo leemthompo Jul 31, 2025

If you and @tylerperk agree, it sounds like this passes the notable bar 😄 👍

Contributor Author

@leemthompo How do I declare this change as notable? Can you point me to an example? Or do you add it to some other release notes list after it is merged?

Contributor

@alex-spies by notable you mean adding the release highlight label I guess?

Contributor

@leemthompo I saw that we flag important release notes as notable, like here. Is this normally added via the release highlight label?

Both seem to make sense, so I'll go and mark this as release highlight and see what the bot does :)

Contributor

@leemthompo leemthompo Aug 12, 2025

> flag important release notes as notable

TIL 😄

Member

@nik9000 nik9000 left a comment

Looks good on my side. Left a few small things

Contributor

@leemthompo leemthompo left a comment

Docs updates LGTM, thanks Julian

@julian-elastic julian-elastic marked this pull request as draft July 31, 2025 16:34
@julian-elastic julian-elastic marked this pull request as ready for review August 1, 2025 14:22
Contributor

@alex-spies alex-spies left a comment

Super nice! Thanks for the iterations, I think this is a great PR @julian-elastic !

I have a final round of comments. I think the one about throwing the right exception should be addressed before merging; otherwise, please proceed at your own discretion.

@elasticsearchmachine
Collaborator

Hi @julian-elastic, I've updated the changelog YAML for you. Note that since this PR is labelled release highlight, you need to update the changelog YAML to fill out the extended information sections.

@julian-elastic julian-elastic merged commit a995a12 into elastic:main Aug 13, 2025
33 checks passed
@julian-elastic julian-elastic deleted the lookupJoin branch August 13, 2025 18:17
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 14, 2025
…-stats

* upstream/main: (36 commits)
  ESQL: Fix async operator warnings not always sent when blocking (elastic#132744)
  Method not needed anymore (elastic#132912)
  [Test] Excercise shutdown more reliably in snapshot stress IT (elastic#132909)
  Update Gradle shadow plugin to 9.0.1 (elastic#132637)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/410_named_queries/named_queries_with_score} elastic#132906
  Update docker.elastic.co/wolfi/chainguard-base-fips:latest Docker digest to fa6cb69 (elastic#132735)
  Remove unnecessary calls to fold()  (elastic#131870)
  Use consistent terminology for transport version resources/references (elastic#132882)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search.vectors/40_knn_search_cosine/kNN search only regular query} elastic#132890
  Finalize release notes for v9.1.2 release (elastic#132745)
  Finalize release notes for v9.0.5 release (elastic#132718)
  Move inner records out of TransportVersionUtils (elastic#132872)
  Add support for Lookup Join on Multiple Fields (elastic#131559)
  Bootstrap PR-based benchmarks (elastic#132717)
  Refactor MetadataIndexTemplateService to use template maps instead of project metadata (elastic#132662)
  [Gradle] Update nebula ospackage plugin to 12.1.0 (elastic#132640)
  Mute org.elasticsearch.xpack.esql.CsvTests test {csv-spec:ip.CdirMatchEqualsInsOrs} elastic#132860
  Mute org.elasticsearch.xpack.esql.CsvTests test {csv-spec:floats.InMultivalue} elastic#132859
  Revert "Reuse prod code and reduce EsqlSession public surface" (elastic#132843)
  Mute org.elasticsearch.xpack.esql.CsvTests test {csv-spec:string.LengthOfText} elastic#132857
  ...
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 15, 2025
* upstream/main: (278 commits)
  ESQL - dense vector support cosine normalization (elastic#132721)
  [ML] Add support for dimensions in google vertex ai request (elastic#132689)
  ESQL - Add  byte element support for dense_vector data type (elastic#131863)
  ESQL: Fix async operator warnings not always sent when blocking (elastic#132744)
  Method not needed anymore (elastic#132912)
  [Test] Excercise shutdown more reliably in snapshot stress IT (elastic#132909)
  Update Gradle shadow plugin to 9.0.1 (elastic#132637)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/410_named_queries/named_queries_with_score} elastic#132906
  Update docker.elastic.co/wolfi/chainguard-base-fips:latest Docker digest to fa6cb69 (elastic#132735)
  Remove unnecessary calls to fold()  (elastic#131870)
  Use consistent terminology for transport version resources/references (elastic#132882)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search.vectors/40_knn_search_cosine/kNN search only regular query} elastic#132890
  Finalize release notes for v9.1.2 release (elastic#132745)
  Finalize release notes for v9.0.5 release (elastic#132718)
  Move inner records out of TransportVersionUtils (elastic#132872)
  Add support for Lookup Join on Multiple Fields (elastic#131559)
  Bootstrap PR-based benchmarks (elastic#132717)
  Refactor MetadataIndexTemplateService to use template maps instead of project metadata (elastic#132662)
  [Gradle] Update nebula ospackage plugin to 12.1.0 (elastic#132640)
  Mute org.elasticsearch.xpack.esql.CsvTests test {csv-spec:ip.CdirMatchEqualsInsOrs} elastic#132860
  ...

Labels

:Analytics/ES|QL AKA ESQL >enhancement release highlight Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

6 participants