Skip to content

Conversation

bpintea
Copy link
Contributor

@bpintea bpintea commented Sep 25, 2025

Currently, in case the join key is (locally) null, the whole join is replaced with an Eval and a Project. The project however contain only reference attributes. This is insufficient, in case the attribute should point to another attribute pointing to a literal null added in the Eval.

This fixes that by allowing the generated Project to contain NamedExpressions, which can then be proper Aliases.

Currently, in case (locally) the join key is null, the whole join is
replaced with an Eval and a Project. The project however contain only
reference attributes. This is insufficient, in case the attribute should
point to another attribute pointing to a literal null added in the Eval.

This fixes that by allowing the generated Project to contain
NamedExpressions, which can then be proper Aliases.
@bpintea bpintea added >bug auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v9.2.0 v8.20.0 labels Sep 25, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

@bpintea bpintea marked this pull request as ready for review September 25, 2025 17:01
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 25, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

bpintea added a commit to bpintea/elasticsearch that referenced this pull request Sep 25, 2025
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
For someone like me (who takes a PR for a spin with a sample query and dissects its internals and results) I would like to also see a concrete query in csv-spec file. You shared one with me offline and I think it wouldn't hurt to also have that one bundled in the PR, as well.
from languages_lookup_non_unique_key | eval language_code = null::integer | lookup join languages_lookup_non_unique_key on language_code

@astefan
Copy link
Contributor

astefan commented Sep 26, 2025

Actually... (sorry about that), I had an idea for a test and it turns out we don't have a test for that and with this PR it has a wrong behavior:
from languages_lookup_non_unique_key | keep country, language_name | eval language_code = null::integer | inline stats max(language_code) by language_code

Before this PR, the result is:

        country         | language_name |max(language_code)| language_code 
------------------------+---------------+------------------+---------------
Canada                  |English        |null              |null           
null                    |English        |null              |null           
United Kingdom          |null           |null              |null           
...

with this PR, the order of the columns changes (and it's wrong):

        country         | language_name | language_code |max(language_code)
------------------------+---------------+---------------+------------------
Canada                  |English        |null           |null              
null                    |English        |null           |null              
United Kingdom          |null           |null           |null              
...

@bpintea bpintea added the v9.2.1 label Oct 6, 2025
@bpintea
Copy link
Contributor Author

bpintea commented Oct 7, 2025

I had an idea for a test and it turns out we don't have a test for that and with this PR it has a wrong behavior

Indeed, due to the updated method overriding, the InlineJoin ended up using the wrong method. Thanks for catching it, @astefan! I've updated that and added in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make insure that the country field is part of the languages_lookup_non_unique_key schema. Otherwise the CSV tests generate an index resolution without it, which leads to failed queries when using the index in FROM.
The index was previously using similar languages schema, which lacks the fields, but which was added in dynamically by the data. Also, the index had only been used in LOOKUP JOIN before.
The change then also leads to updated order of the MV value ["Germany", "Austria"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.6 v9.2.1 v9.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants