Skip to content

Conversation

@astefan
Copy link
Contributor

@astefan astefan commented Feb 11, 2025

First take at enabling inlinestats tests.

@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.0 labels Feb 11, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@astefan astefan requested a review from costin February 11, 2025 14:41
@elasticsearchmachine
Copy link
Collaborator

Hi @astefan, I've created a changelog YAML for you.


maxOfInt-Ignore
maxOfInt
required_capability: join_planning_v1
Copy link
Contributor

@alex-spies alex-spies Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we'll need a new capability to avoid running this with old nodes on bwc tests.

FROM airports
| RENAME scalerank AS int
| LOOKUP int_number_names ON int
| LOOKUP JOIN int_number_names ON int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is incorrect, this test is still disabled and used to use LOOKUP_ 🐔 in the past. (But we probably also want a version of this with LOOKUP JOIN.)

Comment on lines +182 to +184
if (join instanceof InlineJoin) {
return new FragmentExec(bp);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is odd. What do we need this for? If this is a quick hack, can we add a TODO comment?

I'd think this would prevent the if (right instanceof LocalSourceExec localData) { branch below from ever triggering, which used to be the branch that mapped a physical operation to INLINESTATS.

It seems like this, instead, pushes the execution of the inline join into the data nodes, which cannot always be true - e.g. there could be a regular STATS before the INLINESTATS, which requries execution on the data node. Or just a LIMIT/SORT, or any other pipeline breaker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the inlinestats has been rewritten to allow lookup join to work in a more correct way, what was left for inlinestats was an incomplete solution (inlinestats.csv-spec was ignored in its entirety). EsqlSession code flow expects a fragment for inlinestats that contains logical plans for its right and left branches. It's planning then the right branch, executes it, then it comes back to its left branch and uses the right results as constant blocks in the left logical plan tree.

There are still many queries muted in the inlinestats.csv-spec test and I don't want to go the rabbit hole and try to fix everything in one go. I want on purpose to limit the extent of this PR and I will look further into fixing the rest in followups. If the followups lead to changing the fragment approach, that's ok.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is much smaller than I expected. LGTM

* Fixes a series of issues with inlinestats which had an incomplete implementation after lookup and inlinestats
* were refactored.
*/
INLINESTATS_STUBRELATION_PROPER_REPLACEMENT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe pick a smaller name that we can version while iterating on the implementation, for example: INLINESTATS_V2 ?

@astefan astefan merged commit 4b3acd4 into elastic:main Feb 12, 2025
17 checks passed
astefan added a commit to astefan/elasticsearch that referenced this pull request Feb 13, 2025
(cherry picked from commit 4b3acd4)

Don't run the csv tests for inlinestats in non-snapshot tests (elastic#122407)
(cherry picked from commit f8fb3c3)
@astefan astefan added the v9.0.1 label Feb 13, 2025
elasticsearchmachine pushed a commit that referenced this pull request Feb 13, 2025
(cherry picked from commit 4b3acd4)

Don't run the csv tests for inlinestats in non-snapshot tests (#122407)
(cherry picked from commit f8fb3c3)
@astefan astefan deleted the inlinestats_pickup1 branch February 13, 2025 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants