@bpintea
Contributor

@bpintea bpintea commented Aug 27, 2025

...as this reduces the data the (inline) aggs run on, which is wrong.

Closes #133235

@bpintea bpintea requested review from alex-spies and astefan and removed request for astefan August 28, 2025 07:43
@bpintea bpintea marked this pull request as ready for review August 28, 2025 07:43
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 28, 2025
Contributor

@alex-spies alex-spies left a comment

LGTM, thanks @bpintea !

 }
 }
-} else if (limit.child() instanceof Join join && join.config().type() == JoinTypes.LEFT) {
+} else if (limit.child() instanceof Join join && join.config().type() == JoinTypes.LEFT && join instanceof InlineJoin == false) {
Contributor

++.

Maybe let's add a comment that we want to also apply this to InlineJoin, but that requires that the stub relation corresponds to a node somewhere inside the left branch, rather than the whole left branch. Or we need to get rid of the stub relation then and replace it by a copy of the left branch (with updated name ids, most likely, but that's a detail).

FROM employees
| KEEP emp_no, languages, gender
| INLINESTATS max_lang = MAX(languages) BY gender
| LIMIT 1
Contributor

LIMIT 1 is non-deterministic, so this might fail in multi-node tests such as serverless, no?

Maybe LIMIT 3 | SORT gender (-> F, M, null) as this would still surface the bug?

Contributor Author

The first row (sorted by emp_no) has a languages value of 2, so if the LIMIT (of 1) were pushed down below INLINESTATS, that would yield a max_lang of 2. Ideally we'd have a | SORT emp_no before INLINESTATS, but that doesn't currently work.
In any case, even if the data is shuffled, all records should have the same max_lang (5).
I've left a comment on the test.
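
For illustration, the effect described above can be reproduced with a small in-memory sketch. It is not the ES|QL engine: the `Row` and `Out` records, the four sample rows, and the `inlineStats` and `limit` helpers are all hypothetical stand-ins, chosen only so that the first row has `languages = 2` and the per-gender maximum is 5.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InlineStatsLimitSketch {
    // Hypothetical row shapes; this is NOT the real employees test dataset.
    record Row(int empNo, int languages, String gender) {}
    record Out(int empNo, int languages, String gender, int maxLang) {}

    static final List<Row> ROWS = List.of(
        new Row(10001, 2, "M"),
        new Row(10002, 5, "F"),
        new Row(10003, 4, "M"),
        new Row(10004, 5, "M")
    );

    // Emulates INLINESTATS max_lang = MAX(languages) BY gender:
    // aggregate per group, then attach the aggregate to every input row.
    static List<Out> inlineStats(List<Row> input) {
        Map<String, Integer> maxByGender = new HashMap<>();
        for (Row r : input) {
            maxByGender.merge(r.gender(), r.languages(), Math::max);
        }
        List<Out> out = new ArrayList<>();
        for (Row r : input) {
            out.add(new Out(r.empNo(), r.languages(), r.gender(), maxByGender.get(r.gender())));
        }
        return out;
    }

    // Emulates LIMIT n by keeping the first n rows.
    static <T> List<T> limit(List<T> input, int n) {
        return input.subList(0, Math.min(n, input.size()));
    }

    // Intended plan: aggregate over all rows, then limit the output.
    static int maxLangLimitAfter() {
        return limit(inlineStats(ROWS), 1).get(0).maxLang();
    }

    // Buggy plan: the limit is copied below the inline aggregation.
    static int maxLangLimitPushedDown() {
        return inlineStats(limit(ROWS, 1)).get(0).maxLang();
    }

    public static void main(String[] args) {
        System.out.println("LIMIT after INLINESTATS: max_lang = " + maxLangLimitAfter());
        System.out.println("LIMIT pushed below it:   max_lang = " + maxLangLimitPushedDown());
    }
}
```

With these sample rows the first variant reports 5 and the second reports 2, because the pushed-down limit leaves the aggregation only the first row to work on.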

Contributor

@astefan astefan left a comment

LGTM

 }
 }
-} else if (limit.child() instanceof Join join && join.config().type() == JoinTypes.LEFT) {
+} else if (limit.child() instanceof Join join && join.config().type() == JoinTypes.LEFT && join instanceof InlineJoin == false) {
Contributor

A one line comment here would help, imho.
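
As a rough sketch of what such a commented guard could look like in isolation: the `Join`, `InlineJoin`, and `JoinType` classes below are simplified, hypothetical stand-ins (not the real ES|QL plan nodes), the `mayCopyLimitBelow` helper is invented for illustration, and the comment wording is only a suggestion based on the PR description.

```java
public class LimitCopyGuardSketch {
    // Hypothetical stand-ins for the plan node types named in the diff.
    enum JoinType { LEFT, INNER }

    static class Join {
        final JoinType type;
        Join(JoinType type) { this.type = type; }
    }

    // Assumed here to be a LEFT join, since the original condition matched it.
    static class InlineJoin extends Join {
        InlineJoin() { super(JoinType.LEFT); }
    }

    static boolean mayCopyLimitBelow(Object child) {
        // Don't copy the LIMIT below an InlineJoin: that would reduce the data the inline aggs run on.
        return child instanceof Join join && join.type == JoinType.LEFT && join instanceof InlineJoin == false;
    }

    public static void main(String[] args) {
        System.out.println("plain LEFT join: " + mayCopyLimitBelow(new Join(JoinType.LEFT)));
        System.out.println("InlineJoin:      " + mayCopyLimitBelow(new InlineJoin()));
    }
}
```

The guard keeps the existing behaviour for ordinary LEFT joins and only excludes the inline variant.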

@bpintea
Contributor Author

bpintea commented Aug 29, 2025

Thanks Alex, Andrei.

@bpintea bpintea added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Aug 29, 2025
@elasticsearchmachine elasticsearchmachine merged commit 95c220f into elastic:main Aug 29, 2025
33 checks passed
@bpintea bpintea deleted the fix/no_limit_copy_on_inlinejoin branch August 29, 2025 14:40
JeremyDahlgren pushed a commit to JeremyDahlgren/elasticsearch that referenced this pull request Aug 29, 2025
...as this reduces the data the (inline) aggs run on, which is wrong.

Closes elastic#133235
astefan added a commit to astefan/elasticsearch that referenced this pull request Sep 2, 2025
astefan added a commit to astefan/elasticsearch that referenced this pull request Sep 2, 2025

Labels

:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

Development

Successfully merging this pull request may close these issues.

ESQL: INLINESTATS wrongly applies the after-LIMIT command to the right hand side agg

4 participants