Skip to content

Conversation

@bpintea
Copy link
Contributor

@bpintea bpintea commented Aug 4, 2025

Add optimisation rule to pull OrderBy above InlineJoin. This is required in queries since otherwise the sort won't be moved out of the left hand-side of the join, ending up stand-alone node that can't be turned into an executable -- it'll be rejected by verification already as an "unbounded sort". Since the InlineJoin is sort agnostic, the OrderBy can be moved upwards past it, ending up as a top TopN.

Related #124715 (#113727)

@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

;

// TODO: fails with: java.lang.AssertionError: expected no concrete indices without data node plan
sortBeforeInlinestats3-Ignore
Copy link
Contributor Author

@bpintea bpintea Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not PR specific, the following equivalent (SORT "manually" pulled up) fails in main:

FROM employees
| SORT languages DESC
| EVAL salaryK = salary/1000
| INLINESTATS count = COUNT(*) BY salaryK
| INLINESTATS min = MIN(MV_COUNT(languages)) BY salaryK
| SORT emp_no
| KEEP emp_no, still_hired, count
| LIMIT 5

Will follow up separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix in #132992 (will enable once that's merged).

@bpintea bpintea marked this pull request as ready for review August 12, 2025 15:32
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 12, 2025
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Left some minor remarks only.

10003 |61805.0 |4 |M
10004 |46272.5 |5 |M
10005 |63528.0 |1 |M
emp_no:integer| avg:double |languages:integer|gender:keyword
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

);
}

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if these tests wouldn't be better placed in a separate Tests class under org.elasticsearch.xpack.esql.optimizer.rules.logical package. We are trying hard to keep this Tests file smaller. Since these tests in this PR are strictly related to a new rule, do you think it makes sense to have them in a separate file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll break this class (and others) in a few more, in a dedicated PR to follow.

Comment on lines +6356 to +6365
* EsqlProject[[emp_no{f}#11, avg{r}#5, languages{f}#14, gender{f}#13]]
* \_TopN[[Order[emp_no{f}#11,ASC,LAST]],5[INTEGER]]
* \_InlineJoin[LEFT,[languages{f}#14],[languages{f}#14],[languages{r}#14]]
* |_EsRelation[test][_meta_field{f}#17, emp_no{f}#11, first_name{f}#12, ..]
* \_Project[[avg{r}#5, languages{f}#14]]
* \_Eval[[$$SUM$avg$0{r$}#22 / $$COUNT$avg$1{r$}#23 AS avg#5]]
* \_Aggregate[[languages{f}#14],[SUM(salary{f}#16,true[BOOLEAN]) AS $$SUM$avg$0#22, COUNT(salary{f}#16,true[BOOLEAN]) AS
* $$COUNT$avg$1#23, languages{f}#14]]
* \_StubRelation[[_meta_field{f}#17, emp_no{f}#11, first_name{f}#12, gender{f}#13, hire_date{f}#18, job{f}#19, job.raw{f}#20,
* languages{f}#14, last_name{f}#15, long_noidx{f}#21, salary{f}#16]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we're taking the right approach here.

For LOOKUP JOIN, we push down limits via duplication, rather than pulling up order by.

Pushing down limits has the advantage that a lot fewer rows need to be joined on. Without pushing down, the limit becomes less meaningful, because it turns into a "run the query on the whole data, and then throw most of it away" at the last moment.

The problem is, of course, that StubRelation stands for the inline join's whole left hand side, and this prevents duplicating limits past the inline join as it would also include the limit in the computed stats.

I think this can be solved by making it more explicit which logical plan the stubrelation stands for, actually. We could add a StubRelationTarget plan node that identifies the target of the stub relation via an id. Then we could have

     * EsqlProject[[emp_no{f}#11, avg{r}#5, languages{f}#14, gender{f}#13]]
     * \_Limit[5]
     *   \_InlineJoin[LEFT,[languages{f}#14],[languages{f}#14],[languages{r}#14]]
     *     |_TopN[[Order[emp_no{f}#11,ASC,LAST]],5[INTEGER]]
     *     |  \_StubRelationTarget[#1]
     *     |    \_EsRelation[test][_meta_field{f}#17, emp_no{f}#11, first_name{f}#12, ..]
     *     \_Project[[avg{r}#5, languages{f}#14]]
     *       \_Eval[[$$SUM$avg$0{r$}#22 / $$COUNT$avg$1{r$}#23 AS avg#5]]
     *         \_Aggregate[[languages{f}#14],[SUM(salary{f}#16,true[BOOLEAN]) AS $$SUM$avg$0#22, COUNT(salary{f}#16,true[BOOLEAN]) AS
     *              $$COUNT$avg$1#23, languages{f}#14]]
     *           \_StubRelation[target=#1][[_meta_field{f}#17, emp_no{f}#11, first_name{f}#12, gender{f}#13, hire_date{f}#18, job{f}#19, job.raw{f}#20,
     *                  languages{f}#14, last_name{f}#15, long_noidx{f}#21, salary{f}#16]]

In particular, the stub relation target can be ignored during optimization of the second phase of the inline stats, pushing the topn 5 all the way to Lucene.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another approach is to let go of the StubRelation and explicitly replace it by the plan that the StubRelation is standing in for. That would be "honest" in the sense that we admit that the left and right hand sides of the joins need different optimizations and can be computed separately, sometimes.

Implementation-wise, it means that the InlineJoin needs to be replaced by a simpler left join node that will be planned as hash join.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. Thank you @alex-spies for digging into it.
I've created an issue for this, as a followup; it needs some thinking.
#133120

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more complete: for LOOKUP JOIN, we do two things:

  • duplicate limits past the join
  • prune redundant sorts

OrderBys are generally pushed down, not pulled up. I'm not sure if this will be trouble sometimes.

In general, I think we should be doing the same as LOOKUP JOIN if possible.

Copy link
Contributor Author

@bpintea bpintea Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a stub for parts of the InlineJoin might be the better approach, but it might come with some extra complexity. We'd need to change the execution model we have now, which this PR took as an invariant.
And it's not just the TopN that could benefit from an alternative planning, the other big win would be having the Filter pushed down (eventually, both on the left hand-side optimisation & execution only). But "currently" both Limit and Filter won't be pushed past an InlineJoin.

IMO, unless we change the execution plan for InlineJoin before considering the feature available, we'll have to have a solution that pulls the OrderBy up: since Limit will be neither pushed down, nor duplicated, the OrderBy will either be executed twice (theoretically), or (actually) just prevent valid user queries from being run.

So @alex-spies your remark for a better alternative is valid, but it doesn't target this PR, but rather how we execute InlineJoin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or (actually) just prevent valid user queries from being run.

Maybe this is acceptable for tech preview as long as the error message tells users they should not have a sort before inline stats.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed only the rule and glanced at the tests.

Long term we want to solve this differently - individually optimizing the left and right hand side of the join. Let's leave a comment to #133120.

For now I think this is fine as it enables more valid queries. Thanks @bpintea !


/**
* Pulls "up" an {@link OrderBy} node that is not preceded by a not {@link SortAgnostic} node (such as {@link Limit}), but is preceded by an
* {@link InlineJoin}. The InlineJoin is {@link SortAgnostic}, so the OrderBy can be pulled up without affecting the semantics of the join.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is only true if the join has no duplicate matches. Which is true for INLINE STATS, but not generally true for InlineJoin. I think it's fine, but I'd make a big fat disclaimer on the rule.

Our sorts are not stable, so pulling up the orderby can indeed change the output. For LOOKUP JOIN we don't give guarantees about the order of the output, at all, but when we were discussing the corresponding problem for LOOKUP JOIN, the stability of the sort was the reason why we discarded pulling up orderbys (and pushing down limits made this less necessary).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is true for INLINE STATS, but not generally true for InlineJoin.

The Join (which InlineJoin extends) implements itself SortAgnostic. If that's generally not true, we should correct that? CC @luigidellaquila

Our sorts are not stable, so pulling up the orderby can indeed change the output.

I guess that's true in conjunction with a limit (so much more an implicit one). But that's taken into account by the rules we have.
Is there a case where this is true "unbounded"? I think the instability just propagates, the output might change (within the sort's contract), but irrespective of the position of the sort (close to the source or to the output)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, SortAgnostic's meaning is:

If there's a SORT after me, separated at most by other sort agnostic nodes, then I can remove a SORT before me (again separated at most by other sort agnostic nodes).

This is not what it's being used for, here, which is actually "If I get sorted input, it's okay to instead sort it later - if I'm only separated from the sort by other agnostic nodes" That's a strictly stronger statement, and any such node should automatically be SortAgnostic, but not the other way around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what we're doing here is actually changing the meaning of SortAgnostic, and we're closing the open question #121884 (comment). Because LOOKUP JOIN is sort agnostic, we're now choosing to jumble up a sorted lookup join's output (when it is between a sort and inline stats). Same for MV_EXPAND. It's already documented that we may do so! But here we finally take the decision to go this route.

I suggest that we make that explicit, and make the meaning of SortAgnostic: "I can be pushed down past sorts as long as I don't shadow/change any of the sort fields". We should consequently exclude Aggregate, OrderBy and TopN from SortAgnostic. This also solves the (incidental) reliance of this rule on a previous run of PruneRedundantOrderBy. I don't know if the meaning is still correct for InferencePlans, so we should check that, too.

The alternative, that would conserve the current semantics, would be to have 2 separate interfaces: SortAgnostic, unchanged, and additionally PushablePastOrderBy, which this rule really requires. Not sure it's worth maintaining both interfaces, though. In particular, it would mean that SORT | LOOKUP JOIN | INLINE STATS | WHERE remains non-executable (and same for MV_EXPAND).

@luigidellaquila , what do you think?

Copy link
Contributor Author

@bpintea bpintea Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the definition, @alex-spies.

If there's a SORT after me, separated at most by other sort agnostic nodes, then I can remove a SORT before me (again separated at most by other sort agnostic nodes).

This SortAgnostic contract is a bit hard to understand and respect, IMO. And the examples there could be amended: the definition you give above doesn't hold for Aggregate, for instance, as a preceding SORT can be dropped, irrespective of what follows it, be that other SORTs or SortAgnostic commands.

Then, SORT | COMMAND | SORT pattern needs not be analysed as such: IMO we're mostly interested if we can get to SORT | SORT, theoretically: it's in this pattern that we can say with certainty that the first SORT is redundant and what we're after is the ability to get to it; i.e. know if a COMMAND in between is agnostic to what SORT does, if its processing/output stays the same irrespective of how the input data is ordered.

Edit:

Limit is not sort agnostic because it's output changes, depending on how the data is sorted. I don't need to know what follows it (like another SORT).

Similarly STATS should probably not be agnostic - though for other reason - and we should just analyse the SORT | SORT_AGNOSTIC_COMMAND+ | STATS pattern and prune SORT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that we make that explicit, and make the meaning of SortAgnostic: "I can be pushed down past sorts as long as I don't shadow/change any of the sort fields". We should consequently exclude Aggregate, OrderBy and TopN from SortAgnostic.

I agree.

Copy link
Member

@fang-xing-esql fang-xing-esql left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @bpintea!

Just by staring at the logical plan optimizer tests, one day if the filter can be pushed down into each side of the inline join, there could be a lot of performance improvements potentially, but that's out of the scope of this PR.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya, noticed a bug and wanted to raise an observation before moving forward - see my remarks below.

orderByHolder.set(orderBy);
breakEarly.set(true);
} else {
breakEarly.set(node instanceof SortAgnostic == false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya, since Aggregate is SortAgnostic, we won't break early in

SORT x | STATS avg(y) BY x | INLINESTATS count_distinct(z)

; if it wasn't for PruneRedundantOrderBy, we'd be creating unexecutable queries some times by pulling the SORT into a position where it might not be able to be executed, instead of simply pruning it.

This is not ideal, and at least we should add a comment both here and in the logical plan optimizer's rule list that this has to run after PruneRedundantOrderBy.


/**
* Pulls "up" an {@link OrderBy} node that is not preceded by a not {@link SortAgnostic} node (such as {@link Limit}), but is preceded by an
* {@link InlineJoin}. The InlineJoin is {@link SortAgnostic}, so the OrderBy can be pulled up without affecting the semantics of the join.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, SortAgnostic's meaning is:

If there's a SORT after me, separated at most by other sort agnostic nodes, then I can remove a SORT before me (again separated at most by other sort agnostic nodes).

This is not what it's being used for, here, which is actually "If I get sorted input, it's okay to instead sort it later - if I'm only separated from the sort by other agnostic nodes" That's a strictly stronger statement, and any such node should automatically be SortAgnostic, but not the other way around.

});
OrderBy orderBy = orderByHolder.get();
if (orderBy != null) {
return orderBy.replaceChild(inlineJoin.transformUp(OrderBy.class, ob -> ob == orderBy ? orderBy.child() : ob));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spotted a bug: we don't check if the attributes that we sort on are still present after pulling up.

from test | sort x | drop x | inlinestats z = count_distinct(y) | where z == 1

Found 1 problem\nline 1:13: Plan [TopN[[Order[x{f}#172,ASC,LAST]],1000[INTEGER]]] optimized incorrectly due to missing references [x{f}#172]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good (and "obvioius", [only] in hindsight) find. Thanks, @alex-spies!
Which emphasises that we might need more ample logic here, to keep the queries executable with the current model. Some examples:

FROM employees
| EVAL s1 = salary + 1
| SORT s1
| WHERE s1 > 50000 // this needs to be fed into INLINESTATS
| EVAL ls = languages::string
| DROP s1
| INLINESTATS cd = COUNT_DISTINCT(ls)
| WHERE emp_no < 10006
| KEEP emp_no, ls, cd

should become the equivalent of:

FROM employees
| EVAL s1 = salary + 1
| WHERE s1 > 50000
| EVAL ls = languages::string
| INLINESTATS cd = COUNT_DISTINCT(ls)
| SORT s1 // this sort needs to "push" the DROP along
| DROP s1
| WHERE emp_no < 10006
| KEEP emp_no, ls, cd

And similarly, with shadowing:

FROM employees
| EVAL s1 = salary + 1
| SORT s1
| WHERE s1 > 50000
| EVAL ls = languages::string
| EVAL s1 = salary // s1 is shadowed, not dropped
| INLINESTATS cd = COUNT_DISTINCT(ls)
| WHERE emp_no < 10006
| KEEP emp_no, ls, cd, s1

should become the equivalent of:

FROM employees
| EVAL s1 = salary + 1
| WHERE s1 > 50000
| EVAL ls = languages::string
| INLINESTATS cd = COUNT_DISTINCT(ls)
| SORT s1
| EVAL s1 = salary // this EVAL needs to be "pushed" by SORT along
| WHERE emp_no < 10006
| KEEP emp_no, ls, cd, s1

This might become a bit too involved, we might need to take an alternative approach, possibly following the proposal below.

At this point, we might want to add this sort of queries as a limitation, but how the user would need to rewrite the query for them to become executable might not be as obvious (see above). CC @astefan.


/**
* Pulls "up" an {@link OrderBy} node that is not preceded by a not {@link SortAgnostic} node (such as {@link Limit}), but is preceded by an
* {@link InlineJoin}. The InlineJoin is {@link SortAgnostic}, so the OrderBy can be pulled up without affecting the semantics of the join.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what we're doing here is actually changing the meaning of SortAgnostic, and we're closing the open question #121884 (comment). Because LOOKUP JOIN is sort agnostic, we're now choosing to jumble up a sorted lookup join's output (when it is between a sort and inline stats). Same for MV_EXPAND. It's already documented that we may do so! But here we finally take the decision to go this route.

I suggest that we make that explicit, and make the meaning of SortAgnostic: "I can be pushed down past sorts as long as I don't shadow/change any of the sort fields". We should consequently exclude Aggregate, OrderBy and TopN from SortAgnostic. This also solves the (incidental) reliance of this rule on a previous run of PruneRedundantOrderBy. I don't know if the meaning is still correct for InferencePlans, so we should check that, too.

The alternative, that would conserve the current semantics, would be to have 2 separate interfaces: SortAgnostic, unchanged, and additionally PushablePastOrderBy, which this rule really requires. Not sure it's worth maintaining both interfaces, though. In particular, it would mean that SORT | LOOKUP JOIN | INLINE STATS | WHERE remains non-executable (and same for MV_EXPAND).

@luigidellaquila , what do you think?

@bpintea
Copy link
Contributor Author

bpintea commented Sep 9, 2025

#132417 (comment) shows that we should go a different route (#133120).

@bpintea bpintea closed this Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants