Skip to content

Conversation

@ChrisHegarty
Copy link
Owner

@ChrisHegarty ChrisHegarty commented Feb 24, 2022

Prototype of Lightweight Logging API elastic#84478

ChrisHegarty pushed a commit that referenced this pull request Aug 9, 2023
Fixes elastic/elasticsearch-internal#497
Fixes ESQL-560

A query like `from test | sort data | limit 2 | project count` fails
because `LocalToGlobalLimitAndTopNExec` planning rule adds a collecting
`TopNExec` after last GATHER exchange, to perform last reduce, see

```
TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
\_ExchangeExec[GATHER,SINGLE_DISTRIBUTION]
  \_ProjectExec[[count{f}#4]]      // <- `data` is projected away but still used by the TopN node above
    \_FieldExtractExec[count{f}#4]
      \_TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
        \_FieldExtractExec[data{f}#6]
          \_ExchangeExec[REPARTITION,FIXED_ARBITRARY_DISTRIBUTION]
            \_EsQueryExec[test], query[][_doc_id{f}elastic#9, _segment_id{f}elastic#10, _shard_id{f}elastic#11]
```

Unfortunately, at that stage the inputs needed by the TopNExec could
have been projected away by a ProjectExec, so they could be no longer
available.

This PR adapts the plan as follows:
- add all the projections used by the `TopNExec` to the existing
`ProjectExec`, so that they are available when needed
- add another ProjectExec on top of the plan, to project away the
originally removed projections and preserve the query semantics


This approach is a bit dangerous, because it bypasses the mechanism of
input/output resolution and validation that happens on the logical plan.
The alternative would be to do this manipulation on the logical plan,
but it's probably hard to do, because there is no concept of Exchange at
that level.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants