ORCA: create windows hash aggregation physical operator when vectorization enabled #1258
4714e80 to
3c9d812
Compare
Is it better to add the code below in explain.c?
3c9d812 to
8374618
Compare
No need, because the row executor won't get the …
8374618 to
4856e06
Compare
4856e06 to
3ad1cd4
Compare
my-ship-it
left a comment
LGTM in general except one comment
3ad1cd4 to
2ac1f69
Compare
2ac1f69 to
1b1675c
Compare
…xector

In this PR, ORCA now supports generating `WindowHashAgg` plans, which are already implemented in the vectorization executor. However, the CBDB row executor currently has no implementation of the `WindowHashAgg` operator. To prevent ORCA from generating this operator for the row executor, I've added a struct named `OptimizerOptions` that controls whether the plan targets the row executor or the vectorization executor. (By the way, ORCA may later generate plans specifically for the vectorization executor.)

The `WindowAgg` operator implementation in the vectorization executor is:

1. First, sort the input rows by the `ORDER BY` keys.
2. Then partition the rows by the `PARTITION BY` keys.
3. Finally, compute the window function.

Since step 1 must be globally sorted, it cannot be parallelized in the vectorization executor. This results in poor performance of the `WindowAgg` operator.

By contrast, `WindowHashAgg` employs a more efficient approach:

1. First, hash the input data into buckets based on the `PARTITION BY` keys.
2. Then sort the data within each bucket according to the `ORDER BY` keys.
3. Finally, compute the window functions on the sorted bucket data.

For the row engine, `WindowHashAgg` operators will not be generated.

This commit also introduces a new GUC named `optimizer_force_window_hash_agg` to force the generation of plans with `WindowHashAgg`. (Do not use this GUC except for debugging ORCA.)

Co-Author-By: zhangyue <[email protected]>
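
For reviewers who want to see the difference between the two strategies concretely, here is a minimal Python sketch. It is not ORCA or executor code: the row layout, the sample data, and the function names `window_agg_sort_first` and `window_hash_agg` are all hypothetical and exist only for illustration. It computes a running sum, roughly what `SUM(value) OVER (PARTITION BY part ORDER BY ord)` would return, once with a global sort first and once with per-bucket sorts.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical input rows: (partition_key, order_key, value).
ROWS = [
    ("b", 2, 10),
    ("a", 1, 5),
    ("b", 1, 7),
    ("a", 3, 1),
    ("a", 2, 4),
]


def running_sums(sorted_rows):
    """Running sum over rows that are already in ORDER BY order."""
    out, total = {}, 0
    for part, order, value in sorted_rows:
        total += value
        out[(part, order)] = total
    return out


def window_agg_sort_first(rows):
    """Sort-first strategy: one global sort, then partition, then compute."""
    ordered = sorted(rows, key=itemgetter(1))      # step 1: global sort by ORDER BY key
    partitions = defaultdict(list)
    for row in ordered:                            # step 2: split by PARTITION BY key
        partitions[row[0]].append(row)
    result = {}
    for part_rows in partitions.values():          # step 3: window function
        result.update(running_sums(part_rows))
    return result


def window_hash_agg(rows):
    """Hash-first strategy: bucket by partition key, sort each bucket on its own."""
    buckets = defaultdict(list)
    for row in rows:                               # step 1: hash on PARTITION BY key
        buckets[row[0]].append(row)
    result = {}
    for bucket in buckets.values():                # each bucket is independent work
        bucket.sort(key=itemgetter(1))             # step 2: sort within the bucket
        result.update(running_sums(bucket))        # step 3: window function
    return result
```

Both functions return the same mapping from `(partition_key, order_key)` to the running sum. The point of the hash-first variant is that every bucket can be sorted and evaluated independently, whereas the sort-first variant has to finish one sort over the whole input before any partition can be processed.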
1b1675c to
e9a0f5f
Compare
Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel
Impact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions