ORCA: allow different strategy control the redistribute key below aggregate #1192

Merged
jiaqizho merged 1 commit into apache:main from jiaqizho:orca-support-res-key-stragy
Aug 15, 2025
@jiaqizho jiaqizho force-pushed the orca-support-res-key-stragy branch from d5193a0 to 0e158c8 Compare June 27, 2025 01:35
@jiaqizho jiaqizho closed this Jun 27, 2025
@jiaqizho jiaqizho reopened this Jul 3, 2025
@jiaqizho jiaqizho force-pushed the orca-support-res-key-stragy branch from 0e158c8 to 43c05cf Compare July 3, 2025 03:29
@jiaqizho jiaqizho changed the title ORCA: allow different strategy control the redistribute key below agg… ORCA: allow different strategy control the redistribute key below aggregate Jul 3, 2025
@jiaqizho jiaqizho force-pushed the orca-support-res-key-stragy branch from 43c05cf to 185149c Compare August 6, 2025 06:49
@jiaqizho jiaqizho force-pushed the orca-support-res-key-stragy branch from 185149c to d1985bd Compare August 15, 2025 09:11

In CBDB, when an AGG operator (a one-step AGG or a final AGG operator)
requires data redistribution, the redistribute motion operator uses all
`GROUP BY` keys as the redistribution keys. In fact, redistributing on a single
key is sufficient: rows that agree on every `GROUP BY` key also agree on that
one key, so each group still lands on one segment and the AGG results are the
same.

Reducing the number of redistribution keys reduces the overhead of hash
function calls in the motion operator. However, it may lead to data skew.
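
The trade-off can be illustrated with a minimal sketch. This is not ORCA or executor code: `Row`, `Mix`, `SegmentFor`, and `SegmentsUsed` are hypothetical names, and the FNV-style mixing function is an assumption chosen only to keep the example deterministic. It models a `GROUP BY a, b` input and counts how many segments receive rows when the motion hashes only the first key versus both keys.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical row for "GROUP BY a, b": first = a, second = b.
using Row = std::pair<int, int>;

// FNV-1a style mixing step (an assumption, not the hash CBDB uses).
std::uint64_t Mix(std::uint64_t h, int v)
{
    return (h ^ static_cast<std::uint64_t>(v)) * 1099511628211ULL;
}

// Target segment for a row: hash either the first key only or both keys.
std::size_t SegmentFor(const Row& row, bool first_key_only,
                       std::size_t num_segments)
{
    assert(num_segments > 0);
    std::uint64_t h = Mix(14695981039346656037ULL, row.first);
    if (!first_key_only) {
        h = Mix(h, row.second);  // one extra hash call per additional key
    }
    return static_cast<std::size_t>(h % num_segments);
}

// Number of distinct segments that receive at least one row.
std::size_t SegmentsUsed(const std::vector<Row>& rows, bool first_key_only,
                         std::size_t num_segments)
{
    std::set<std::size_t> used;
    for (const Row& r : rows) {
        used.insert(SegmentFor(r, first_key_only, num_segments));
    }
    return used.size();
}
```

If every row shares the same value of `a`, hashing only `a` sends all rows to a single segment (skew), while hashing both keys spreads them out. In both cases all rows of one `(a, b)` group reach the same segment, so the aggregate result is unchanged.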

Therefore, this commit provides several strategies for choosing the
redistribution keys of the redistribute motion operator that sits under the
AGG operator. Users can select a strategy with the GUC
`optimizer_agg_pds_strategy`.

- OPTIMIZER_AGG_PDS_ALL_KEY (value: 0): the default; select all `GROUP BY` keys
  as the redistribution keys.
- OPTIMIZER_AGG_PDS_FIRST_KEY (value: 1): select the first `GROUP BY` key as the
  redistribution key.
- OPTIMIZER_AGG_PDS_MINIMAL_LEN_KEY (value: 2): select the `GROUP BY` key with
  the smallest positive typlen as the redistribution key. If only non-fixed
  types (such as text and varchar) exist, select the first `GROUP BY` key.
- OPTIMIZER_AGG_PDS_EXCLUDE_NON_FIXED (value: 3): select the `GROUP BY` keys
  with a fixed typlen as the redistribution keys. If only non-fixed types (such
  as text and varchar) exist, select the first `GROUP BY` key.
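
The four strategies can be sketched as a small selection function. This is a sketch under stated assumptions, not the code merged in this PR: `GroupKey` and `SelectRedistributeKeys` are hypothetical names, only the enum names and values come from the commit message, ties on the minimal typlen are assumed to resolve to the earliest key, and `EXCLUDE_NON_FIXED` is read here as keeping every fixed-length key.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Strategy names and values as listed in the commit message.
enum AggPdsStrategy {
    OPTIMIZER_AGG_PDS_ALL_KEY = 0,
    OPTIMIZER_AGG_PDS_FIRST_KEY = 1,
    OPTIMIZER_AGG_PDS_MINIMAL_LEN_KEY = 2,
    OPTIMIZER_AGG_PDS_EXCLUDE_NON_FIXED = 3
};

// Hypothetical stand-in for a GROUP BY column; not an ORCA type.
struct GroupKey {
    std::string name;
    int typlen;  // > 0: fixed length in bytes; <= 0: non-fixed (text, varchar)
};

// Pick the redistribution keys for the motion under the AGG operator.
std::vector<GroupKey> SelectRedistributeKeys(const std::vector<GroupKey>& keys,
                                             AggPdsStrategy strategy)
{
    assert(!keys.empty());
    switch (strategy) {
    case OPTIMIZER_AGG_PDS_FIRST_KEY:
        return {keys.front()};
    case OPTIMIZER_AGG_PDS_MINIMAL_LEN_KEY: {
        const GroupKey* best = nullptr;
        for (const GroupKey& k : keys) {
            // Strict '<' keeps the earliest key on ties (an assumption).
            if (k.typlen > 0 && (best == nullptr || k.typlen < best->typlen)) {
                best = &k;
            }
        }
        // Fall back to the first key when no fixed-length key exists.
        return {best != nullptr ? *best : keys.front()};
    }
    case OPTIMIZER_AGG_PDS_EXCLUDE_NON_FIXED: {
        std::vector<GroupKey> fixed;
        for (const GroupKey& k : keys) {
            if (k.typlen > 0) {
                fixed.push_back(k);
            }
        }
        if (fixed.empty()) {
            fixed.push_back(keys.front());  // same fallback as above
        }
        return fixed;
    }
    case OPTIMIZER_AGG_PDS_ALL_KEY:
    default:
        return keys;
    }
}
```

For a hypothetical key list `name` (text), `id` (8 bytes), `flag` (1 byte), `ts` (8 bytes), this sketch returns all four keys for strategy 0, `name` for strategy 1, `flag` for strategy 2, and `id`, `flag`, `ts` for strategy 3.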
@jiaqizho jiaqizho force-pushed the orca-support-res-key-stragy branch from d1985bd to 44c35bf Compare August 15, 2025 09:43
@my-ship-it my-ship-it left a comment

LGTM

@jiaqizho jiaqizho merged commit 60eb10d into apache:main Aug 15, 2025
26 checks passed
4 participants