@mcheshkov mcheshkov commented Jan 29, 2025

Check List

  • Tests have been run in packages where changes made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

Description of Changes Made (if issue reference is not provided)

Add support for complex join conditions for grouped joins

DataFusion plans non-trivial joins (ones whose condition is not l.column = r.column) as Filter(CrossJoin(...)).
To support ungrouped-grouped joins with such conditions, SQL API needs to rewrite a logical plan of that shape into a WrappedSelect with the join inside. To do that, it needs to distinguish a plan that comes from a regular JOIN from an actual CROSS JOIN with a WHERE on top. This is done with the new JoinCheckStage: it starts on Filter(CrossJoin(wrapper, wrapper)), traverses all ANDs in the filter condition, checks that the "leaves" of the condition compare the two join sides, and pulls that fact up. After that, the regular join rewrite can start on the checked condition.
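
The check described above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Side`, `CondExpr`, the helper constructors, and the three functions are hypothetical names, not the types or rewrite rules used in cubesql or DataFusion, and the sketch assumes that every leaf under the ANDs must compare the left side with the right side.

```rust
// Hypothetical toy model of a filter condition over a cross join.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Side {
    Left,
    Right,
}

#[derive(Debug)]
#[allow(dead_code)] // column names and literal values are carried only for readability
enum CondExpr {
    Column(Side, &'static str),
    Literal(i64),
    Eq(Box<CondExpr>, Box<CondExpr>),
    And(Box<CondExpr>, Box<CondExpr>),
}

// Small constructors to keep example conditions short.
fn col(side: Side, name: &'static str) -> Box<CondExpr> {
    Box::new(CondExpr::Column(side, name))
}
fn lit(value: i64) -> Box<CondExpr> {
    Box::new(CondExpr::Literal(value))
}
fn eq(l: Box<CondExpr>, r: Box<CondExpr>) -> Box<CondExpr> {
    Box::new(CondExpr::Eq(l, r))
}
fn and(l: Box<CondExpr>, r: Box<CondExpr>) -> Box<CondExpr> {
    Box::new(CondExpr::And(l, r))
}

/// Returns (uses_left, uses_right) for the columns referenced by `expr`.
fn sides(expr: &CondExpr) -> (bool, bool) {
    match expr {
        CondExpr::Column(Side::Left, _) => (true, false),
        CondExpr::Column(Side::Right, _) => (false, true),
        CondExpr::Literal(_) => (false, false),
        CondExpr::Eq(l, r) | CondExpr::And(l, r) => {
            let (ll, lr) = sides(l);
            let (rl, rr) = sides(r);
            (ll || rl, lr || rr)
        }
    }
}

/// A leaf qualifies when one operand uses only the left side
/// and the other operand uses only the right side.
fn is_join_leaf(expr: &CondExpr) -> bool {
    match expr {
        CondExpr::Eq(l, r) => matches!(
            (sides(l), sides(r)),
            ((true, false), (false, true)) | ((false, true), (true, false))
        ),
        _ => false,
    }
}

/// Walks through all ANDs and requires every leaf to be a join leaf.
fn is_join_condition(expr: &CondExpr) -> bool {
    match expr {
        CondExpr::And(l, r) => is_join_condition(l) && is_join_condition(r),
        leaf => is_join_leaf(leaf),
    }
}
```

In this sketch, a condition such as `l.customer_id = r.id AND r.region = l.region` is accepted, while `l.customer_id = r.id AND l.status = 1` is rejected because its second leaf touches only one side and so looks like a WHERE filter rather than part of a join condition.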

Supporting changes:

  • Allow grouped join sides to have different in_projection flag

  • Allow non-push_to_cube WrappedSelect in grouped subquery position in join

  • Make zero members wrapper more expensive than filter member

  • Replace alias to cube during wrapper pull up

  • Wrap is_null expressions in parens, to avoid operator precedence issues
    Expression like `(foo IS NOT NULL = bar IS NOT NULL)` would try to compare `foo IS NOT NULL` with `bar`, not with `bar IS NOT NULL`
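
The last item can be illustrated with a tiny SQL printer. This is a minimal sketch, not the SQL generation code from this PR: `SqlExpr` and `to_sql` are hypothetical names, and the sketch assumes the simplest policy of always wrapping IS NULL / IS NOT NULL output in parentheses.

```rust
// Hypothetical mini expression type, used only for this illustration.
enum SqlExpr {
    Column(String),
    IsNull(Box<SqlExpr>),
    IsNotNull(Box<SqlExpr>),
    Eq(Box<SqlExpr>, Box<SqlExpr>),
}

/// Renders an expression as SQL text.
fn to_sql(expr: &SqlExpr) -> String {
    match expr {
        SqlExpr::Column(name) => name.clone(),
        // Always parenthesized, so the result stays a single operand
        // no matter which operator ends up around it.
        SqlExpr::IsNull(inner) => format!("({} IS NULL)", to_sql(inner)),
        SqlExpr::IsNotNull(inner) => format!("({} IS NOT NULL)", to_sql(inner)),
        SqlExpr::Eq(left, right) => format!("{} = {}", to_sql(left), to_sql(right)),
    }
}
```

With this policy, comparing the two null checks renders as `(foo IS NOT NULL) = (bar IS NOT NULL)`, so the parse no longer depends on the relative precedence of `=` and `IS NOT NULL` in the target database.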

codecov bot commented Jan 29, 2025

Codecov Report

Attention: Patch coverage is 97.98883% with 18 lines in your changes missing coverage. Please review.

Project coverage is 83.33%. Comparing base (9a73857) to head (4d2fa3f).
Report is 4 commits behind head on master.

Files with missing lines                                  Patch %   Lines
.../cubesql/src/compile/rewrite/rules/wrapper/join.rs     97.60%    12 Missing ⚠️
...t/cubesql/cubesql/src/compile/engine/df/wrapper.rs     57.14%    3 Missing ⚠️
...c/compile/rewrite/rules/wrapper/wrapper_pull_up.rs     95.31%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9157      +/-   ##
==========================================
+ Coverage   83.17%   83.33%   +0.16%     
==========================================
  Files         226      226              
  Lines       80096    80966     +870     
==========================================
+ Hits        66620    67475     +855     
- Misses      13476    13491      +15     
Flag      Coverage Δ
cubesql   83.33% <97.98%> (+0.16%) ⬆️

@mcheshkov mcheshkov marked this pull request as ready for review January 29, 2025 18:51
@mcheshkov mcheshkov requested review from a team as code owners January 29, 2025 18:51
@mcheshkov mcheshkov force-pushed the join-pushdown-complex-conditions branch from ae291e2 to db43682 Compare January 31, 2025 15:32
… issues

Expression like `(foo IS NOT NULL = bar IS NOT NULL)` would try to compare `foo IS NOT NULL` with `bar`, not with `bar IS NOT NULL`
* Support COALESCE + IS NOT NULL join condition
* Support IS NOT DISTINCT join condition
* Support expression on top of columns, like CAST
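
For context on why the first two condition shapes appear in join conditions at all: both are ways to write a null-safe equality. The sketch below models SQL values as `Option<i64>` to show the semantics. It is an assumption that the COALESCE + IS NOT NULL condition has the shape `COALESCE(a, d) = COALESCE(b, d) AND (a IS NOT NULL) = (b IS NOT NULL)`; the commit message does not spell it out. All function names here are hypothetical.

```rust
/// SQL `a = b` under three-valued logic: NULL (None) when either side is NULL.
fn sql_eq(a: Option<i64>, b: Option<i64>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// `a IS NOT DISTINCT FROM b`: never NULL, and two NULLs count as equal.
fn is_not_distinct(a: Option<i64>, b: Option<i64>) -> bool {
    a == b
}

/// Assumed COALESCE form:
/// COALESCE(a, d) = COALESCE(b, d) AND (a IS NOT NULL) = (b IS NOT NULL)
fn coalesce_form(a: Option<i64>, b: Option<i64>, default: i64) -> bool {
    a.unwrap_or(default) == b.unwrap_or(default) && a.is_some() == b.is_some()
}
```

Plain `=` yields NULL when either side is NULL, so such rows never match in a join. The other two forms agree with each other for every combination of NULL and non-NULL inputs, including the case where a non-NULL value happens to equal the COALESCE default.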
@mcheshkov mcheshkov force-pushed the join-pushdown-complex-conditions branch from db43682 to 4d2fa3f Compare February 4, 2025 11:42
@mcheshkov mcheshkov merged commit 28c1e3b into master Feb 4, 2025
72 checks passed
@mcheshkov mcheshkov deleted the join-pushdown-complex-conditions branch February 4, 2025 12:15