Keep track of reverse-cardinality in Joins for optimizations (#413)
Modifies the cardinality setup for relational join nodes to include a notion of "reverse cardinality", i.e. the cardinality from the perspective of the RHS input with regard to the LHS input. For example, when the left-hand side of the join is the TPC-H `CUSTOMER` table and the right-hand side is the TPC-H `ORDERS` table, the cardinality is `PLURAL_FILTER` (since each customer can have 0, 1, or multiple matching orders) but the reverse cardinality is `SINGULAR_ACCESS` (since each order has exactly 1 matching customer). This reverse cardinality can be used for two things right away:
- Adjusting the partial aggregation splitting protocol to infer whether to push an aggregate into the _right_-hand side based on the reverse cardinality (e.g. if the reverse cardinality is filtering, don't push, because the join will reduce the number of rows on its own).
- Modifying the column pruning protocol for joins, which currently removes the RHS for certain kinds of joins if the RHS columns are unused and the cardinality is `SINGULAR_ACCESS`: we can now do the same to prune the LHS entirely if the reverse cardinality is `SINGULAR_ACCESS` (e.g. in the `CUSTOMER` -> `ORDERS` example, if it is an inner join and every column in `CUSTOMER` is unused, we can just prune the `CUSTOMER` side of the join entirely).
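
The two uses above can be illustrated with a minimal sketch. This is not the code from this commit: the `Cardinality` enum, the `Join` dataclass, and both helper functions are hypothetical names introduced for illustration, and only `SINGULAR_ACCESS` and `PLURAL_FILTER` are taken from the description; the other two enum members are assumed. The rules are simplified to the cases described above.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    # Only SINGULAR_ACCESS and PLURAL_FILTER appear in the description;
    # the other two members are assumptions made to complete the grid.
    SINGULAR_ACCESS = "singular_access"  # exactly one match per row
    SINGULAR_FILTER = "singular_filter"  # zero or one match per row
    PLURAL_ACCESS = "plural_access"      # one or more matches per row
    PLURAL_FILTER = "plural_filter"      # zero, one, or many matches per row

    @property
    def filters(self) -> bool:
        """True if some rows may have no match, so the join can drop rows."""
        return self in (Cardinality.SINGULAR_FILTER, Cardinality.PLURAL_FILTER)


@dataclass
class Join:
    """Hypothetical stand-in for a relational join node."""
    join_type: str                    # e.g. "inner"
    cardinality: Cardinality          # LHS rows with regard to the RHS
    reverse_cardinality: Cardinality  # RHS rows with regard to the LHS


def should_push_aggregate_into_rhs(join: Join) -> bool:
    # Simplified rule: if the reverse cardinality is filtering, the join
    # already reduces the number of RHS rows, so do not push the aggregate.
    return not join.reverse_cardinality.filters


def can_prune_lhs(join: Join, lhs_columns_used: bool) -> bool:
    # Simplified rule: in an inner join where every RHS row matches exactly
    # one LHS row, an LHS with no used columns cannot change the result.
    return (
        join.join_type == "inner"
        and not lhs_columns_used
        and join.reverse_cardinality is Cardinality.SINGULAR_ACCESS
    )


# CUSTOMER (LHS) joined to ORDERS (RHS), as in the example above.
customer_orders = Join(
    join_type="inner",
    cardinality=Cardinality.PLURAL_FILTER,
    reverse_cardinality=Cardinality.SINGULAR_ACCESS,
)
```

With these definitions, `can_prune_lhs(customer_orders, lhs_columns_used=False)` holds, which corresponds to dropping the `CUSTOMER` side when none of its columns are referenced.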