Enable Projection Pushdown Optimization for Recursive CTEs (#16696)
* Add column pruning support for RecursiveQuery operator in optimizer
- Extend optimize_projections to handle LogicalPlan::RecursiveQuery by applying
projection pushdown to its inputs, improving query performance.
- Add integration test `recursive_query_column_pruning` verifying plan shows
correct projection pruning for recursive CTEs.
- Implement `create_cte_work_table` in test context provider to support CTE tests.
- Add .github/copilot-instructions.md and AGENTS.md docs with Rust coding,
linting, formatting, and contribution guidelines to maintain quality and
consistency in generated and user code.
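As a rough illustration of what column pruning through a recursive query means, here is a minimal sketch on a toy plan type. `ToyPlan`, `prune`, and `scanned_columns` are hypothetical names invented for this example and are not DataFusion APIs; the sketch also ignores joins and filters, which in a real plan would add their own required columns.

```rust
use std::collections::BTreeSet;

/// Toy plan node for illustration only; this is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum ToyPlan {
    /// Reads the listed columns from `table`.
    TableScan { table: String, projection: Vec<String> },
    /// A recursive CTE with a static (anchor) term and a recursive term.
    RecursiveQuery { static_term: Box<ToyPlan>, recursive_term: Box<ToyPlan> },
}

/// Push the set of required columns down to every scan, visiting both
/// inputs of a recursive query instead of stopping at it.
fn prune(plan: ToyPlan, required: &BTreeSet<String>) -> ToyPlan {
    match plan {
        ToyPlan::TableScan { table, projection } => ToyPlan::TableScan {
            table,
            // Keep only the columns that something above actually needs.
            projection: projection
                .into_iter()
                .filter(|c| required.contains(c))
                .collect(),
        },
        ToyPlan::RecursiveQuery { static_term, recursive_term } => ToyPlan::RecursiveQuery {
            static_term: Box::new(prune(*static_term, required)),
            recursive_term: Box::new(prune(*recursive_term, required)),
        },
    }
}

/// Collect `(table, projected columns)` for every scan, in plan order.
fn scanned_columns(plan: &ToyPlan) -> Vec<(String, Vec<String>)> {
    match plan {
        ToyPlan::TableScan { table, projection } => {
            vec![(table.clone(), projection.clone())]
        }
        ToyPlan::RecursiveQuery { static_term, recursive_term } => {
            let mut out = scanned_columns(static_term);
            out.extend(scanned_columns(recursive_term));
            out
        }
    }
}
```

If only one column is required above the recursive query, both scans in this toy plan end up reading just that column, which is the kind of change the updated `cte.slt` plans are meant to show.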
* Amend cte results
* Add recursive_cte.rs integration test and improve recursive CTE projection pushdown in cte.slt
- Add a new async test `recursive_cte_alias_instability` covering a complex recursive CTE query to the DataFusion core SQL tests, validating recursive CTE alias handling and query stability.
- Enhance existing sqllogictest file `cte.slt` by adding column projections to recursive CTE TableScans, improving plan efficiency and consistency.
- Fix projection pushdown for recursive CTEs in logical and physical plans for nodes, balances, recursive_cte, and numbers test cases.
- This addresses alias instability and projection inefficiencies previously observed in recursive CTE handling and improves test coverage for recursive SQL features.
* Remove redundant handling of RecursiveQuery in optimize_projections function
* main cte.slt
* Add tests for recursive CTE projection pushdown scenarios
* Remove unused create_cte_work_table function from MyContextProvider implementation
* Refactor TableScan projections in recursive query tests for clarity
* consolidate recursive cte tests
* Remove test that is included in slt
* Enhance optimization for recursive queries by adding checks for problematic structures
* Refactor recursive CTE tests to improve projection pushdown validation and add new test for alias instability
* refactor: rename function to clarify purpose of subquery alias detection in recursive queries
* test: add comments to clarify purpose of subquery alias handling in recursive CTE tests
* Refactor `plan_contains_subquery_alias` to count subquery aliases instead of returning a bool
- Changed `plan_contains_subquery_alias` to count occurrences of `SubqueryAlias` nodes in the plan.
- Added helper function `count_subquery_aliases` to recursively update the count.
- Updated logic to return true if there are two or more subquery aliases, preserving original behavior in `recursive_cte_alias_instability` test.
* Optimize subquery alias counting by early termination
- Updated `plan_contains_subquery_alias` to short-circuit counting once threshold (2) is reached.
- Modified `count_subquery_aliases` to accept a threshold and return early when count meets or exceeds it.
- Improves performance by avoiding unnecessary traversal of the entire plan.
- Added integration test snapshot for recursive query column pruning reflecting these changes.
* Refactor count_subquery_aliases to return count instead of using mutable reference
- Changed `count_subquery_aliases` to take the running count as a `usize` and return the updated count, removing the mutable reference.
- Added early-exit when count reaches threshold to avoid unnecessary traversal.
- Updated is_projection_unnecessary to use new count_subquery_aliases signature.
- Removed obsolete integration test snapshot file.
- Fixed recursive CTE plan and physical plan in optimizer integration and sqllogictest to include projection pruning on TableScans inside recursive queries.
This improves the clarity, readability, and efficiency of subquery alias counting in logical plans, and fixes projection pruning for recursive queries as per the related issue.
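The counting scheme described in the steps above (pass the running count by value, return the updated count, stop once the threshold of 2 is reached) can be sketched roughly as follows. `AliasPlan` is a hypothetical toy tree used only for this example; the real function operates on DataFusion's logical plan, and its exact signature is not reproduced here.

```rust
/// Toy plan tree for illustration only; NOT DataFusion's `LogicalPlan`.
enum AliasPlan {
    TableScan,
    SubqueryAlias(Box<AliasPlan>),
    Projection(Box<AliasPlan>),
    Union(Vec<AliasPlan>),
}

impl AliasPlan {
    /// Direct children of this node.
    fn inputs(&self) -> Vec<&AliasPlan> {
        match self {
            AliasPlan::TableScan => vec![],
            AliasPlan::SubqueryAlias(input) | AliasPlan::Projection(input) => {
                vec![input.as_ref()]
            }
            AliasPlan::Union(inputs) => inputs.iter().collect(),
        }
    }
}

/// Count `SubqueryAlias` nodes, starting from `count`, and stop descending
/// as soon as the count reaches `threshold`.
fn count_subquery_aliases(plan: &AliasPlan, count: usize, threshold: usize) -> usize {
    if count >= threshold {
        return count; // early exit: nothing more to learn
    }
    let mut count = match plan {
        AliasPlan::SubqueryAlias(_) => count + 1,
        _ => count,
    };
    for child in plan.inputs() {
        if count >= threshold {
            break; // skip the remaining siblings
        }
        count = count_subquery_aliases(child, count, threshold);
    }
    count
}

/// True when the plan holds at least two `SubqueryAlias` nodes.
fn has_multiple_subquery_aliases(plan: &AliasPlan) -> bool {
    count_subquery_aliases(plan, 0, 2) >= 2
}
```

Returning the count rather than mutating a `&mut usize` keeps each call's result explicit, and because of the early exit the returned value never exceeds the threshold.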
* docs(tests): add note referencing similar SQL in cte.slt for recursive CTE alias test
Add a comment in the recursive_cte_alias_instability test highlighting the similarity of the SQL query to one in datafusion/sqllogictest/test_files/cte.slt. This improves test clarity and cross-reference for maintainers.
* fix: clarify comment regarding alias ambiguity in recursive CTEs
feat: add initial example for explain_memory
* refactor: remove recursive_cte test module
* fix: add comment for clarity on recursive query handling
This commit adds a comment to the `optimize_projections` function in the `mod.rs` file to clarify the handling of recursive queries. The comment references a discussion on GitHub related to the implementation.
* remove stray file
* Revert "refactor: remove recursive_cte test module"
This reverts commit d8f6dd1.
* remove RecursiveQuery bypass
* refactor: improve handling of non-CTE subqueries in recursive queries
* refactor: remove unused recursive_cte module from SQL tests
* refactor: enhance projection optimization by handling non-CTE subqueries in recursive queries
* refactor: simplify TableScan projections in recursive query tests
* refactor: reorder filter and projection in recursive query logical plan
* refactor: restrict projection pushdown in recursive queries to only allow CTE references
* refactor: optimize TableScan projection in recursive query logical plan
* fix: correct dynamic filter predicate order in hash join test
* Update datafusion/optimizer/src/optimize_projections/mod.rs
Co-authored-by: Jeffrey Vo <[email protected]>
* Update datafusion/optimizer/src/optimize_projections/mod.rs
Co-authored-by: Jeffrey Vo <[email protected]>
* Enhance test description for recursive CTE projection pushdown
* Rename plan_contains_non_cte_subquery to plan_contains_other_subqueries for clarity
* Add test for recursive CTE with nested subquery to validate projection pushdown behavior
* Remove recursive_query_column_pruning
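The later commits restrict projection pushdown in recursive queries to plans whose only subquery aliases refer to the CTE itself. A minimal sketch of such a guard is shown below, assuming a toy plan type: `CtePlan` and `may_push_projection` are hypothetical, and the body given for `plan_contains_other_subqueries` is a guess at the idea, not the function's actual implementation in `optimize_projections/mod.rs`.

```rust
/// Toy plan tree for illustration only; NOT DataFusion's `LogicalPlan`.
enum CtePlan {
    TableScan,
    /// An aliased subquery; a reference to the CTE is modeled here as an
    /// alias whose name equals the CTE name (an assumption of this sketch).
    SubqueryAlias { alias: String, input: Box<CtePlan> },
    Filter { input: Box<CtePlan> },
    Join { left: Box<CtePlan>, right: Box<CtePlan> },
}

/// Direct children of a node.
fn children(plan: &CtePlan) -> Vec<&CtePlan> {
    match plan {
        CtePlan::TableScan => vec![],
        CtePlan::SubqueryAlias { input, .. } | CtePlan::Filter { input } => {
            vec![input.as_ref()]
        }
        CtePlan::Join { left, right } => vec![left.as_ref(), right.as_ref()],
    }
}

/// True if the plan contains a subquery alias that is not a reference
/// to the recursive CTE named `cte_name`.
fn plan_contains_other_subqueries(plan: &CtePlan, cte_name: &str) -> bool {
    if let CtePlan::SubqueryAlias { alias, .. } = plan {
        if alias.as_str() != cte_name {
            return true;
        }
    }
    children(plan)
        .into_iter()
        .any(|child| plan_contains_other_subqueries(child, cte_name))
}

/// Hypothetical guard: only push projections into a recursive term when
/// every subquery alias in it refers to the CTE.
fn may_push_projection(recursive_term: &CtePlan, cte_name: &str) -> bool {
    !plan_contains_other_subqueries(recursive_term, cte_name)
}
```

In this toy model, a recursive term that joins the CTE reference against a plain table scan passes the guard, while one that also wraps a differently named subquery does not.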
---------
Co-authored-by: Jeffrey Vo <[email protected]>