Pull request overview
This PR implements Phase 1 of sort pushdown optimization to improve TopK query performance. When a query requests data in reverse order of a Parquet file's natural ordering, the optimizer now enables reverse row group scanning, which allows early termination in TopK queries while keeping the Sort operator for correctness.
Key changes:
- Adds `enable_sort_pushdown` configuration option (default: true)
- Implements reverse row group scanning for Parquet files
- Returns inexact ordering to enable TopK early termination benefits
- Adds comprehensive test coverage across multiple file formats
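
To make the early-termination idea concrete, here is a minimal, self-contained sketch. It is not DataFusion code: `RowGroup` and `top_k_desc_reverse_scan` are hypothetical names, and the sketch assumes the file is globally sorted ascending so that every earlier row group holds values no larger than the ones already read. It scans row groups from last to first, keeps the `k` largest values in a min-heap, and stops once the next row group's maximum cannot beat the current threshold. The final sort mirrors why the Sort operator is kept: rows inside each row group still arrive in ascending order, so the scan order is only inexact.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a Parquet row group whose values are stored ascending.
struct RowGroup {
    values: Vec<i64>,
}

impl RowGroup {
    /// Maximum value of the group (the last one, since values are ascending).
    fn max(&self) -> Option<i64> {
        self.values.last().copied()
    }
}

/// Answers `ORDER BY v DESC LIMIT k` by scanning row groups in reverse order.
/// Returns the top-k values (descending) and how many row groups were read.
fn top_k_desc_reverse_scan(row_groups: &[RowGroup], k: usize) -> (Vec<i64>, usize) {
    // Min-heap holding the k largest values seen so far.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k);
    let mut groups_read = 0;
    if k == 0 {
        return (Vec::new(), groups_read);
    }
    for rg in row_groups.iter().rev() {
        // Early termination: the heap is full and this group cannot improve it.
        // Assumes a globally sorted file, so earlier groups cannot either.
        let threshold = heap.peek().map(|r| r.0);
        if heap.len() == k {
            if let (Some(t), Some(max)) = (threshold, rg.max()) {
                if max <= t {
                    break;
                }
            }
        }
        groups_read += 1;
        for &v in &rg.values {
            if heap.len() < k {
                heap.push(Reverse(v));
            } else if heap.peek().map_or(false, |r| v > r.0) {
                heap.pop();
                heap.push(Reverse(v));
            }
        }
    }
    // Rows within a group are still ascending, so a final sort is required.
    let mut top: Vec<i64> = heap.into_iter().map(|r| r.0).collect();
    top.sort_unstable_by(|a, b| b.cmp(a));
    (top, groups_read)
}
```

With three row groups `[1,2,3]`, `[4,5,6]`, `[7,8,9]` and `k = 2`, only the last row group needs to be read under these assumptions.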
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| docs/source/user-guide/configs.md | Documents new configuration options including enable_sort_pushdown, force_filter_selections, enable_ansi_mode, and hash join InList pushdown settings |
| datafusion/common/src/config.rs | Adds enable_sort_pushdown configuration option with detailed documentation |
| datafusion/physical-optimizer/src/pushdown_sort.rs | Implements the PushdownSort optimizer rule that detects SortExec nodes and attempts to push sort requirements down to data sources |
| datafusion/physical-plan/src/sort_pushdown.rs | Defines SortOrderPushdownResult enum for communicating sort pushdown results (Exact, Inexact, Unsupported) |
| datafusion/physical-plan/src/execution_plan.rs | Adds try_pushdown_sort trait method to ExecutionPlan for sort optimization |
| datafusion/datasource-parquet/src/source.rs | Implements reverse row group scanning logic in ParquetSource with reverse_row_groups field |
| datafusion/datasource-parquet/src/sort.rs | Implements reverse_row_selection function to adjust row selections for reversed row group order |
| datafusion/datasource-parquet/src/opener.rs | Integrates reverse scanning into ParquetOpener using PreparedAccessPlan |
| datafusion/physical-expr-common/src/sort_expr.rs | Adds is_reverse and is_reversed_sort_options helpers for detecting reversed orderings |
| datafusion/sqllogictest/test_files/*.slt | Comprehensive SQL logic tests validating reverse scan behavior with various scenarios |
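
The interplay between the three pushdown outcomes and the reversed-ordering check can be sketched as follows. This is a simplified illustration, not the actual DataFusion definitions: the `SortOptions` struct, the tuple-style enum variants, and the free functions `try_pushdown_sort` and `keep_sort` are assumptions made for the example, and the real trait method and helper signatures may differ. The sketch assumes that an ordering counts as reversed when both the direction and the null placement flip, and that only an exact result would allow the optimizer to drop the sort.

```rust
/// Simplified stand-in for Arrow-style sort options (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SortOptions {
    descending: bool,
    nulls_first: bool,
}

/// True when `requested` is the full reverse of `natural`:
/// both the direction and the null placement are flipped (assumed semantics).
fn is_reversed_sort_options(natural: SortOptions, requested: SortOptions) -> bool {
    natural.descending != requested.descending
        && natural.nulls_first != requested.nulls_first
}

/// Simplified sketch of the three pushdown outcomes named in the table above.
#[derive(Debug, PartialEq, Eq)]
enum SortOrderPushdownResult<T> {
    /// The source guarantees the requested ordering.
    Exact(T),
    /// The source only approximates the ordering (e.g. reversed row groups).
    Inexact(T),
    /// The source cannot help with this ordering.
    Unsupported,
}

/// Hypothetical source-side decision; the strings stand in for rewritten plans.
fn try_pushdown_sort(
    natural: SortOptions,
    requested: SortOptions,
) -> SortOrderPushdownResult<&'static str> {
    if natural == requested {
        SortOrderPushdownResult::Exact("forward scan")
    } else if is_reversed_sort_options(natural, requested) {
        SortOrderPushdownResult::Inexact("reverse row group scan")
    } else {
        SortOrderPushdownResult::Unsupported
    }
}

/// The sort stays in the plan unless the ordering is exact.
fn keep_sort<T>(result: &SortOrderPushdownResult<T>) -> bool {
    !matches!(result, SortOrderPushdownResult::Exact(_))
}
```

In this model, a file ordered `ASC NULLS LAST` queried with `DESC NULLS FIRST` yields `Inexact`, so the sort is kept while the scan order still helps TopK finish early.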
…#19557)

- Closes [apache#19535](apache#19535)

Reverse row selection should respect the row group index; this PR fixes the issue.

Are these changes tested? Yes

Are there any user-facing changes? No

(cherry picked from commit 27de50d)
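
As a rough illustration of what reversing a row selection involves, the sketch below splits a selection at row group boundaries and then emits the per-group pieces in reversed group order. It is not the PR's implementation: `Run`, the function signature, and the `row_group_sizes` parameter are hypothetical. It assumes `row_group_sizes` lists only the row groups that are actually scanned, in ascending row group index order, which is one reading of what respecting the row group index requires.

```rust
/// Hypothetical run of consecutive rows that are either skipped or selected.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

/// Reorders `runs` for a scan that visits row groups last to first.
/// Row order inside each row group is left unchanged.
fn reverse_row_selection(runs: &[Run], row_group_sizes: &[usize]) -> Vec<Run> {
    // Step 1: split the selection at row group boundaries.
    let mut per_group: Vec<Vec<Run>> = Vec::with_capacity(row_group_sizes.len());
    let mut iter = runs.iter().copied();
    let mut current: Option<Run> = iter.next();
    for &size in row_group_sizes {
        let mut remaining = size;
        let mut group = Vec::new();
        while remaining > 0 {
            let Some(run) = current else { break };
            let take = run.row_count.min(remaining);
            if take > 0 {
                group.push(Run { row_count: take, skip: run.skip });
            }
            remaining -= take;
            let left = run.row_count - take;
            current = if left > 0 {
                Some(Run { row_count: left, skip: run.skip })
            } else {
                iter.next()
            };
        }
        per_group.push(group);
    }
    // Step 2: emit groups in reverse order, merging adjacent runs of the same kind.
    let mut out: Vec<Run> = Vec::new();
    for run in per_group.into_iter().rev().flatten() {
        match out.last_mut() {
            Some(last) if last.skip == run.skip => last.row_count += run.row_count,
            _ => out.push(run),
        }
    }
    out
}
```

For two scanned row groups of 3 and 2 rows and a selection of "select 1, skip 4", the reversed selection becomes "skip 2, select 1, skip 2": the second group's rows come first, and the selected row keeps its position inside the first group.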
## Which issue does this PR close?

Add sorted data benchmark.

- Closes [apache#18976](apache#18976)

## Are these changes tested?

Yes. Test results for the reverse Parquet PR (apache#18817) show it is about 30x faster than the main branch on sorted data (9.85 ms vs. 294.13 ms average):

```text
Running `/Users/zhuqi/arrow-datafusion/target/release/dfbench clickbench --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet --queries-path /Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data --sorted-by EventTime --sort-order ASC -o /Users/zhuqi/arrow-datafusion/benchmarks/results/reverse_parquet/data_sorted_clickbench.json`
Running benchmarks with the following options: RunOpt { query: None, pushdown: false, common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false }, path: "/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet", queries_path: "/Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data", output_path: Some("/Users/zhuqi/arrow-datafusion/benchmarks/results/reverse_parquet/data_sorted_clickbench.json"), sorted_by: Some("EventTime"), sort_order: "ASC" }
⚠️ Forcing target_partitions=1 to preserve sort order
⚠️ (Because we want to get the pure performance benefit of sorted data to compare)
📊 Session config target_partitions: 1
Registering table with sort order: EventTime ASC
Executing: CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION '/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet' WITH ORDER ("EventTime" ASC)
Q0: -- Must set for ClickBench hits_partitioned dataset. See apache#16591
-- set datafusion.execution.parquet.binary_as_string = true
SELECT * FROM hits ORDER BY "EventTime" DESC limit 10;
Query 0 iteration 0 took 14.7 ms and returned 10 rows
Query 0 iteration 1 took 10.2 ms and returned 10 rows
Query 0 iteration 2 took 8.7 ms and returned 10 rows
Query 0 iteration 3 took 7.9 ms and returned 10 rows
Query 0 iteration 4 took 7.9 ms and returned 10 rows
Query 0 avg time: 9.85 ms
+ set +x
Done
```

And the main branch result:

```text
Running `/Users/zhuqi/arrow-datafusion/target/release/dfbench clickbench --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet --queries-path /Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data --sorted-by EventTime --sort-order ASC -o /Users/zhuqi/arrow-datafusion/benchmarks/results/issue_18976/data_sorted_clickbench.json`
Running benchmarks with the following options: RunOpt { query: None, pushdown: false, common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false }, path: "/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet", queries_path: "/Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data", output_path: Some("/Users/zhuqi/arrow-datafusion/benchmarks/results/issue_18976/data_sorted_clickbench.json"), sorted_by: Some("EventTime"), sort_order: "ASC" }
⚠️ Forcing target_partitions=1 to preserve sort order
⚠️ (Because we want to get the pure performance benefit of sorted data to compare)
📊 Session config target_partitions: 1
Registering table with sort order: EventTime ASC
Executing: CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION '/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet' WITH ORDER ("EventTime" ASC)
Q0: -- Must set for ClickBench hits_partitioned dataset. See apache#16591
-- set datafusion.execution.parquet.binary_as_string = true
SELECT * FROM hits ORDER BY "EventTime" DESC limit 10;
Query 0 iteration 0 took 331.1 ms and returned 10 rows
Query 0 iteration 1 took 286.0 ms and returned 10 rows
Query 0 iteration 2 took 283.3 ms and returned 10 rows
Query 0 iteration 3 took 283.8 ms and returned 10 rows
Query 0 iteration 4 took 286.5 ms and returned 10 rows
Query 0 avg time: 294.13 ms
+ set +x
Done
```

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Co-authored-by: Yongting You <2010youy01@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

(cherry picked from commit cde6dfa)