Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jan 15, 2026

I am testing the PR from @askalt

In order to run the sql_planning benchmark, I also need the change from

I pull them all together into a PR so I can test it

I don't plan t merge this PR

askalt and others added 2 commits January 15, 2026 21:20
This patch aims to implement a fast-path for the ExecutionPlan::with_new_children function
for some plans, moving closer to a physical plan re-use implementation and improving planning
performance. If the passed children properties are the same as in self, we do not actually
recompute self's properties  (which could be costly if projection mapping is required).
Instead, we just replace the children and re-use self's properties as-is.

To be able to compare two different properties -- ExecutionPlan::properties(...) signature
is modified and now returns `&Arc<PlanProperties>`. If `children` properties are the same
in `with_new_children` -- we clone our properties arc and then a parent plan will consider
our properties as unchanged, doing the same.

Also, there are other improvements. The patch includes the following changes:

- Return `&Arc<PlanProperties>` from `ExecutionPlan::properties(...)` instead of a reference.
- Implement `with_new_children` fast-path if there is no children properties changes for all
  major plans.
- Store `Arc<[usize]>` instead of vector within `FilterExec`.
- Store `Arc<[usize]>` instead of vector within projection of `HashJoinExec`.
- Store `Arc<[Arc<AggregateFunctionExpr>]>` instead of vec for aggr expr and filters.
- Store `Arc<[ProjectionExpr]> instead of vec in `ProjectionExprs` struct.
- Get `Option<&[usize]>` instead of option vec ref in `project_schema` -- it makes API
  more flexible.

Note: currently, `reset_plan_states` does not allow to re-use plan in general: it is not
supported for dynamic filters and recursive queries features, as in this case state reset
should update pointers in the children plans.

Closes apache#19796
@alamb alamb changed the title TESTING PT TESTING PLANNING Jan 15, 2026
@alamb
Copy link
Contributor Author

alamb commented Jan 15, 2026

run benchmark sql_planner

@github-actions github-actions bot added documentation Improvements or additions to documentation physical-expr Changes to the physical-expr crates optimizer Optimizer rules core Core DataFusion crate catalog Related to the catalog crate common Related to common crate proto Related to proto crate datasource Changes to the datasource crate ffi Changes to the ffi crate physical-plan Changes to the physical-plan crate labels Jan 15, 2026
@alamb-ghbot
Copy link

🤖 ./gh_compare_branch_bench.sh compare_branch_bench.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_fast_and_fix (7323319) to 094e7ee diff
BENCH_NAME=sql_planner
BENCH_COMMAND=cargo bench --features=parquet --bench sql_planner
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_test_fast_and_fix
Results will be posted here when complete

@alamb-ghbot
Copy link

Benchmark script failed with exit code 101.

Last 10 lines of output:

Click to expand
  3 (3.00%) high severe

Benchmarking physical_plan_clickbench_q50
Benchmarking physical_plan_clickbench_q50: Warming up for 3.0000 s

thread 'main' (349589) panicked at datafusion/core/benches/sql_planner.rs:62:14:
called `Result::unwrap()` on an `Err` value: Context("type_coercion", Internal("Expect TypeSignatureClass::Native(LogicalType(Native(String), String)) but received NativeType::Binary, DataType: BinaryView"))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

error: bench failed, to rerun pass `-p datafusion --bench sql_planner`

@alamb
Copy link
Contributor Author

alamb commented Jan 15, 2026

(Gah, now the bench works on this branch, but not on main so the benchmark runner can't compare the results)

@alamb alamb closed this Jan 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

catalog Related to the catalog crate common Related to common crate core Core DataFusion crate datasource Changes to the datasource crate documentation Improvements or additions to documentation ffi Changes to the ffi crate optimizer Optimizer rules physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants