Commit 8b964b0
feat: support stateless physical plans
This patch introduces stateless physical plans. So far, only the physical-plan crate
is fully supported. A stateless plan can be reused across executions and executed
concurrently.
The functionality is gated behind a separate Cargo feature named "stateless_plan".
The implementation consists of several parts:
* State tree.
With the "stateless_plan" feature enabled, the plans themselves do not store state. The state
is stored in a separate tree composed of PlanStateNodes, which is built lazily during plan execution.
Each node of the tree stores not only the shared state of the plan but also its metrics. The shape
of the state tree matches the shape of the execution plan tree.
* Metrics
Metrics are stored in the nodes of the state tree and can be read from there after plan execution.
EXPLAIN can also be performed using the state.
* Dynamic Filters
In the case of stateless plans, dynamic filters cannot simply be stored inside the plans, as the same
plan can be executed concurrently. To overcome this, a dynamic filter is split into two parts: a planning-time
version and an execution-time version. The plans contain the planning-time version, which is transformed into
the execution version during the execution phase and then passed from parent nodes to child nodes using the
state tree.
* WorkTable
Instead of having the WorkTable explicitly injected into nodes, RecursiveExec exposes it in the state stored
within the state tree. A node that needs the current WorkTable traverses up the state tree until it
finds it.
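The state tree and the WorkTable lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: `Plan`, `StateNode`, `find_work_table`, the `i64` row type and the name-based check for the recursive node are all hypothetical stand-ins, and the real `PlanStateNode` API may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock, Weak};

/// Hypothetical work table: a shared, growable batch of rows.
type WorkTable = Arc<Mutex<Vec<i64>>>;

/// Immutable plan node. It holds no execution state, so one instance can be
/// executed many times, even concurrently.
struct Plan {
    name: &'static str,
    children: Vec<Arc<Plan>>,
}

/// Hypothetical per-execution state for one plan node.
struct StateNode {
    parent: Weak<StateNode>,
    /// One lazily initialized slot per child of the matching plan node,
    /// so the state tree has the same shape as the plan tree.
    children: Vec<OnceLock<Arc<StateNode>>>,
    /// A single counter standing in for the node's metrics.
    output_rows: AtomicUsize,
    /// Present only on nodes that expose a work table to their descendants.
    work_table: Option<WorkTable>,
}

impl StateNode {
    fn new(plan: &Plan, parent: Weak<StateNode>) -> Arc<StateNode> {
        // Assumption for this sketch: the recursive node is identified by name.
        let exposes_table = plan.name == "recursive";
        Arc::new(StateNode {
            parent,
            children: plan.children.iter().map(|_| OnceLock::new()).collect(),
            output_rows: AtomicUsize::new(0),
            work_table: exposes_table.then(|| Arc::new(Mutex::new(Vec::new()))),
        })
    }

    /// Returns the state of the i-th child, creating it on first access.
    fn child(self: &Arc<Self>, plan: &Plan, i: usize) -> Arc<StateNode> {
        self.children[i]
            .get_or_init(|| StateNode::new(&plan.children[i], Arc::downgrade(self)))
            .clone()
    }

    /// Walks up the state tree until a node exposing a work table is found.
    fn find_work_table(&self) -> Option<WorkTable> {
        match &self.work_table {
            Some(table) => Some(Arc::clone(table)),
            None => self.parent.upgrade()?.find_work_table(),
        }
    }
}

/// Builds the plan `recursive -> filter -> work_table_scan`.
fn sample_plan() -> Plan {
    let scan = Arc::new(Plan { name: "work_table_scan", children: vec![] });
    let filter = Arc::new(Plan { name: "filter", children: vec![scan] });
    Plan { name: "recursive", children: vec![filter] }
}

/// Runs one "execution": builds a fresh state tree, lets the scan node find
/// the work table through its ancestors, and returns
/// (rows recorded in the scan's metrics, rows in the root's work table).
fn execute(plan: &Plan, rows: &[i64]) -> (usize, usize) {
    let root = StateNode::new(plan, Weak::new());
    let filter = root.child(plan, 0);
    let scan = filter.child(&plan.children[0], 0);
    let table = scan.find_work_table().expect("an ancestor exposes a work table");
    table.lock().unwrap().extend_from_slice(rows);
    scan.output_rows.fetch_add(rows.len(), Ordering::Relaxed);
    let table_len = root.work_table.as_ref().unwrap().lock().unwrap().len();
    (scan.output_rows.load(Ordering::Relaxed), table_len)
}

fn main() {
    let plan = sample_plan();
    println!("{:?}", execute(&plan, &[1, 2, 3]));
    println!("{:?}", execute(&plan, &[7]));
}
```

Each call to `execute` builds its own state tree, so the second run of the same `Plan` does not see rows or metrics from the first.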
Planned follow-up work:
- Support stateless plans in all other DataFusion crates.
- Enable running tests with this feature in CI.
- Deprecate stateful plans and eventually transition completely to the stateless version.
- Add `fmt_as_with_state` to allow plans to include state-specific details in the EXPLAIN output, such as dynamic filters.
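The split of a dynamic filter into a planning-time and an execution-time part, described under "Dynamic Filters", might look roughly like the sketch below. All names here (`PlannedDynamicFilter`, `ExecDynamicFilter`, `to_execution`, `tighten`, `accepts`, `describe`) are hypothetical and a single lower bound stands in for a real filter expression. The actual types in this patch may be shaped differently.

```rust
use std::sync::{Arc, RwLock};

/// Planning-time part: immutable, stored inside the plan, safe to share
/// between concurrent executions.
#[derive(Debug, Clone)]
struct PlannedDynamicFilter {
    column: &'static str,
}

/// Execution-time part: created once per execution and kept in that
/// execution's state, so concurrent runs never share a bound.
#[derive(Debug)]
struct ExecDynamicFilter {
    column: &'static str,
    /// Lower bound discovered at run time; `None` means "accept everything".
    min: RwLock<Option<i64>>,
}

impl PlannedDynamicFilter {
    /// Creates a fresh execution-time filter for one execution of the plan.
    fn to_execution(&self) -> Arc<ExecDynamicFilter> {
        Arc::new(ExecDynamicFilter {
            column: self.column,
            min: RwLock::new(None),
        })
    }
}

impl ExecDynamicFilter {
    /// Called by the producing node; the bound only ever gets stricter.
    fn tighten(&self, new_min: i64) {
        let mut min = self.min.write().unwrap();
        let current = *min;
        *min = Some(current.map_or(new_min, |m| m.max(new_min)));
    }

    /// Called by the consuming node to decide whether a value can be skipped.
    fn accepts(&self, value: i64) -> bool {
        let min = *self.min.read().unwrap();
        min.map_or(true, |m| value >= m)
    }

    /// Renders the current state, the kind of detail a state-aware EXPLAIN
    /// could show.
    fn describe(&self) -> String {
        match *self.min.read().unwrap() {
            Some(m) => format!("{} >= {}", self.column, m),
            None => format!("{}: no bound yet", self.column),
        }
    }
}

fn main() {
    let planned = PlannedDynamicFilter { column: "a" };
    // Two executions of the same plan get independent execution-time filters.
    let run1 = planned.to_execution();
    let run2 = planned.to_execution();
    run1.tighten(10);
    println!("run1: {}, accepts 7: {}", run1.describe(), run1.accepts(7));
    println!("run2: {}, accepts 7: {}", run2.describe(), run2.accepts(7));
}
```

In this sketch the parent node would hand the `Arc<ExecDynamicFilter>` to its children through the state tree, which is why the plan itself never needs to hold mutable filter state.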
Closes apache#193511
parent 1acaf7a commit 8b964b0
File tree
52 files changed, +3232 -910 lines changed
- datafusion
  - datasource-parquet/src
  - execution/src/metrics
  - physical-expr/src/expressions
  - physical-plan
    - src
      - aggregates
        - group_values
      - joins
        - hash_join
        - piecewise_merge_join
        - sort_merge_join
      - repartition
      - sorts
      - test
      - topk
      - windows
  - pruning/src