@@ -24,10 +24,20 @@ capabilities with minimal changes.
2424
2525## Architecture
2626
27+ Before diving into the architecture, it helps to define a few terms:
28+
29+ - `worker`: a physical machine listening for serialized plans over an Arrow Flight interface.
30+ - `network boundary`: a node in the plan that streams data from a network interface rather than directly from its
31+ children. Implemented as an `ArrowFlightReadExec` physical DataFusion node.
32+ - `stage`: a portion of the plan separated by a network boundary from other parts of the plan. Implemented as any
33+ other physical node in DataFusion.
34+ - `task`: a unit of work inside a stage that executes a subset of its partitions on a specific worker.
35+ - `subplan`: a slice of the overall plan, for example the part below a network boundary that is serialized and sent to a worker.
36+
2737A distributed DataFusion query is executed in much the same way as a normal DataFusion query, with one key
2838difference:
2939
30- The physical plan is divided into stages that can be executed on different machines and exchange data using Arrow
40+ The physical plan is divided into stages that can be executed on different workers and exchange data using Arrow
3141Flight. All of this is done at the physical plan level, and is implemented as a ` PhysicalOptimizerRule ` that:
3242
33431 . Inspects the non-distributed plan, placing network boundaries (` ArrowFlightReadExec ` nodes) in the appropriate places
@@ -111,7 +121,7 @@ shows the partitioning scheme in each node):
111121 └──────[0][1][2]───────┘
112122```
113123
114- Based on that boundary, the plan is divided into stages and tasks are assigned to each stage. Each task will be
124+ Based on that boundary, the plan is divided into stages, and tasks are assigned to each stage. Each task will be
115125responsible for the different partitions in the original plan.
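
As a rough illustration of how partitions could be spread across the tasks of a stage, here is a minimal std-only sketch. The `assign_partitions` helper is hypothetical and not part of the library's API, and the contiguous-chunk strategy is an assumption made for illustration; the actual assignment logic may differ.

```rust
/// Hypothetical helper (not the library's API): splits the partition indices
/// `0..partitions` into at most `tasks` contiguous groups, one group per task.
fn assign_partitions(partitions: usize, tasks: usize) -> Vec<Vec<usize>> {
    if partitions == 0 || tasks == 0 {
        return Vec::new();
    }
    // Ceiling division, so every partition lands in some task.
    let per_task = (partitions + tasks - 1) / tasks;
    (0..partitions)
        .collect::<Vec<usize>>()
        .chunks(per_task)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With three partitions and three tasks, `assign_partitions(3, 3)` gives each task exactly one partition, which matches the layout of one partition per task in the diagram.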
116126
117127```
@@ -131,11 +141,11 @@ responsible for the different partitions in the original plan.
131141 │ │ │
132142 ┌─── task 0 (stage 1) ───┐ ┌── task 1 (stage 1) ────┐ ┌── task 2 (stage 1) ────┐
133143 │ │ │ │ │ │ │ │ │
134- │┌─────────[0]──────────┐│ │┌─────────[0 ]──────────┐│ │┌──────────[0 ]─────────┐│
144+ │┌─────────[0]──────────┐│ │┌─────────[1 ]──────────┐│ │┌──────────[2 ]─────────┐│
135145 ││ AggregateExec ││ ││ AggregateExec ││ ││ AggregateExec ││
136146 ││ (partial) ││ ││ (partial) ││ ││ (partial) ││
137147 │└──────────┬───────────┘│ │└──────────┬───────────┘│ │└───────────┬──────────┘│
138- │┌─────────[0]──────────┐│ │┌─────────[0 ]──────────┐│ │┌──────────[0 ]─────────┐│
148+ │┌─────────[0]──────────┐│ │┌─────────[1 ]──────────┐│ │┌──────────[2 ]─────────┐│
139149 ││ DataSourceExec ││ ││ DataSourceExec ││ ││ DataSourceExec ││
140150 │└──────────────────────┘│ │└──────────────────────┘│ │└──────────────────────┘│
141151 └────────────────────────┘ └────────────────────────┘ └────────────────────────┘
@@ -149,7 +159,7 @@ This means that:
149159
1501601 . The head stage is executed normally as if the query was not distributed.
1511612 . Upon calling ` .execute() ` on the ` ArrowFlightReadExec ` , instead of recursively calling ` .execute() ` on its children,
152- they will be serialized and sent over the wire to another node.
162+ the child subplan will be serialized and sent over the wire to another node.
1531633 . The next node, which is hosting an Arrow Flight Endpoint listening for gRPC requests over an HTTP server, will pick
154164 up the request containing the serialized chunk of the overall plan and execute it.
1551654 . This is repeated for each stage, and data will start flowing from bottom to top until it reaches the head stage.
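
The steps above can be sketched with a toy model that uses only the Rust standard library. Everything here is an assumption made for illustration: `Plan`, `encode`, `decode`, `execute`, and `worker_handle` are hypothetical names, the text encoding stands in for the real plan serialization, and a plain function call stands in for the Arrow Flight request. It is not how the library is implemented, only a picture of the control flow.

```rust
/// Toy plan node; a stand-in for a DataFusion physical plan, not the real trait.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf that produces rows.
    Source(Vec<i64>),
    /// Collapses all rows of its child into a single summed row.
    Sum(Box<Plan>),
    /// Network boundary: its child subplan runs on a (simulated) worker.
    Boundary(Box<Plan>),
}

/// Serializes a plan into a simple text form (assumed format, illustration only).
fn encode(plan: &Plan) -> String {
    match plan {
        Plan::Source(rows) => {
            let rows: Vec<String> = rows.iter().map(|r| r.to_string()).collect();
            format!("source({})", rows.join(","))
        }
        Plan::Sum(child) => format!("sum({})", encode(child)),
        Plan::Boundary(child) => format!("boundary({})", encode(child)),
    }
}

/// Returns the text between `prefix` and the final `)`, if `text` has that shape.
fn unwrap_call<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    text.strip_prefix(prefix)?.strip_suffix(')')
}

/// Parses the text form produced by `encode` back into a plan.
fn decode(text: &str) -> Option<Plan> {
    if let Some(body) = unwrap_call(text, "source(") {
        let rows = body
            .split(',')
            .filter(|s| !s.is_empty())
            .map(|s| s.parse::<i64>().ok())
            .collect::<Option<Vec<i64>>>()?;
        Some(Plan::Source(rows))
    } else if let Some(body) = unwrap_call(text, "sum(") {
        Some(Plan::Sum(Box::new(decode(body)?)))
    } else if let Some(body) = unwrap_call(text, "boundary(") {
        Some(Plan::Boundary(Box::new(decode(body)?)))
    } else {
        None
    }
}

/// Simulated worker endpoint: decodes the subplan it receives and executes it.
fn worker_handle(request: &str) -> Vec<i64> {
    let subplan = decode(request).expect("worker received a malformed subplan");
    execute(&subplan)
}

/// Executes a plan. Regular nodes recurse into their children; a boundary
/// serializes its child subplan and hands it to the worker instead.
fn execute(plan: &Plan) -> Vec<i64> {
    match plan {
        Plan::Source(rows) => rows.clone(),
        Plan::Sum(child) => {
            let total: i64 = execute(child).iter().sum();
            vec![total]
        }
        Plan::Boundary(child) => worker_handle(&encode(child)),
    }
}
```

Calling `execute` on a head stage such as `Sum(Boundary(Sum(Source(..))))` runs the outer `Sum` locally, while the inner `Sum` and `Source` are only reached after a serialize and deserialize round trip, mirroring how data flows from the bottom stage up to the head stage.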
@@ -167,7 +177,7 @@ There are some runnable examples showcasing how to provide a localhost implement
167177The integration tests also provide an idea about how to use the library and what can be achieved with it:
168178
169179- [ tpch_validation_test.rs] ( tests/tpch_validation_test.rs ) : executes all TPCH queries and performs assertions over the
170- distributed plans.
180+ distributed plans, and checks the results against running the queries in single-node mode with a small scale factor.
171181- [ custom_config_extension.rs] ( tests/custom_config_extension.rs ) : showcases how to propagate custom DataFusion config
172182 extensions.
173183- [ custom_extension_codec.rs] ( tests/custom_extension_codec.rs ) : showcases how to propagate custom physical extension