Commit 6d62650
committed
feat: add LazyPartitioned mode for hash join to reduce RepartitionExec overhead
This commit adds a new PartitionMode::LazyPartitioned that avoids the
full build-side RepartitionExec when executing partitioned hash joins.
Instead of pre-repartitioning all columns of the build table, rows are
filtered lazily during hash table construction using hash(join_keys) %
partition_count.
Key changes:
- Add LazyPartitioned variant to PartitionMode enum
- Build side requests UnspecifiedDistribution (merged, no repartition)
- Probe side still requests HashPartitioned distribution
- Add filter_batch_by_partition() to filter build rows per partition
- Update collect_left_input to accept optional partition filter
- Add protobuf serialization support for new mode
- Update optimizer to handle LazyPartitioned in key reordering
This optimization is beneficial for wide build tables where copying
all columns in RepartitionExec is expensive.
Closes #197891 parent 472a729 commit 6d62650
File tree
10 files changed
+299
-45
lines changed- datafusion
- physical-optimizer/src
- physical-plan/src/joins
- hash_join
- proto
- proto
- src
- generated
- physical_plan
10 files changed
+299
-45
lines changedLines changed: 37 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
327 | 327 | | |
328 | 328 | | |
329 | 329 | | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
330 | 358 | | |
331 | 359 | | |
332 | 360 | | |
| |||
624 | 652 | | |
625 | 653 | | |
626 | 654 | | |
627 | | - | |
| 655 | + | |
| 656 | + | |
| 657 | + | |
| 658 | + | |
628 | 659 | | |
629 | 660 | | |
630 | 661 | | |
| |||
645 | 676 | | |
646 | 677 | | |
647 | 678 | | |
648 | | - | |
| 679 | + | |
649 | 680 | | |
650 | 681 | | |
651 | 682 | | |
| |||
1257 | 1288 | | |
1258 | 1289 | | |
1259 | 1290 | | |
| 1291 | + | |
| 1292 | + | |
| 1293 | + | |
| 1294 | + | |
1260 | 1295 | | |
1261 | 1296 | | |
1262 | 1297 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
293 | 293 | | |
294 | 294 | | |
295 | 295 | | |
296 | | - | |
| 296 | + | |
297 | 297 | | |
298 | 298 | | |
299 | 299 | | |
| |||
302 | 302 | | |
303 | 303 | | |
304 | 304 | | |
305 | | - | |
| 305 | + | |
306 | 306 | | |
307 | 307 | | |
308 | 308 | | |
| |||
540 | 540 | | |
541 | 541 | | |
542 | 542 | | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
543 | 546 | | |
544 | 547 | | |
545 | 548 | | |
| |||
0 commit comments