Reduce overhead of `Partitioned` hash join

# Is your feature request related to a problem or challenge?

Currently, a large part of the cost of the `Partitioned` join is repartitioning the entire build and probe sides of the join, which copies all of the columns twice (!), once during `take` and once during `coalesce`. This makes the hash join slower when the repartitioned side is large, wide, or both. We could avoid one copy in `RepartitionExec` (using the arrow-rs coalesce API once it is fully implemented), but not both.
 
`CollectLeft` avoids this cost on the right side, but its build phase is single-threaded, which greatly limits parallelism in the query: the query/partitions need to wait until the entire build side is finished, or some buffering/eager evaluation as implemented here https://github.com/apache/datafusion/pull/19761 might yield some

# Describe the solution you'd like

We should be able to do the `hash % partition` of the probe side **during** the join, avoiding the need for a `RepartitionExec`. We would pass only the indices of the rows that fall into each partition to the corresponding partition on the left (build) side, which should greatly reduce the overhead of the repartitioning.
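To make the idea concrete, here is a minimal sketch of computing per-partition row indices for one probe batch instead of materializing repartitioned copies of its columns. It is not DataFusion code: `hash_key` and `partition_indices` are hypothetical helpers, the join key is assumed to be a single `i64` column represented as a slice, and the standard library hasher stands in for whatever hash function the join actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hash a single join key.
/// (Assumption: the real join would reuse its own row-hash function.)
fn hash_key(key: i64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// For one probe batch, group row indices by `hash % num_partitions`.
/// Only `u32` indices are produced; no column data is copied.
fn partition_indices(keys: &[i64], num_partitions: usize) -> Vec<Vec<u32>> {
    assert!(num_partitions > 0, "need at least one partition");
    let mut indices = vec![Vec::new(); num_partitions];
    for (row, &key) in keys.iter().enumerate() {
        let partition = (hash_key(key) % num_partitions as u64) as usize;
        indices[partition].push(row as u32);
    }
    indices
}
```

Calling `partition_indices(&keys, n)` returns one index list per build partition. Each list could then be probed against the hash table of that partition, and the probe columns would only need to be gathered once, when the join output is produced.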

When this is implemented, we might want to revisit `hash_join_single_partition_threshold` and `hash_join_single_partition_threshold_rows`, which could be reduced to make most joins run fully in parallel.

_Originally posted by @Dandandan in https://github.com/apache/datafusion/issues/19761#issuecomment-3743995482_
            
