Description
In Velox today, HashJoin is used only when the join condition contains an equality predicate. When there is no equality condition, NestedLoopJoin is used.
In Presto, there is also a LookupJoin operator, which performs the probe-side lookup, and a HashBuilder operator on the build side. HashBuilder has fields for sortChannel and searchFunctionFactories that can be used to search for specific values and then follow sortLinks to find the range of values matching a condition. This allows optimizations like prestodb/presto#8614 for non-equality joins. The relevant code in LocalExecutionPlanner: https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/sql/planner/LocalExecutionPlanner.java#L2374
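To make the idea concrete, here is a minimal C++ sketch of a sorted build side probed with binary search for a condition like `probe.x < build.y`. It is not Presto's or Velox's actual implementation: `SortedBuildSide` and `matchGreaterThan` are hypothetical names, the build side is a plain vector of `int64_t` keys, and a sorted array stands in for Presto's sortLinks.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical build-side structure: build rows ordered by one "sort channel".
struct SortedBuildSide {
  std::vector<int64_t> sortedKeys; // sort-channel values in ascending order
  std::vector<int32_t> rowIds;     // original build row index for each sorted key

  explicit SortedBuildSide(const std::vector<int64_t>& keys)
      : rowIds(keys.size()) {
    // Sort row indices by key; stable_sort keeps ties in build order.
    std::iota(rowIds.begin(), rowIds.end(), 0);
    std::stable_sort(rowIds.begin(), rowIds.end(),
                     [&](int32_t a, int32_t b) { return keys[a] < keys[b]; });
    sortedKeys.reserve(keys.size());
    for (int32_t id : rowIds) {
      sortedKeys.push_back(keys[id]);
    }
  }

  // Returns the build rows matching "probeValue < build.key".
  // One binary search finds the first match; every later row also matches.
  std::vector<int32_t> matchGreaterThan(int64_t probeValue) const {
    auto first =
        std::upper_bound(sortedKeys.begin(), sortedKeys.end(), probeValue);
    std::size_t offset =
        static_cast<std::size_t>(first - sortedKeys.begin());
    return std::vector<int32_t>(rowIds.begin() + offset, rowIds.end());
  }
};
```

Each probe row costs O(log n) for the search plus the size of its match range, instead of scanning all n build rows as a nested loop would. With an additional equality key, the same search could be applied within each hash bucket.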
In my mind, another similar option is to implement a B-tree in Velox.
It would be great if such functionality were available in Velox. It could be used in other operators and connectors as well.
For non-equality joins it should perform much better than NestedLoopJoin, have reasonable spilling behavior, and restrict NestedLoopJoin to cross joins only.