Closed
Description
As mentioned in #163 (comment), this project leaves the distributed DataFusion plan in a state where it does not play well with other optimization rules in the DataFusion ecosystem.
The truth is that we are not playing by the DataFusion rules: after distributed planning, the execution plan is pretty much useless for anything other than execution and display:
- The plan becomes just a tree of stages, making it impossible to traverse into the inside of a stage using only DataFusion tools.
- Our execution plans do not support being rebuilt with arbitrary new children. In our nodes, the .with_new_children() call is either unsupported or rejects plans it does not expect.
- Our nodes need to be prepared to take any arbitrary node as a child. Some crates wrap nodes in passthrough wrapper nodes, so downcasting children to specific types will fail.
This prevents the project from working well with other crates like https://github.com/datafusion-contrib/datafusion-tracing.
Ideally, the produced distributed plan should allow traversals and operations just like any other non-distributed plan.
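To make the expected behavior concrete, here is a minimal sketch of what "playing by the rules" could look like. It does not use DataFusion itself: the `Plan` trait is a hypothetical, heavily simplified stand-in for DataFusion's `ExecutionPlan` (only `name`, `children` and `with_new_children` are mirrored), and `Leaf`, `StageBoundary`, `Passthrough`, `wrap_all`, `count_nodes` and `outline` are illustrative names that do not exist in this project. The point is only that a stage boundary node exposes its input as a regular child and accepts any replacement child without downcasting it.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for DataFusion's `ExecutionPlan` trait.
trait Plan {
    fn name(&self) -> String;
    fn children(&self) -> Vec<&Arc<dyn Plan>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String>;
}

/// A leaf node, e.g. a table scan (illustrative).
struct Leaf {
    name: String,
}

impl Plan for Leaf {
    fn name(&self) -> String {
        self.name.clone()
    }
    fn children(&self) -> Vec<&Arc<dyn Plan>> {
        vec![]
    }
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if children.is_empty() {
            Ok(self)
        } else {
            Err("Leaf takes no children".to_string())
        }
    }
}

/// Hypothetical stage boundary that keeps its input as an ordinary child,
/// so generic traversals can see inside the stage.
struct StageBoundary {
    stage: usize,
    input: Arc<dyn Plan>,
}

impl Plan for StageBoundary {
    fn name(&self) -> String {
        format!("StageBoundary({})", self.stage)
    }
    fn children(&self) -> Vec<&Arc<dyn Plan>> {
        vec![&self.input]
    }
    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if children.len() != 1 {
            return Err(format!("expected 1 child, got {}", children.len()));
        }
        // Accept the child as-is: no downcast to a concrete node type.
        let input = children.remove(0);
        Ok(Arc::new(StageBoundary { stage: self.stage, input }))
    }
}

/// Passthrough wrapper, similar in spirit to what an instrumentation crate adds.
struct Passthrough {
    inner: Arc<dyn Plan>,
}

impl Plan for Passthrough {
    fn name(&self) -> String {
        "Passthrough".to_string()
    }
    fn children(&self) -> Vec<&Arc<dyn Plan>> {
        vec![&self.inner]
    }
    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if children.len() != 1 {
            return Err(format!("expected 1 child, got {}", children.len()));
        }
        Ok(Arc::new(Passthrough { inner: children.remove(0) }))
    }
}

/// Rewrites the plan bottom-up, wrapping every node in a `Passthrough`.
fn wrap_all(plan: Arc<dyn Plan>) -> Result<Arc<dyn Plan>, String> {
    let new_children = plan
        .children()
        .into_iter()
        .map(|child| wrap_all(Arc::clone(child)))
        .collect::<Result<Vec<_>, _>>()?;
    let rebuilt = plan.with_new_children(new_children)?;
    Ok(Arc::new(Passthrough { inner: rebuilt }))
}

/// Counts nodes using nothing but `children()`.
fn count_nodes(plan: &Arc<dyn Plan>) -> usize {
    1 + plan.children().into_iter().map(count_nodes).sum::<usize>()
}

/// Node names in pre-order, using nothing but `children()`.
fn outline(plan: &Arc<dyn Plan>) -> Vec<String> {
    let mut names = vec![plan.name()];
    for child in plan.children() {
        names.extend(outline(child));
    }
    names
}

/// Two nested stages over a single scan (illustrative).
fn sample_plan() -> Arc<dyn Plan> {
    let scan: Arc<dyn Plan> = Arc::new(Leaf { name: "Scan".to_string() });
    let stage1: Arc<dyn Plan> = Arc::new(StageBoundary { stage: 1, input: scan });
    Arc::new(StageBoundary { stage: 2, input: stage1 })
}
```

With nodes shaped like this, `wrap_all(sample_plan())` succeeds and the result can still be walked with `outline` or `count_nodes`, because no node assumes a concrete type for its child. If `StageBoundary::with_new_children` instead tried to downcast its child to a specific node type, the same rewrite would fail as soon as a `Passthrough` was inserted.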