Closed
Description
EXPLAIN ANALYZE is incredibly useful for understanding query performance, but there’s currently no defined behavior for distributed queries in DataFusion. This gives us the opportunity to define how it should work from the ground up.
Proposed approach:
- In each DFRayProcessor, generate an instrumented plan as if it were running EXPLAIN ANALYZE locally.
- While streaming results back across the network, attach the instrumented plan as an opaque payload in the response so that the head node can collect it for final formatting.
- Investigate the use of opaque fields in the Arrow Flight protocol to carry this metadata.
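To make the proposal concrete, here is a minimal, stdlib-only Python sketch of the flow described above. It is not DFRay or Arrow Flight code: `Frame`, `worker_stream`, `collect`, and `format_report` are hypothetical names, the JSON encoding is just one possible payload format, and the plan strings are made-up placeholders. The `app_metadata` attribute on `Frame` is modeled loosely on the `app_metadata` bytes field of Arrow Flight's `FlightData` message, which may be a candidate for the opaque field mentioned in the last bullet. The sketch also assumes the instrumented plan is sent in a trailing frame, since metrics are only final once the worker has produced all of its batches.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for one message on a worker's result stream."""
    data: bytes                # record batch payload (opaque in this sketch)
    app_metadata: bytes = b""  # opaque side channel for the instrumented plan


def worker_stream(stage_id: int, worker_id: str,
                  batches: Iterable[bytes], annotated_plan: str) -> Iterator[Frame]:
    """Yield result frames, then one trailing frame carrying the plan."""
    for batch in batches:
        yield Frame(data=batch)
    # Metrics are only complete after the last batch, so the plan goes last.
    payload = json.dumps(
        {"stage": stage_id, "worker": worker_id, "plan": annotated_plan}
    ).encode("utf-8")
    yield Frame(data=b"", app_metadata=payload)


def collect(streams: Iterable[Iterator[Frame]]) -> Tuple[List[bytes], List[Dict]]:
    """Head-node side: separate result batches from instrumented plans."""
    batches: List[bytes] = []
    plans: List[Dict] = []
    for stream in streams:
        for frame in stream:
            if frame.data:
                batches.append(frame.data)
            if frame.app_metadata:
                plans.append(json.loads(frame.app_metadata.decode("utf-8")))
    return batches, plans


def format_report(plans: List[Dict]) -> str:
    """Render one section per (stage, worker), ordered for stable output."""
    lines: List[str] = []
    for p in sorted(plans, key=lambda p: (p["stage"], p["worker"])):
        lines.append(f"Stage {p['stage']} / worker {p['worker']}")
        lines.extend("  " + line for line in p["plan"].splitlines())
    return "\n".join(lines)


if __name__ == "__main__":
    streams = [
        worker_stream(0, "w1", [b"a", b"b"], "ExampleExec, metrics=[output_rows=2]"),
        worker_stream(0, "w0", [b"c"], "ExampleExec, metrics=[output_rows=1]"),
    ]
    _, plans = collect(streams)
    print(format_report(plans))
```

Keeping the payload opaque to the transport means the head node is the only component that needs to understand the plan encoding, so the format can change without touching the streaming layer.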
This will give developers deep insight into execution performance across stages and workers in a distributed setup.