This change adds support for displaying a distributed EXPLAIN ANALYZE output with metrics. It updates the TPCH
validation tests to assert the EXPLAIN ANALYZE output for each query. I collected the single node
results [here](https://github.com/datafusion-contrib/datafusion-distributed/blob/js/full-explain-analyze-rebased-comparison/tests/tpch_validation_test.rs) - using the `output_rows` metric values, we can cross check
that the distributed plan metrics are correct.
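As a rough illustration of the cross-check, the sketch below pulls every `output_rows` value out of a rendered plan string so that two renderings can be compared. The helper name `extract_output_rows` is hypothetical and not part of this change, and it assumes the values are rendered as plain integers in the `output_rows=<n>` form.

```rust
/// Hypothetical helper (not part of this PR): returns every `output_rows=<n>`
/// value found in a rendered plan, in order of appearance.
/// Assumes the metric values are printed as plain integers.
fn extract_output_rows(plan_text: &str) -> Vec<u64> {
    const KEY: &str = "output_rows=";
    let mut values = Vec::new();
    let mut rest = plan_text;
    // Repeatedly jump to the next occurrence of the key and read the digits after it.
    while let Some(pos) = rest.find(KEY) {
        rest = &rest[pos + KEY.len()..];
        let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
        if let Ok(n) = digits.parse::<u64>() {
            values.push(n);
        }
    }
    values
}
```

Comparing the vectors extracted from the single-node and distributed outputs only makes sense node by node; a distributed plan contains extra nodes, so the two lists are not expected to line up positionally.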
Implementation notes:
- Adds `src/explain.rs` to store the main entrypoint for rendering the output string
- We now re-write the whole plan before rendering using `rewrite_distributed_plan_with_metrics()` which
is a new public API that takes an executed `DistributedExec` and wraps each node in the appropriate metrics
- A user can call this method on an executed plan and traverse it to collect metrics from particular nodes
(This may be hard though because all nodes are wrapped in MetricsWrapperExec...)
- Adds an `Option<DisplayCtx>` field to `DistributedExec` to contain extra display settings
- We use this to smuggle the information into `display_plan_ascii` because it's only relevant in the
distributed case
- Significantly refactors TaskMetricsRewriter -> StageMetricsWriter. See comment.
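
The rewrite-then-traverse idea from the notes above can be sketched on a toy plan tree. Everything here is hypothetical: `Plan`, `wrap_with_metrics`, and `collect_output_rows` are stand-ins for illustration and are not the real `ExecutionPlan`, `MetricsWrapperExec`, or `rewrite_distributed_plan_with_metrics()` APIs. It also shows why collecting metrics afterwards means looking through the wrapper on every node.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a physical plan; not the real ExecutionPlan trait.
enum Plan {
    Node { name: String, children: Vec<Plan> },
    // Stand-in for a metrics wrapper placed around a single node.
    WithMetrics { output_rows: u64, inner: Box<Plan> },
}

/// Rewrites the tree so every plain node is wrapped with the metric recorded for it.
/// Nodes without a recorded metric get 0; already-wrapped nodes are left as-is.
fn wrap_with_metrics(plan: Plan, metrics: &HashMap<String, u64>) -> Plan {
    match plan {
        Plan::Node { name, children } => {
            let output_rows = metrics.get(&name).copied().unwrap_or(0);
            let children = children
                .into_iter()
                .map(|child| wrap_with_metrics(child, metrics))
                .collect();
            Plan::WithMetrics {
                output_rows,
                inner: Box::new(Plan::Node { name, children }),
            }
        }
        already_wrapped => already_wrapped,
    }
}

/// Walks the rewritten tree in pre-order, looking through each wrapper to
/// pair the wrapped node's name with its `output_rows` value.
fn collect_output_rows(plan: &Plan, out: &mut Vec<(String, u64)>) {
    match plan {
        Plan::WithMetrics { output_rows, inner } => match inner.as_ref() {
            Plan::Node { name, children } => {
                out.push((name.clone(), *output_rows));
                for child in children {
                    collect_output_rows(child, out);
                }
            }
            nested => collect_output_rows(nested, out),
        },
        Plan::Node { children, .. } => {
            for child in children {
                collect_output_rows(child, out);
            }
        }
    }
}
```

A caller would build a `Plan`, pass it through `wrap_with_metrics` with a name-to-rows map, and then call `collect_output_rows` on the result. Keying metrics by node name is a simplification for the sketch; it would not distinguish two nodes of the same type.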
Informs: #123
Other follow up work:
- #185
- #184
- #188
- #189
- #190