[SPARK-55052][SQL] Add AQEShuffleRead properties to Physical Plan Tree #53817
erenavsarogullari wants to merge 1 commit into apache:master
JIRA Issue Information: Task SPARK-55052 (this comment was automatically generated by GitHub Actions)
Pull request overview
This PR adds AQEShuffleRead properties (local, coalesced, skewed, coalesced and skewed) to the Physical Plan Tree output, making it easier to identify shuffle read characteristics in complex query plans without correlating details from separate sections.
Changes:
- Modified `AQEShuffleReadExec.simpleStringWithNodeId()` to append shuffle read properties to the node string
- Added a unit test to verify the new properties appear correctly in the formatted explain output
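The override described above can be illustrated with a minimal, Spark-free sketch. `ShuffleReadNode` is a hypothetical stand-in for `AQEShuffleReadExec`, and the separator and exact output format are assumptions rather than the PR's actual diff. It only shows the idea of deriving a property string from the read characteristics and appending it to the `name (id)` form of the node.

```scala
// Hypothetical, simplified stand-in for a physical plan node.
// This is NOT Spark's AQEShuffleReadExec; names and output format are assumptions.
final case class ShuffleReadNode(
    operatorId: Int,
    isLocalRead: Boolean,
    hasCoalescedPartition: Boolean,
    hasSkewedPartition: Boolean) {

  val nodeName: String = "AQEShuffleRead"

  // Describes the shuffle read as one of the four properties named in the PR,
  // or as an empty string when none of them applies.
  def stringArgs: Iterator[Any] = {
    val desc =
      if (isLocalRead) "local"
      else if (hasCoalescedPartition && hasSkewedPartition) "coalesced and skewed"
      else if (hasCoalescedPartition) "coalesced"
      else if (hasSkewedPartition) "skewed"
      else ""
    Iterator(desc)
  }

  // Appends the non-empty properties to the "name (id)" form of the node.
  def simpleStringWithNodeId(): String = {
    val base = s"$nodeName ($operatorId)"
    val props = stringArgs.map(_.toString).filter(_.nonEmpty).mkString(", ")
    if (props.isEmpty) base else s"$base, $props"
  }
}

object ShuffleReadNodeDemo {
  def main(args: Array[String]): Unit = {
    // A read that is both coalesced and skewed.
    println(ShuffleReadNode(5, isLocalRead = false,
      hasCoalescedPartition = true, hasSkewedPartition = true).simpleStringWithNodeId())
    // A read with no special property keeps the plain form.
    println(ShuffleReadNode(6, isLocalRead = false,
      hasCoalescedPartition = false, hasSkewedPartition = false).simpleStringWithNodeId())
  }
}
```

Filtering out the empty description keeps nodes without any of these properties unchanged in the tree, so only the reads worth noticing carry a suffix.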
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEShuffleReadExec.scala | Overrides simpleStringWithNodeId() to append shuffle read properties (from stringArgs) to the node description |
| sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala | Adds test case to verify that AQEShuffleRead properties are correctly displayed in the physical plan tree with coalesced and skewed partitions |
What changes were proposed in this pull request?
`AQEShuffleRead` can have `local`, `coalesced`, `skewed`, or `coalesced and skewed` properties when reading shuffle files. When the Physical Plan Tree is complex, it is hard to tell which `AQEShuffleRead` node performs a local read or reads skewed partitions, because that requires correlating each tree node with its entry in the details section. For example, in the following skewed SortMergeJoin case, the properties show which SMJ leg has an `AQEShuffleRead` with skew. This change addresses such use cases at the Physical Plan Tree level. The details section for each `AQEShuffleRead` node also shows these properties, but when the query plan tree is very complex (e.g., composed of 1000+ physical nodes), it is hard to correlate the tree with those details.
Current Physical Plan Tree:
New Physical Plan Tree:
Why are the changes needed?
When the physical plan tree is complex (e.g., composed of 1000+ physical nodes), it is hard to correlate the tree nodes with the `AQEShuffleRead` details.
Does this PR introduce any user-facing change?
Yes. When a user inspects the physical plan, the `AQEShuffleRead` properties now appear in the Physical Plan Tree.
How was this patch tested?
Added a new unit test.
Was this patch authored or co-authored using generative AI tooling?
No