-
Notifications
You must be signed in to change notification settings - Fork 268
feat: Add plan conversion statistics to extended explain info #2412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2412 +/- ##
============================================
+ Coverage 56.12% 58.54% +2.41%
- Complexity 976 1445 +469
============================================
Files 119 146 +27
Lines 11743 13534 +1791
Branches 2251 2356 +105
============================================
+ Hits 6591 7923 +1332
- Misses 4012 4379 +367
- Partials 1140 1232 +92 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Subset of output from benchmarks |
| --conf spark.comet.explain.enabled=true \ | ||
| --conf spark.memory.offHeap.enabled=true \ | ||
| --conf spark.memory.offHeap.size=16g | ||
| --conf spark.memory.offHeap.size=2g |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change, but we don't need 16g for this simple example
|
|
||
| withSQLConf(CometConf.COMET_EXPLAIN_VERBOSE_ENABLED.key -> "true") { | ||
| val extendedExplain = new ExtendedExplainInfo().generateExtendedInfo(cometPlan) | ||
| assert(extendedExplain.contains("Comet accelerated 33% of eligible operators")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be this number fluctuating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is currently stable across all Spark versions that we test with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wrong. There is a failure due to a different percentage. I will make the test less specific.
- DPP fallback *** FAILED *** (1 second, 553 milliseconds)
"BroadcastHashJoin
:- ColumnarToRow
: +- Scan parquet [COMET: Dynamic Partition Pruning is not supported]
: +- SubqueryBroadcast
: +- BroadcastExchange
: +- CometColumnarToRow
: +- CometFilter
: +- CometScan [native_iceberg_compat] parquet
+- BroadcastExchange
+- CometColumnarToRow
+- CometFilter
+- CometScan [native_iceberg_compat] parquet
Comet accelerated 4 out of 9 eligible operators (44%). Final plan contains 3 transitions between Spark and Comet." did not contain "Comet accelerated 33% of eligible operators" (CometExecSuite.scala:124)
spark/src/main/scala/org/apache/comet/ExtendedExplainInfo.scala
Outdated
Show resolved
Hide resolved
From the user perspective it might be not very clear what is accelerated? which is not the case? |
Thanks, I will add documentation as part of this PR |
| var transitions: Int = 0 | ||
|
|
||
| override def toString: String = { | ||
| s"sparkOperators=$sparkOperators, cometOperators=$cometOperators, " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could use a more verbose string here so that the meaning of these stats is a little more obvious
|
We could potentially add a method to get the plan stats independent of the explain string too. I can see this being useful in writing some tool to analyse queries/plans without executing them. |
|
Thanks for the reviews @comphead and @parthchandra. Could you take another look? |
spark/src/main/scala/org/apache/comet/ExtendedExplainInfo.scala
Outdated
Show resolved
Hide resolved
parthchandra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Minor improvement suggested. Also, is the comment re the unit test addressed?
Co-authored-by: Parth Chandra <[email protected]>
|
Thanks for the reviews @parthchandra @comphead. I have addressed feedback and plan on merging this later today, unless you have any other feedback |
Which issue does this PR close?
N/A
Rationale for this change
I would like to make it easier to see what % of operators are accelerated by Comet.
What changes are included in this PR?
Add some basic statistics to the "extended explain" output to show % of operators accelerated by Comet.
Example output:
How are these changes tested?
Manually