[SPARK-55353][SQL] Disable SQLAppStatusListener when UI is disabled#54133
[SPARK-55353][SQL] Disable SQLAppStatusListener when UI is disabled#54133ever4Kenny wants to merge 1 commit intoapache:masterfrom
Conversation
JIRA Issue Information=== Improvement SPARK-55353 === This comment was automatically generated by GitHub Actions |
c44cf81 to
7086200
Compare
How much memory decreased in your case?
Do you have a concrete list of the impact? |
|
| "many stages and tasks, but will result in limited SQL execution information in the UI.") | ||
| .version("4.0.0") | ||
| .booleanConf | ||
| .createWithDefault(true) |
There was a problem hiding this comment.
Spark currently under 4.2 development cycle so version should be "4.2.0".
I agree appStatusListener should keep enabled by default. But could you disable it and tigger a round CI, to help us evaluate the impact?
There was a problem hiding this comment.
Ok, I will try disable it and trigger a CI, and enable it afterwards.
There was a problem hiding this comment.
Few metrics related tests failed with the listener disabled, could you please take a look?
| } else { | ||
| // When disabled, create a minimal status store without listener | ||
| new SQLAppStatusStore(kvStore, None) | ||
| } |
There was a problem hiding this comment.
If this is specific to UI, control it using spark.ui.enabled and avoid introducing a config ? Similar (as a pattern) to streamingQueryStatusListener below.
There was a problem hiding this comment.
Good point, I will use "spark.ui.enabled" instead.
There was a problem hiding this comment.
I'm a little confused, I believe the author means: UI almost works even if appStatusListener is disabled.
If so, separate configs make sense.
There was a problem hiding this comment.
I'm a little confused, I believe the author means: UI almost works even if appStatusListener is disabled.
If so, separate configs make sense.
Yes that's true, the UI is still available without the listener.
There was a problem hiding this comment.
the impacts looks fine! (you may need to rebase your branch to the latest master to solve the Python failures)
So how about
- mark the new config
spark.sql.ui.appStatusListener.enabledasinternal, to make it for advanced users only, as it has side effects that might confuse normal users. - keep the default value fallback to
spark.ui.enabled, then disabling UI also reduces driver memory usage.
@mridulm @cloud-fan, WDYT? Does disabling appStatusListener have other potential impacts other than UI?
0ee4e60 to
f92f2b7
Compare
| @@ -236,6 +236,15 @@ object StaticSQLConf { | |||
| .booleanConf | |||
| .createWithDefault(false) | |||
|
|
|||
| val SQL_UI_APP_STATUS_LISTENER_ENABLED = | |||
| buildStaticConf("spark.sql.ui.appStatusListener.enabled") | |||
There was a problem hiding this comment.
appStatusListener is internal, we should not expose it to users. So this is to disable SQL UI?
There was a problem hiding this comment.
Ok, directly use spark.ui.enabled to toggle the lisener.
There was a problem hiding this comment.
I believe what @cloud-fan said "disable SQL UI" means "disable SQLTab", not the whole live UI.
You probably need a new config like spark.sql.ui.enabled
f92f2b7 to
d121348
Compare
ff5aeae to
9281179
Compare
When spark.ui.enabled is set to false, SQLAppStatusListener should not be created to avoid unnecessary memory overhead. This can significantly reduce driver memory usage for complex queries with many stages and tasks. The change checks spark.ui.enabled configuration value instead of creating a separate configuration, reusing the existing spark.ui.enabled setting to control SQLAppStatusListener lifecycle.
9281179 to
5d31b42
Compare
What changes were proposed in this pull request?
This patch modifies SharedState to skip SQLAppStatusListener creation when Spark UI is disabled (spark.ui.enabled=false).
Why are the changes needed?
Complex SQL queries with many stages and tasks cause driver OOM due to excessive AccumulatorMetadata objects held by SQLAppStatusListener.
Memory regression from Spark 3.2 to 3.5:
By reusing the existing spark.ui.enabled configuration, users can disable UI to reduce memory overhead without needing a separate configuration.
Does this PR introduce any user-facing change?
No new configuration is added. The behavior change is:
When spark.ui.enabled=false:
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.