[SPARK-54449][CORE] Storage Memory off heap size should be considered only when off heap is enabled #53383
VindhyaG wants to merge 20 commits into apache:master
Conversation
@VindhyaG:
@qiaojizhen I did not understand what you mean by "Driver won't apply off-heap in cluster mode". Do you mean it is not supported in cluster mode?
@qiaojizhen Sorry, I did not understand what the exact issue is here. The value shown for the total memory size should already be fixed by the changes made.
bmConf.set(TEST_MEMORY, maxMem)
bmConf.set(MEMORY_OFFHEAP_SIZE, maxMem)
if (maxMem > 0) {
  bmConf.set(MEMORY_OFFHEAP_ENABLED, true)
Why does MEMORY_OFFHEAP_ENABLED depend on maxMem?
This is for the UTs: in these tests, MEMORY_OFFHEAP_ENABLED is set to true only when maxMem is explicitly greater than 0; otherwise it stays at its default of false. Previously, MEMORY_OFFHEAP_ENABLED was always left at the default false and only the MEMORY_OFFHEAP_SIZE value was used to test all the combinations, because MEMORY_OFFHEAP_ENABLED=true required MEMORY_OFFHEAP_SIZE > 0 but not vice versa; the MEMORY_OFFHEAP_ENABLED value did not matter at all for the MEMORY_OFFHEAP_SIZE tests. We cannot do that with this change, because when MEMORY_OFFHEAP_ENABLED is false, maxMem is effectively zero as well.
Ngone51
left a comment
The fix looks good to me overall. Could you post the UI screenshots before and after the fix to show the straightforward difference from UI?
@VindhyaG Yes! Off-heap memory does not take effect for the Driver in any cluster mode. When I submit a job to YARN, the Driver container only gets 2G of memory (spark.driver.memory plus spark.driver.memoryOverhead), but the web UI still displays the off-heap memory. The source code in the Spark YARN Client shows that only spark.driver.memory and spark.driver.memoryOverhead are used when requesting the AM memory.
@Ngone51 Before the change
After
@qiaojizhen So basically, the Driver will not have off-heap memory regardless of the setting. Is that correct? Should we show something like NA/null ("-" in the UI), which I suppose would be clearer? Or should we just make it 0?
@VindhyaG Yes, in cluster mode the Driver's off-heap memory does not take effect regardless of whether the setting is enabled. I think "make it 0" is consistent with the existing logic.
@qiaojizhen I have made the changes for the Driver off-heap values. Can you please review the PR?
@dongjoon-hyun can you please help point to a relevant reviewer for this PR? |
holdenk
left a comment
Seems reasonable-ish, although should we log a warning when off-heap is configured but disabled?
if (event.blockManagerId.executorId == SparkContext.DRIVER_IDENTIFIER) {
  exec.totalOffHeap = 0
Is it true the driver would never use off heap storage?
@holdenk The reference I used, https://spark.apache.org/docs/latest/configuration.html, mentions only executors. @qiaojizhen can you please confirm again?
What changes were proposed in this pull request?
Currently, even if spark.memory.offHeap.enabled=false, spark.memory.offHeap.size is counted in the Spark UI and wherever the config is used. This PR changes the logic to use the value set in spark.memory.offHeap.size only when spark.memory.offHeap.enabled is true.
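The intended gating can be sketched as follows. This is a minimal illustration of the fix's logic, not Spark's actual internals: effectiveOffHeapSize is a hypothetical helper name introduced here for clarity.

```scala
// Hypothetical sketch of the fix: the configured off-heap size should count
// toward totals (e.g. in the UI) only when off-heap is actually enabled.
object OffHeapGating {
  // enabled mirrors spark.memory.offHeap.enabled,
  // configuredSize mirrors spark.memory.offHeap.size (bytes).
  def effectiveOffHeapSize(enabled: Boolean, configuredSize: Long): Long =
    if (enabled) configuredSize else 0L

  def main(args: Array[String]): Unit = {
    // With off-heap enabled, the configured size is reported as-is.
    println(effectiveOffHeapSize(enabled = true, configuredSize = 1024L))
    // With off-heap disabled, the size is treated as 0 regardless of config.
    println(effectiveOffHeapSize(enabled = false, configuredSize = 1024L))
  }
}
```

Under this sketch, total memory shown in the UI would be onHeap + effectiveOffHeapSize, so a disabled off-heap setting no longer inflates the total.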
Why are the changes needed?
The logic should read the value of spark.memory.offHeap.size only after checking that spark.memory.offHeap.enabled is true.
Does this PR introduce any user-facing change?
Yes. The Spark UI will show the total of on-heap and off-heap memory only if spark.memory.offHeap.enabled = true.
How was this patch tested?
Existing UTs
Was this patch authored or co-authored using generative AI tooling?
No