@@ -45,10 +45,19 @@ and requiring shuffle memory to be separately configured.
 The recommended way to allocate memory for Comet is to set `spark.memory.offHeap.enabled=true`. This allows
 Comet to share an off-heap memory pool with Spark, reducing the overall memory overhead. The size of the pool is
 specified by `spark.memory.offHeap.size`. For more details about Spark off-heap memory mode, please refer to
-Spark documentation: https://spark.apache.org/docs/latest/configuration.html.
+[Spark documentation]. For full details on configuring Comet memory in off-heap mode, see the [Advanced Memory Tuning]
+section of this guide.
+
+[Spark documentation]: https://spark.apache.org/docs/latest/configuration.html
 
 ### Configuring Comet Memory in On-Heap Mode
 
+```{warning}
+Support for on-heap memory pools is deprecated and will be removed from a future release.
+```
+
+Comet is disabled by default in on-heap mode, but can be enabled by setting `spark.comet.exec.onheap.enabled=true`.
+
 When running in on-heap mode, Comet memory can be allocated by setting `spark.comet.memoryOverhead`. If this setting
 is not provided, it will be calculated by multiplying the current Spark executor memory by
 `spark.comet.memory.overhead.factor` (default value is `0.2`) which may or may not result in enough memory for
@@ -59,10 +68,13 @@ Comet supports native shuffle and columnar shuffle (these terms are explained in
 In on-heap mode, columnar shuffle memory must be separately allocated using `spark.comet.columnar.shuffle.memorySize`.
 If this setting is not provided, it will be calculated by multiplying `spark.comet.memoryOverhead` by
 `spark.comet.columnar.shuffle.memory.factor` (default value is `1.0`). If a shuffle exceeds this amount of memory
-then the query will fail.
+then the query will fail. For full details on configuring Comet memory in on-heap mode, see the [Advanced Memory Tuning]
+section of this guide.
 
 [shuffle]: #shuffle
 
+[Advanced Memory Tuning]: #advanced-memory-tuning
+
 ### Determining How Much Memory to Allocate
 
 Generally, increasing the amount of memory allocated to Comet will improve query performance by reducing the
@@ -102,14 +114,6 @@ Workarounds for this problem include:
 
 ## Advanced Memory Tuning
 
-### Configuring spark.executor.memoryOverhead in On-Heap Mode
-
-In some environments, such as Kubernetes and YARN, it is important to correctly set `spark.executor.memoryOverhead` so
-that it is possible to allocate off-heap memory when running in on-heap mode.
-
-Comet will automatically set `spark.executor.memoryOverhead` based on the `spark.comet.memory*` settings so that
-resource managers respect Apache Spark memory configuration before starting the containers.
-
 ### Configuring Off-Heap Memory Pools
 
 Comet implements multiple memory pool implementations. The type of pool can be specified with `spark.comet.exec.memoryPool`.
@@ -132,6 +136,10 @@ when there is sufficient memory in order to leave enough memory for other operat
 
 ### Configuring On-Heap Memory Pools
 
+```{warning}
+Support for on-heap memory pools is deprecated and will be removed from a future release.
+```
+
 When running in on-heap mode, Comet will use its own dedicated memory pools that are not shared with Spark.
 
 The type of pool can be specified with `spark.comet.exec.memoryPool`. The default setting is `greedy_task_shared`.
@@ -172,6 +180,14 @@ adjusting how much memory to allocate.
 [FairSpillPool]: https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.FairSpillPool.html
 [UnboundedMemoryPool]: https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.UnboundedMemoryPool.html
 
+### Configuring spark.executor.memoryOverhead in On-Heap Mode
+
+In some environments, such as Kubernetes and YARN, it is important to correctly set `spark.executor.memoryOverhead` so
+that it is possible to allocate off-heap memory when running in on-heap mode.
+
+Comet will automatically set `spark.executor.memoryOverhead` based on the `spark.comet.memory*` settings so that
+resource managers respect Apache Spark memory configuration before starting the containers.
+
 ## Optimizing Joins
 
 Spark often chooses `SortMergeJoin` over `ShuffledHashJoin` for stability reasons. If the build-side of a
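
The on-heap defaults described in the diff above come down to two multiplications. The following is a minimal Python sketch of that arithmetic, not Comet's actual implementation: the function name `comet_on_heap_defaults` and its parameters are hypothetical, sizes are assumed to be in megabytes, and any minimum values or rounding that Comet may apply are ignored.

```python
from typing import Optional, Tuple


def comet_on_heap_defaults(
    executor_memory_mb: float,
    overhead_factor: float = 0.2,   # documented default of spark.comet.memory.overhead.factor
    shuffle_factor: float = 1.0,    # documented default of spark.comet.columnar.shuffle.memory.factor
    memory_overhead_mb: Optional[float] = None,  # explicit spark.comet.memoryOverhead, if set
    shuffle_memory_mb: Optional[float] = None,   # explicit spark.comet.columnar.shuffle.memorySize, if set
) -> Tuple[float, float]:
    """Return (Comet memory overhead, columnar shuffle memory) in MB.

    Hypothetical helper: it only mirrors the fallback rules stated in the
    guide and ignores any minimums or rounding Comet may apply.
    """
    # Fallback 1: executor memory multiplied by the overhead factor.
    if memory_overhead_mb is None:
        memory_overhead_mb = executor_memory_mb * overhead_factor
    # Fallback 2: Comet memory overhead multiplied by the shuffle factor.
    if shuffle_memory_mb is None:
        shuffle_memory_mb = memory_overhead_mb * shuffle_factor
    return memory_overhead_mb, shuffle_memory_mb


if __name__ == "__main__":
    overhead, shuffle = comet_on_heap_defaults(16 * 1024)
    print(f"Comet memory overhead: {overhead:.1f} MB")
    print(f"Columnar shuffle memory: {shuffle:.1f} MB")
```

Passing an explicit `memory_overhead_mb` or `shuffle_memory_mb` corresponds to setting the configuration key directly, in which case the matching factor is not used.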