
Commit f3e92f7

Merge pull request #219396 from sreekzz/patch-127
Removed HDI 3.6 mentions as it is retired
2 parents 7e9918f + a85a614 commit f3e92f7

File tree

1 file changed: +22 -22 lines changed


articles/hdinsight/interactive-query/hive-llap-sizing-guide.md

Lines changed: 22 additions & 22 deletions
@@ -5,7 +5,7 @@ ms.service: hdinsight
55
ms.topic: troubleshooting
66
author: reachnijel
77
ms.author: nijelsf
8-
ms.date: 07/19/2022
8+
ms.date: 11/23/2022
99
---
1010

1111
# Azure HDInsight Interactive Query Cluster (Hive LLAP) sizing guide
@@ -17,7 +17,7 @@ specific tuning.
1717

1818
| Node Type | Instance | Size |
1919
| :--- | :----: | :--- |
20-
| Head | D13 v2 | 8 vcpus, 56 GB RAM, 400 GB SSD |
20+
| Head | D13 v2 | 8 vcpus, 56-GB RAM, 400 GB SSD |
2121
| Worker | **D14 v2** | **16 vcpus, 112 GB RAM, 800 GB SSD** |
2222
| ZooKeeper | A4 v2 | 4 vcpus, 8-GB RAM, 40 GB SSD |
2323

@@ -30,7 +30,7 @@ specific tuning.
3030
| yarn.scheduler.maximum-allocation-mb | 102400 (MB) | The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this value won't take effect |
3131
| yarn.scheduler.maximum-allocation-vcores | 12 |The maximum number of CPU cores for every container request at the Resource Manager. Requests higher than this value won't take effect. |
3232
| yarn.nodemanager.resource.cpu-vcores | 12 | Number of CPU cores per NodeManager that can be allocated for containers. |
33-
| yarn.scheduler.capacity.root.llap.capacity | 85 (%) | YARN capacity allocation for llap queue |
33+
| yarn.scheduler.capacity.root.llap.capacity | 85 (%) | YARN capacity allocation for LLAP queue |
3434
| tez.am.resource.memory.mb | 4096 (MB) | The amount of memory in MB to be used by the Tez AppMaster |
3535
| hive.server2.tez.sessions.per.default.queue | <number_of_worker_nodes> | The number of sessions for each queue named in the hive.server2.tez.default.queues. This number corresponds to the number of query coordinators (Tez AMs) |
3636
| hive.tez.container.size | 4096 (MB) | Specified Tez container size in MB |
@@ -70,7 +70,7 @@ For D14 v2, the recommended value is **12**.
7070
#### **4. Number of concurrent queries**
7171
Configuration: ***hive.server2.tez.sessions.per.default.queue***
7272

73-
This configuration value determines the number of Tez sessions that can be launched in parallel. These Tez sessions will be launched for each of the queues specified by "hive.server2.tez.default.queues". It corresponds to the number of Tez AMs (Query Coordinators). It's recommended to be the same as the number of worker nodes. The number of Tez AMs can be higher than the number of LLAP daemon nodes. The Tez AM's primary responsibility is to coordinate the query execution and assign query plan fragments to corresponding LLAP daemons for execution. Keep this value as multiple of a number of LLAP daemon nodes to achieve higher throughput.
73+
This configuration value determines the number of Tez sessions that can be launched in parallel. These Tez sessions will be launched for each of the queues specified by "hive.server2.tez.default.queues". It corresponds to the number of Tez AMs (Query Coordinators). It's recommended to be the same as the number of worker nodes. The number of Tez AMs can be higher than the number of LLAP daemon nodes. The Tez AM's primary responsibility is to coordinate the query execution and assign query plan fragments to corresponding LLAP daemons for execution. Keep this value as a multiple of the number of LLAP daemon nodes to achieve higher throughput.
7474

7575
The default HDInsight cluster has four LLAP daemons running on four worker nodes, so the recommended value is **4**.
7676
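As an illustration only (not part of the article or of HDInsight), a minimal Python sketch of the guidance above; the function name and inputs are assumptions:

```python
# Minimal sketch, assuming the guidance above: pick the number of concurrent
# Tez sessions (query coordinators) as a multiple of the LLAP daemon
# (worker) node count. Names are illustrative, not an HDInsight API.

def recommended_tez_sessions(llap_daemon_nodes: int, multiple: int = 1) -> int:
    """Candidate value for hive.server2.tez.sessions.per.default.queue."""
    return llap_daemon_nodes * multiple

print(recommended_tez_sessions(4))     # 4 (default cluster: four worker nodes)
print(recommended_tez_sessions(4, 2))  # 8 (larger multiple for higher throughput)
```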

@@ -90,9 +90,9 @@ The recommended value is **4096 MB**.
9090
#### **6. LLAP Queue capacity allocation**
9191
Configuration: ***yarn.scheduler.capacity.root.llap.capacity***
9292

93-
This value indicates a percentage of capacity given to llap queue. The capacity allocations may have different values for different workloads depending on how the YARN queues are configured. If your workload is read-only operations, then setting it as high as 90% of the capacity should work. However, if your workload is mix of update/delete/merge operations using managed tables, it's recommended to give 85% of the capacity for llap queue. The remaining 15% capacity can be used by other tasks such as compaction etc. to allocate containers from default queue. That way tasks in default queue won't deprive of YARN resources.
93+
This value indicates the percentage of capacity given to the LLAP queue. The capacity allocations may have different values for different workloads depending on how the YARN queues are configured. If your workload is read-only operations, then setting it as high as 90% of the capacity should work. However, if your workload is a mix of update/delete/merge operations using managed tables, it's recommended to give 85% of the capacity to the LLAP queue. The remaining 15% capacity can be used by other tasks, such as compaction, to allocate containers from the default queue. That way, tasks in the default queue won't be deprived of YARN resources.
9494

95-
For D14v2 worker nodes, the recommended value for llap queue is **85**.
95+
For D14v2 worker nodes, the recommended value for LLAP queue is **85**.
9696
(For read-only workloads, it can be increased up to 90 as suitable.)
9797

9898
#### **7. LLAP daemon container size**
@@ -104,7 +104,7 @@ LLAP daemon is run as a YARN container on each worker node. The total memory siz
104104
* Total memory configured for all containers on a node and LLAP queue capacity
105105

106106
Memory needed by Tez Application Masters (Tez AMs) can be calculated as follows.
107-
Tez AM acts as a query coordinator and the number of Tez AMs should be configured based on a number of concurrent queries to be served. Theoretically, we can consider one Tez AM per worker node. However, its possible that you may see more than one Tez AM on a worker node. For calculation purpose, we assume uniform distribution of Tez AMs across all LLAP daemon nodes/worker nodes.
107+
The Tez AM acts as a query coordinator, and the number of Tez AMs should be configured based on the number of concurrent queries to be served. Theoretically, we can consider one Tez AM per worker node. However, it's possible that you may see more than one Tez AM on a worker node. For calculation purposes, we assume uniform distribution of Tez AMs across all LLAP daemon nodes/worker nodes.
108108
It's recommended to have 4 GB of memory per Tez AM.
109109

110110
Number of Tez AMs = the value specified by the Hive config ***hive.server2.tez.sessions.per.default.queue***.
@@ -116,27 +116,27 @@ For D14 v2, the default configuration has four Tez AMs and four LLAP daemon node
116116
Tez AM memory per node = (ceil(4/4) x 4 GB) = 4 GB
117117

118118
Total memory available for the LLAP queue per worker node can be calculated as follows:
119-
This value depends on the total amount of memory available for all YARN containers on a node(*yarn.nodemanager.resource.memory-mb*) and the percentage of capacity configured for llap queue(*yarn.scheduler.capacity.root.llap.capacity*).
120-
Total memory for LLAP queue on worker node = Total memory available for all YARN containers on a node x Percentage of capacity for llap queue.
119+
This value depends on the total amount of memory available for all YARN containers on a node (*yarn.nodemanager.resource.memory-mb*) and the percentage of capacity configured for the LLAP queue (*yarn.scheduler.capacity.root.llap.capacity*).
120+
Total memory for LLAP queue on worker node = Total memory available for all YARN containers on a node x Percentage of capacity for LLAP queue.
121121
For D14 v2, this value is (100 GB x 0.85) = 85 GB.
122122

123123
The LLAP daemon container size is calculated as follows:
124124

125125
**LLAP daemon container size = (Total memory for LLAP queue on a worker node) - (Tez AM memory per node) - (Service Master container size)**
126126
There is only one Service Master (Application Master for the LLAP service) on the cluster, spawned on one of the worker nodes. For calculation purposes, we consider one Service Master per worker node.
127127
For a D14 v2 worker node on HDI 4.0, the recommended value is (85 GB - 4 GB - 1 GB) = **80 GB**
128-
(For HDI 3.6, recommended value is **79 GB** because you should reserve additional ~2 GB for slider AM.)
128+
129129
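To make the arithmetic above concrete, here is a minimal Python sketch (not part of the article) that reproduces the D14 v2 / HDI 4.0 example; all variable names and the per-node Service Master assumption are illustrative:

```python
import math

# Minimal sketch of the LLAP daemon container size arithmetic above, using
# the D14 v2 / HDI 4.0 example values. Variable names are illustrative.

yarn_mem_per_node_gb = 100    # yarn.nodemanager.resource.memory-mb, expressed in GB
llap_queue_capacity = 0.85    # yarn.scheduler.capacity.root.llap.capacity
tez_sessions = 4              # hive.server2.tez.sessions.per.default.queue
worker_nodes = 4
tez_am_size_gb = 4            # tez.am.resource.memory.mb, expressed in GB
service_master_gb = 1         # one Service Master, assumed per node for calculation

tez_am_mem_per_node_gb = math.ceil(tez_sessions / worker_nodes) * tez_am_size_gb  # 4 GB
llap_queue_mem_per_node_gb = int(yarn_mem_per_node_gb * llap_queue_capacity)      # 85 GB

llap_daemon_container_gb = (llap_queue_mem_per_node_gb
                            - tez_am_mem_per_node_gb
                            - service_master_gb)
print(llap_daemon_container_gb)  # 80 GB for D14 v2 on HDI 4.0
```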

130130
#### **8. Determining number of executors per LLAP daemon**
131131
Configuration: ***hive.llap.daemon.num.executors***, ***hive.llap.io.threadpool.size***
132132

133133
***hive.llap.daemon.num.executors***:
134134
This configuration controls the number of executors that can execute tasks in parallel per LLAP daemon. This value depends on the number of vcores, the amount of memory used per executor, and the amount of total memory available for LLAP daemon container. The number of executors can be oversubscribed to 120% of available vcores per worker node. However, it should be adjusted if it doesn't meet the memory requirements based on memory needed per executor and the LLAP daemon container size.
135135

136-
Each executor is equivalent to a Tez container and can consume 4GB(Tez container size) of memory. All executors in LLAP daemon share the same heap memory. With the assumption that not all executors run memory intensive operations at the same time, you can consider 75% of Tez container size(4 GB) per executor. This way you can increase the number of executors by giving each executor less memory (e.g. 3 GB) for increased parallelism. However, it is recommended to tune this setting for your target workload.
136+
Each executor is equivalent to a Tez container and can consume 4 GB (the Tez container size) of memory. All executors in the LLAP daemon share the same heap memory. With the assumption that not all executors run memory-intensive operations at the same time, you can consider 75% of the Tez container size (4 GB) per executor. This way you can increase the number of executors by giving each executor less memory (for example, 3 GB) for increased parallelism. However, it's recommended to tune this setting for your target workload.
137137

138138
There are 16 vcores on D14 v2 VMs.
139-
For D14 v2, the recommended value for num of executors is (16 vcores x 120%) ~= **19** on each worker node considering 3GB per executor.
139+
For D14 v2, the recommended number of executors is (16 vcores x 120%) ~= **19** on each worker node, considering 3 GB per executor.
140140

141141
***hive.llap.io.threadpool.size***:
142142
This value specifies the thread pool size for executors. Since the number of executors is fixed as specified, it will be the same as the number of executors per LLAP daemon.
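As a rough aid (not part of the article), a minimal Python sketch of the executor-count calculation above; names are illustrative:

```python
import math

# Minimal sketch of the executor-count guidance above for D14 v2.
# Names are illustrative; 120% is the oversubscription limit from the text.

vcores_per_node = 16
oversubscription = 1.2

num_executors = math.floor(vcores_per_node * oversubscription)  # hive.llap.daemon.num.executors
io_threadpool_size = num_executors                              # hive.llap.io.threadpool.size

print(num_executors, io_threadpool_size)  # 19 19
```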
@@ -171,21 +171,21 @@ Setting *hive.llap.io.allocator.mmap* = true will enable SSD caching.
171171
When SSD cache is enabled, some portion of the memory will be used to store metadata for the SSD cache. The metadata is stored in memory and it's expected to be ~8% of SSD cache size.
172172
SSD Cache in-memory metadata size = LLAP daemon container size - (Head room + Heap size)
173173
For D14 v2, with HDI 4.0, SSD cache in-memory metadata size = 80 GB - (4 GB + 57 GB) = **19 GB**
174-
For D14 v2, with HDI 3.6, SSD cache in-memory metadata size = 79 GB - (4 GB + 57 GB) = **18 GB**
174+
175175

176176
Given the size of available memory for storing SSD cache metadata, we can calculate the size of SSD cache that can be supported.
177177
Size of in-memory metadata for SSD cache = LLAP daemon container size - (Head room + Heap size)
178178
= 19 GB
179179
Size of SSD cache = size of in-memory metadata for SSD cache (19 GB) / 0.08 (8 percent)
180180

181181
For D14 v2 and HDI 4.0, the recommended SSD cache size = 19 GB / 0.08 ~= **237 GB**
182-
For D14 v2 and HDI 3.6, the recommended SSD cache size = 18 GB / 0.08 ~= **225 GB**
182+
183183
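The SSD cache sizing above can also be expressed as a minimal Python sketch (not part of the article); variable names are illustrative and the ~8% metadata ratio comes from the text:

```python
# Minimal sketch of the SSD cache sizing above for D14 v2 on HDI 4.0.
# Names are illustrative; the ~8% metadata ratio comes from the text.

llap_daemon_container_gb = 80
head_room_gb = 4
heap_size_gb = 57
metadata_ratio = 0.08  # SSD cache metadata held in memory, ~8% of cache size

ssd_cache_metadata_gb = llap_daemon_container_gb - (head_room_gb + heap_size_gb)  # 19 GB
ssd_cache_size_gb = ssd_cache_metadata_gb / metadata_ratio                        # ~237 GB

print(ssd_cache_metadata_gb, int(ssd_cache_size_gb))  # 19 237
```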

184184
#### **10. Adjusting Map Join memory**
185185
Configuration: ***hive.auto.convert.join.noconditionaltask.size***
186186

187187
Make sure you have *hive.auto.convert.join.noconditionaltask* enabled for this parameter to take effect.
188-
This configuration determine the threshold for MapJoin selection by Hive optimizer that considers oversubscription of memory from other executors to have more room for in-memory hash tables to allow more map join conversions. Considering 3GB per executor, this size can be oversubscribed to 3GB, but some heap memory may also be used for sort buffers, shuffle buffers, etc. by the other operations.
188+
This configuration determines the threshold for MapJoin selection by the Hive optimizer. It considers oversubscription of memory from other executors to give in-memory hash tables more room, which allows more map join conversions. Considering 3 GB per executor, this size can be oversubscribed up to 3 GB, but some heap memory may also be used by other operations for sort buffers, shuffle buffers, and so on.
189189
So for D14 v2, with 3 GB memory per executor, it's recommended to set this value to **2048 MB**.
190190

191191
(Note: This value may need adjustments suitable for your workload. Setting this value too low may not use the autoconvert feature, and setting it too high may result in out-of-memory exceptions or GC pauses that can adversely affect performance.)
@@ -196,7 +196,6 @@ Ambari environment variables: ***num_llap_nodes, num_llap_nodes_for_llap_daemons
196196
**num_llap_nodes** - specifies the number of nodes used by the Hive LLAP service; this includes nodes running the LLAP daemon, the LLAP Service Master, and the Tez Application Master (Tez AM).
197197

198198
:::image type="content" source="./media/hive-llap-sizing-guide/LLAP_sizing_guide_num_llap_nodes.png " alt-text="`Number of Nodes for LLAP service`" border="true":::
199-
200199
**num_llap_nodes_for_llap_daemons** - specifies the number of nodes used only for LLAP daemons. LLAP daemon container sizes are set to the maximum that fits a node, so there will be one LLAP daemon on each node.
201200

202201
:::image type="content" source="./media/hive-llap-sizing-guide/LLAP_sizing_guide_num_llap_nodes_for_llap_daemons.png " alt-text="`Number of Nodes for LLAP daemons`" border="true":::
@@ -206,19 +205,20 @@ It's recommended to keep both values same as number of worker nodes in Interacti
206205
### **Considerations for Workload Management**
207206
If you want to enable workload management for LLAP, make sure you reserve enough capacity for workload management to function as expected. Workload management requires configuration of a custom YARN queue, in addition to the `llap` queue. Make sure you divide the total cluster resource capacity between the `llap` queue and the workload management queue in accordance with your workload requirements.
208207
Workload management spawns Tez Application Masters (Tez AMs) when a resource plan is activated.
209-
Please note:
208+
209+
**Note:**
210210

211211
* Tez AMs spawned by activating a resource plan consume resources from the workload management queue as specified by `hive.server2.tez.interactive.queue`.
212212
* The number of Tez AMs depends on the value of `QUERY_PARALLELISM` specified in the resource plan.
213-
* Once the workload management is active, Tez AMs in llap queue will not used. Only Tez AMs from workload management queue are used for query coordination. Tez AMs in the `llap` queue are used when workload management is disabled.
213+
* Once workload management is active, Tez AMs in the LLAP queue are not used. Only Tez AMs from the workload management queue are used for query coordination. Tez AMs in the `llap` queue are used when workload management is disabled.
214214

215215
For example:
216-
Total cluster capacity = 100 GB memory, divided between LLAP, Workload Management, and Default queues as follows:
217-
- llap queue capacity = 70 GB
216+
Total cluster capacity = 100 GB of memory, divided between the LLAP, workload management, and default queues as follows:
217+
- LLAP queue capacity = 70 GB
218218
- Workload management queue capacity = 20 GB
219219
- Default queue capacity = 10 GB
220220

221-
With 20 GB in workload management queue capacity, a resource plan can specify `QUERY_PARALLELISM` value as five, which means workload management can launch five Tez AMs with 4 GB container size each. If `QUERY_PARALLELISM` is higher than the capacity, you may see some Tez AMs stop responding in `ACCEPTED` state. The Hiveserver2 Interactive cannot submit query fragments to the Tez AMs that are not in `RUNNING` state.
221+
With 20 GB of workload management queue capacity, a resource plan can specify a `QUERY_PARALLELISM` value of five, which means workload management can launch five Tez AMs with a 4 GB container size each. If `QUERY_PARALLELISM` is higher than the capacity allows, some Tez AMs may stop responding in the `ACCEPTED` state. HiveServer2 Interactive can't submit query fragments to Tez AMs that aren't in the `RUNNING` state.
222222
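A minimal Python sketch of the example above (not part of the article); names are illustrative and the capacities mirror the example split:

```python
# Minimal sketch of the workload management example above.
# Names are illustrative; capacities mirror the example split.

wm_queue_capacity_gb = 20   # workload management queue capacity
tez_am_container_gb = 4     # container size per Tez AM

max_query_parallelism = wm_queue_capacity_gb // tez_am_container_gb
print(max_query_parallelism)  # 5 -> QUERY_PARALLELISM should not exceed this
```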

223223

224224
#### **Next Steps**
@@ -228,7 +228,7 @@ If setting these values didn't resolve your issue, visit one of the following...
228228

229229
* Connect with [@AzureSupport](https://twitter.com/azuresupport) - the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.
230230

231-
* If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, please review [How to create an Azure support request](../../azure-portal/supportability/how-to-create-azure-support-request.md). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
231+
* If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, review [How to create an Azure support request](../../azure-portal/supportability/how-to-create-azure-support-request.md). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
232232

233233
* ##### **Other References:**
234234
* [Configure other LLAP properties](https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/performance-tuning/content/hive_setup_llap.html)
