articles/hdinsight/hdinsight-changing-configs-via-ambari.md
+8 -8 (8 additions & 8 deletions)
@@ -6,13 +6,13 @@ ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
ms.topic: conceptual
-ms.custom: hdinsightactive
-ms.date: 04/16/2020
+ms.custom: hdinsightactive,seoapr2020
+ms.date: 04/28/2020
---
# Use Apache Ambari to optimize HDInsight cluster configurations
-HDInsight provides [Apache Hadoop](./hadoop/apache-hadoop-introduction.md) clusters for large-scale data processing applications. Managing, monitoring, and optimizing these complex multi-node clusters can be challenging. [Apache Ambari](https://ambari.apache.org/) is a web interface to manage and monitor HDInsight Linux clusters. For Windows clusters, use the [Ambari REST API](hdinsight-hadoop-manage-ambari-rest-api.md).
+HDInsight provides Apache Hadoop clusters for large-scale data processing applications. Managing, monitoring, and optimizing these complex multi-node clusters can be challenging. Apache Ambari is a web interface to manage and monitor HDInsight Linux clusters. For Windows clusters, use the [Ambari REST API](hdinsight-hadoop-manage-ambari-rest-api.md).
For an introduction to using the Ambari Web UI, see [Manage HDInsight clusters by using the Apache Ambari Web UI](hdinsight-hadoop-manage-ambari.md)
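The [Ambari REST API](hdinsight-hadoop-manage-ambari-rest-api.md) mentioned above is also handy for scripted, read-only checks of cluster configuration. As a minimal sketch, assuming the standard HDInsight Ambari endpoint (the cluster name and admin credentials below are placeholders), the following Python code lists the configuration tag Ambari currently treats as active for each configuration type:

```python
import requests
from requests.auth import HTTPBasicAuth

# Placeholder values -- substitute your own cluster name and Ambari admin credentials.
CLUSTER = "mycluster"
BASE_URL = f"https://{CLUSTER}.azurehdinsight.net/api/v1/clusters/{CLUSTER}"
AUTH = HTTPBasicAuth("admin", "password")

# Ask Ambari for the currently active (desired) configuration tag of every config type.
resp = requests.get(f"{BASE_URL}?fields=Clusters/desired_configs", auth=AUTH)
resp.raise_for_status()

desired_configs = resp.json()["Clusters"]["desired_configs"]
for config_type, info in sorted(desired_configs.items()):
    print(f"{config_type}: tag={info['tag']}")
```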
@@ -63,7 +63,7 @@ The following sections describe configuration options for optimizing overall Apa
### Set the Hive execution engine
-Hive provides two execution engines: [Apache Hadoop MapReduce](https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html)and [Apache TEZ](https://tez.apache.org/). Tez is faster than MapReduce. HDInsight Linux clusters have Tez as the default execution engine. To change the execution engine:
+Hive provides two execution engines: Apache Hadoop MapReduce and Apache TEZ. Tez is faster than MapReduce. HDInsight Linux clusters have Tez as the default execution engine. To change the execution engine:
1. In the Hive **Configs** tab, type **execution engine** in the filter box.
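The filter step above locates `hive.execution.engine` in the `hive-site` configuration. If you want to confirm the active value from outside the UI, here is a hedged Python sketch against the Ambari REST API (cluster name and credentials are again placeholders); it only reads the current `hive-site` version and does not change it:

```python
import requests
from requests.auth import HTTPBasicAuth

# Placeholder values -- substitute your own cluster name and Ambari admin credentials.
CLUSTER = "mycluster"
BASE_URL = f"https://{CLUSTER}.azurehdinsight.net/api/v1/clusters/{CLUSTER}"
AUTH = HTTPBasicAuth("admin", "password")

# Step 1: find the tag of the hive-site configuration version that is currently active.
desired = requests.get(f"{BASE_URL}?fields=Clusters/desired_configs", auth=AUTH).json()
hive_site_tag = desired["Clusters"]["desired_configs"]["hive-site"]["tag"]

# Step 2: fetch that configuration version and read the execution engine property.
cfg = requests.get(
    f"{BASE_URL}/configurations?type=hive-site&tag={hive_site_tag}", auth=AUTH
).json()
engine = cfg["items"][0]["properties"].get("hive.execution.engine", "mr")
print(f"hive.execution.engine = {engine}")  # expect 'tez' on HDInsight Linux clusters
```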
@@ -94,7 +94,7 @@ These changes affect all Tez jobs across the server. To get an optimal result,
### Tune reducers
-[Apache ORC](https://orc.apache.org/) and [Snappy](https://google.github.io/snappy/) both offer high performance. However, Hive may have too few reducers by default, causing bottlenecks.
+Apache ORC and Snappy both offer high performance. However, Hive may have too few reducers by default, causing bottlenecks.
For example, say you have an input data size of 50 GB. That data in ORC format with Snappy compression is 1 GB. Hive estimates the number of reducers needed as: (number of bytes input to mappers / `hive.exec.reducers.bytes.per.reducer`).
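To make that estimate concrete, here is a small Python sketch of the formula exactly as stated; the 256 MB value used for `hive.exec.reducers.bytes.per.reducer` is only an illustrative assumption, so check the real setting in the Hive **Configs** tab:

```python
import math

def estimate_reducers(bytes_input_to_mappers: int, bytes_per_reducer: int) -> int:
    """Hive's estimate: bytes input to mappers / hive.exec.reducers.bytes.per.reducer,
    rounded up (Hive also caps the result at hive.exec.reducers.max, not modeled here)."""
    return max(1, math.ceil(bytes_input_to_mappers / bytes_per_reducer))

# Example from the article: 50 GB of raw input shrinks to about 1 GB as ORC with Snappy,
# and it is the compressed size that the mappers actually read.
orc_snappy_bytes = 1 * 1024**3       # ~1 GB
bytes_per_reducer = 256 * 1024**2    # assumed hive.exec.reducers.bytes.per.reducer value

print(estimate_reducers(orc_snappy_bytes, bytes_per_reducer))  # -> 4 reducers
```

Because the estimate is driven by the small compressed size rather than the original 50 GB, the default setting can leave you with too few reducers, which is the bottleneck this section describes.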
@@ -281,7 +281,7 @@ Additional recommendations for optimizing the Hive execution engine:
## Apache Pig optimization
-[Apache Pig](https://pig.apache.org/) properties can be modified from the Ambari web UI to tune Pig queries. Modifying Pig properties from Ambari directly modifies the Pig properties in the `/etc/pig/2.4.2.0-258.0/pig.properties` file.
+Apache Pig properties can be modified from the Ambari web UI to tune Pig queries. Modifying Pig properties from Ambari directly modifies the Pig properties in the `/etc/pig/2.4.2.0-258.0/pig.properties` file.
1. To modify Pig properties, navigate to the Pig **Configs** tab, and then expand the **Advanced pig-properties** pane.
@@ -332,7 +332,7 @@ Pig generates temporary files during job execution. Compressing the temporary fi
* `pig.tmpfilecompression`: When true, enables temporary file compression. The default value is false.
-* `pig.tmpfilecompression.codec`: The compression codec to use for compressing the temporary files. The recommended compression codecs are [LZO](https://www.oberhumer.com/opensource/lzo/) and Snappy for lower CPU use.
+* `pig.tmpfilecompression.codec`: The compression codec to use for compressing the temporary files. The recommended compression codecs are LZO and Snappy for lower CPU use.
### Enable split combining
@@ -348,7 +348,7 @@ The number of reducers is calculated based on the parameter `pig.exec.reducers.b
## Apache HBase optimization with the Ambari web UI
-[Apache HBase](https://hbase.apache.org/) configuration is modified from the **HBase Configs** tab. The following sections describe some of the important configuration settings that affect HBase performance.
+Apache HBase configuration is modified from the **HBase Configs** tab. The following sections describe some of the important configuration settings that affect HBase performance.