articles/hdinsight/hdinsight-hadoop-provision-linux-clusters.md
11 additions & 13 deletions
@@ -1,6 +1,6 @@
---
title: Set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more
-description: Set up Hadoop, Kafka, Spark, HBase, or Storm clusters for HDInsight from a browser, the Azure classic CLI, Azure PowerShell, REST, or SDK.
+description: Set up Hadoop, Kafka, Spark, or HBase clusters for HDInsight from a browser, the Azure classic CLI, Azure PowerShell, REST, or SDK.
-Learn how to set up and configure Apache Hadoop, Apache Spark, Apache Kafka, Interactive Query, Apache HBase, or Apache Storm in HDInsight. Also, learn how to customize clusters and add security by joining them to a domain.
+Learn how to set up and configure Apache Hadoop, Apache Spark, Apache Kafka, Interactive Query, or Apache HBase in HDInsight. Also, learn how to customize clusters and add security by joining them to a domain.
A Hadoop cluster consists of several virtual machines (nodes) that are used for distributed processing of tasks. Azure HDInsight handles implementation details of installation and configuration of individual nodes, so you only have to provide general configuration information.
@@ -64,7 +64,7 @@ You don't need to specify the cluster location explicitly: The cluster is in the
Azure HDInsight currently provides the following cluster types, each with a set of components to provide certain functionalities.
> [!IMPORTANT]
-> HDInsight clusters are available in various types, each for a single workload or technology. There is no supported method to create a cluster that combines multiple types, such as Storm and HBase on one cluster. If your solution requires technologies that are spread across multiple HDInsight cluster types, an [Azure virtual network](../virtual-network/index.yml) can connect the required cluster types.
+> HDInsight clusters are available in various types, each for a single workload or technology. There is no supported method to create a cluster that combines multiple types, such as Spark and HBase on one cluster. If your solution requires technologies that are spread across multiple HDInsight cluster types, an [Azure virtual network](../virtual-network/index.yml) can connect the required cluster types.
| Cluster type | Functionality |
| --- | --- |
@@ -73,7 +73,6 @@ Azure HDInsight currently provides the following cluster types, each with a set
|[Interactive Query](./interactive-query/apache-interactive-query-get-started.md)|In-memory caching for interactive and faster Hive queries |
|[Kafka](kafka/apache-kafka-introduction.md)| A distributed streaming platform that can be used to build real-time streaming data pipelines and applications |
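
For orientation, the cluster type chosen from this table maps to a single parameter when a cluster is created programmatically. Below is a minimal sketch using the Azure CLI; it is not taken from this article, and every resource name, password, and storage account in it is a placeholder.

```azurecli
# Minimal sketch (placeholder names and secrets): create a cluster of type Spark.
az hdinsight create \
    --name contoso-spark \
    --resource-group contoso-rg \
    --type spark \
    --http-user admin \
    --http-password 'ClusterLoginPassword1!' \
    --ssh-user sshuser \
    --ssh-password 'SshPassword1!' \
    --workernode-count 4 \
    --storage-account contosohdistorage
```

Swapping the `--type` value (for example `hadoop`, `hbase`, or `kafka`) selects a different row of the table.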
@@ -244,7 +243,7 @@ For more information, see [Sizes for virtual machines](../virtual-machines/sizes
### Disk attachment
-On each of the **NodeManager** machines, **LocalResources** are ultimately localized in the following target directories.
+On each of the **NodeManager** machines, **LocalResources** are ultimately localized in the target directories.
With the normal configuration, only the default disk is added as the local disk in NodeManager. For large applications, this disk space may not be enough, which can result in job failure.
@@ -254,7 +253,7 @@ You can add a number of disks per VM, and each disk will be 1 TB in size.
1. From the **Configuration + pricing** tab
1. Select the **Enable managed disk** option
-1. From **Standard disks**, Enter the **Numbet of disks**
+1. From **Standard disks**, enter the **Number of disks**
1. Choose your **Worker node**
You can verify the number of disks from the **Review + create** tab, under **Cluster configuration**.
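
The portal steps above also have a command-line counterpart. The following is a hedged sketch, assuming the `--workernode-data-disks-per-node` parameter of `az hdinsight create` is what requests the managed disks described here; every name and secret is a placeholder.

```azurecli
# Hedged sketch: request two managed data disks on each worker node at creation time.
# --workernode-data-disks-per-node is assumed here; confirm it with `az hdinsight create --help`.
az hdinsight create \
    --name contoso-hadoop \
    --resource-group contoso-rg \
    --type hadoop \
    --http-user admin \
    --http-password 'ClusterLoginPassword1!' \
    --ssh-user sshuser \
    --ssh-password 'SshPassword1!' \
    --workernode-count 4 \
    --workernode-data-disks-per-node 2 \
    --storage-account contosohdistorage
```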
@@ -287,12 +286,11 @@ Sometimes, you want to configure the following configuration files during the cr
* hive-env.xml
* hive-site.xml
* mapred-site
-* oozie-site.xml
-* oozie-env.xml
-* storm-site.xml
-* tez-site.xml
-* webhcat-site.xml
-* yarn-site.xml
+* oozie-site.xml
+* oozie-env.xml
+* tez-site.xml
+* webhcat-site.xml
+* yarn-site.xml
For more information, see [Customize HDInsight clusters using Bootstrap](hdinsight-hadoop-customize-cluster-bootstrap.md).
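
As a rough illustration of that customization path, the sketch below assumes the `--cluster-configurations` parameter of `az hdinsight create` accepts a JSON map keyed by the configuration file names listed above. The property names and values are only examples, and all resource names and secrets are placeholders; verify the exact syntax against the linked article.

```azurecli
# Hedged sketch: pass per-file settings (hive-site, mapred-site, ...) at creation time.
# The JSON shape and the --cluster-configurations parameter are assumptions, not confirmed by this article.
cat > configurations.json <<'EOF'
{
  "hive-site": { "hive.metastore.client.socket.timeout": "90s" },
  "mapred-site": { "mapreduce.map.memory.mb": "4096" }
}
EOF

az hdinsight create \
    --name contoso-hadoop \
    --resource-group contoso-rg \
    --type hadoop \
    --http-user admin \
    --http-password 'ClusterLoginPassword1!' \
    --ssh-user sshuser \
    --ssh-password 'SshPassword1!' \
    --storage-account contosohdistorage \
    --cluster-configurations @configurations.json
```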