Commit 6be318b

Update hdinsight-faq.md
Added best practices for creating large HDInsight clusters
1 parent 722fd57 commit 6be318b

1 file changed: +8 -0 lines changed

articles/hdinsight/hdinsight-faq.md

Lines changed: 8 additions & 0 deletions
@@ -39,6 +39,14 @@ For more information, see [Capacity planning for HDInsight clusters](https://doc
 
 See [Resource types in Azure HDInsight clusters](hdinsight-virtual-network-architecture.md#resource-types-in-azure-hdinsight-clusters).
 
+### What are the best practices for creating large HDInsight clusters?
+
+1. Set up HDInsight clusters with a [Custom Ambari DB](https://docs.microsoft.com/azure/hdinsight/hdinsight-custom-ambari-db) to improve cluster scalability.
+2. Create HDInsight clusters with [Azure Data Lake Storage Gen2](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2) to take advantage of its higher bandwidth and other performance characteristics.
+3. Headnodes should be large enough to accommodate the multiple master services that run on them.
+4. Some workloads, such as Interactive Query, also need larger ZooKeeper nodes. Consider VMs with a minimum of 8 cores.
+5. For Hive and Spark, use an [external Hive metastore](https://docs.microsoft.com/azure/hdinsight/hdinsight-use-external-metadata-stores).
+
 ## Individual Components
 
 ### Can I install additional components on my cluster?
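
The five recommendations added in this commit can be read as a pre-creation checklist. The following is a minimal sketch of that idea, not part of the commit and not an Azure SDK API: `ClusterPlan`, its fields, the cluster-type labels, and `review` are all hypothetical names introduced for illustration. The FAQ gives no concrete headnode size, so the caller must supply `min_headnode_cores`; only the 8-core ZooKeeper figure comes from the added text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned cluster (not an Azure SDK type)."""
    cluster_type: str              # hypothetical labels: "hive", "spark", "interactivequery"
    storage: str                   # hypothetical labels: "adls-gen2", "blob"
    custom_ambari_db: bool
    external_hive_metastore: bool
    headnode_cores: int
    zookeeper_cores: int


# Item 4 of the FAQ entry: minimum of 8 cores for ZooKeeper nodes.
MIN_ZOOKEEPER_CORES = 8


def review(plan: ClusterPlan, min_headnode_cores: int) -> List[str]:
    """Return one finding per FAQ recommendation the plan does not follow.

    min_headnode_cores is an assumption supplied by the caller; the FAQ
    only says headnodes should be large enough for the master services.
    """
    findings = []
    if not plan.custom_ambari_db:
        findings.append("Item 1: configure a custom Ambari DB.")
    if plan.storage != "adls-gen2":
        findings.append("Item 2: use Azure Data Lake Storage Gen2.")
    if plan.headnode_cores < min_headnode_cores:
        findings.append("Item 3: choose larger headnodes.")
    if (plan.cluster_type == "interactivequery"
            and plan.zookeeper_cores < MIN_ZOOKEEPER_CORES):
        findings.append("Item 4: use ZooKeeper VMs with at least 8 cores.")
    if plan.cluster_type in ("hive", "spark") and not plan.external_hive_metastore:
        findings.append("Item 5: use an external Hive metastore.")
    return findings
```

Calling `review(plan, min_headnode_cores=8)` on a plan that follows every applicable item returns an empty list; the threshold of 8 in that call is only an example value.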
