description: Points to consider before you create a cluster in Azure HDInsight.
4
+
ms.service: hdinsight
5
+
ms.topic: conceptual
6
+
ms.date: 09/22/2022
7
+
---
8
+
9
+
# Consider the following points before you create a cluster
10
+
11
+
As a best practice, consider the following points before you create a cluster.
12
+
13
+
## Bring your own database
14
+
15
+
HDInsight has two options for configuring the databases in a cluster:
16
+
17
+
1. Bring your own database (external)
18
+
1. Default database (internal)
19
+
20
+
During cluster creation, the default configuration uses the internal database. After the cluster is created, you can't change the database type, so we recommend that you create and use an external database. You can create custom databases for Ambari, Hive, and Ranger.
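As a rough illustration of what an external Hive database involves, the following sketch builds the standard Hive metastore JDBC properties. It assumes the external database is an Azure SQL Database reachable over JDBC. The helper name `hive_metastore_properties`, the server name, the database name, and the credentials are all hypothetical, and the sketch doesn't show how the values are supplied to HDInsight during cluster creation.

```python
def hive_metastore_properties(server: str, database: str,
                              user: str, password: str) -> dict:
    """Build the standard Hive metastore JDBC properties (hypothetical helper)."""
    # Assumption: SQL Server JDBC URL on the default port 1433 with encryption on.
    url = (
        f"jdbc:sqlserver://{server}:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false"
    )
    return {
        "javax.jdo.option.ConnectionDriverName":
            "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "javax.jdo.option.ConnectionURL": url,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


# Hypothetical values; in practice, read the password from a secret store.
props = hive_metastore_properties(
    server="contoso-sql.database.windows.net",
    database="hivemetastore",
    user="hiveadmin",
    password="<placeholder>",
)
for key in sorted(props):
    # Print the keys only, so the placeholder secret never reaches the output.
    print(key)
```

Because the database type can't be changed later, it helps to have these values prepared and validated before the cluster is created.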
21
+
22
+
For more information, see how to [Set up HDInsight clusters with a custom Ambari DB](./hdinsight-custom-ambari-db.md).
23
+
24
+
## Keep your clusters up to date
25
+
26
+
To take advantage of the latest HDInsight features, we recommend regularly migrating your HDInsight clusters to the latest version. HDInsight doesn't support in-place upgrades where existing clusters are upgraded to new component versions. You need to create a new cluster with the desired components and platform version and migrate your application to use the new cluster.
27
+
28
+
As a best practice, we recommend that you keep your clusters updated on a regular basis.
29
+
30
+
HDInsight releases happen every 30 to 60 days. Move to the latest release as early as possible. We recommend that a cluster never go more than six months without an upgrade.
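The six-month guideline above can be checked with a few lines of date arithmetic. The following is a minimal sketch, not an HDInsight API: the cluster names and creation dates are hypothetical, and "six months" is approximated as 182 days.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 182 days for this sketch.
MAX_CLUSTER_AGE = timedelta(days=182)


def needs_migration(created_on: date, today: date) -> bool:
    """Return True when a cluster is older than the recommended maximum age."""
    return today - created_on > MAX_CLUSTER_AGE


# Hypothetical inventory: cluster name -> creation date.
clusters = {
    "spark-prod": date(2022, 1, 10),
    "kafka-dev": date(2022, 8, 1),
}

today = date(2022, 9, 22)
overdue = sorted(
    name for name, created in clusters.items()
    if needs_migration(created, today)
)
print(overdue)
```

A check like this could run on a schedule so that clusters are flagged for migration before they pass the recommended age.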
31
+
32
+
For more information, see how to [Migrate HDInsight cluster to a newer version](./hdinsight-upgrade-cluster.md).
33
+
34
+
## Next steps
35
+
36
+
* [Create Apache Hadoop cluster in HDInsight](./hadoop/apache-hadoop-linux-create-cluster-get-started-portal.md)
articles/hdinsight/hdinsight-overview.md
4 additions & 6 deletions
@@ -4,13 +4,13 @@ description: An introduction to HDInsight, and the Apache Hadoop and Apache Spar
4
4
ms.service: hdinsight
5
5
ms.topic: overview
6
6
ms.custom: contperf-fy21q1
7
-
ms.date: 07/28/2022
7
+
ms.date: 09/20/2022
8
8
#Customer intent: As a data analyst, I want to understand what Hadoop is and how it's offered in Azure HDInsight so that I can decide whether to use HDInsight instead of on-premises clusters.
9
9
---
10
10
11
11
# What is Azure HDInsight?
12
12
13
-
Azure HDInsight is a managed, full-spectrum, open-source analytics service in the cloud for enterprises. With HDInsight, you can use open-source frameworks such as Hadoop, Apache Spark, Apache Hive, LLAP, Apache Kafka, and more, in your Azure environment.
13
+
Azure HDInsight is a managed, full-spectrum, open-source analytics service in the cloud for enterprises. With HDInsight, you can use open-source frameworks such as Apache Spark, Apache Hive, LLAP, Apache Kafka, Hadoop, and more in your Azure environment.
14
14
15
15
## What is HDInsight and the Hadoop technology stack?
16
16
@@ -109,13 +109,11 @@ Familiar business intelligence (BI) tools retrieve, analyze, and report data tha
109
109
110
110
* [Connect Excel to Apache Hadoop with the Microsoft Hive ODBC Driver](./hadoop/apache-hadoop-connect-excel-hive-odbc-driver.md) (requires Windows)
111
111
112
-
113
112
## In-region data residency
114
113
115
-
Spark, Hadoop, and LLAP don't store customer data, so these services automatically satisfy in-region data residency requirements including those specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
116
-
117
-
Kafka and HBase do store customer data. This data is automatically stored by Kafka and HBase in a single region, so this service satisfies in-region data residency requirements including those specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
114
+
Spark, Hadoop, and LLAP don't store customer data, so these services automatically satisfy in-region data residency requirements specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
118
115
116
+
Kafka and HBase do store customer data. They automatically store this data in a single region, so these services satisfy in-region data residency requirements specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
119
117
120
118
Familiar business intelligence (BI) tools retrieve, analyze, and report data that is integrated with HDInsight by using either the Power Query add-in or the Microsoft Hive ODBC Driver.