Commit 492b0f1

Merge pull request #211836 from sreekzz/patch-104
Added 'Before you start page'
2 parents b017640 + d297fd9 commit 492b0f1

3 files changed: +44 -6 lines changed

articles/hdinsight/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -8,6 +8,8 @@ items:
88
items:
99
- name: What is Azure HDInsight?
1010
href: ./hdinsight-overview.md
11+
- name: Before you start
12+
href: ./hdinsight-overview-before-you-start.md
1113
- name: Tutorials
1214
expanded: true
1315
items:

articles/hdinsight/hdinsight-overview-before-you-start.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
---
2+
title: Before you start with Azure HDInsight
3+
description: Points to consider before you create a cluster in Azure HDInsight.
4+
ms.service: hdinsight
5+
ms.topic: conceptual
6+
ms.date: 09/22/2022
7+
---
8+
9+
# Consider the following points before you create a cluster
10+
11+
As a best practice, review the following points before you create a cluster.
12+
13+
## Bring your own database
14+
15+
HDInsight has two options for configuring the databases in a cluster:
16+
17+
1. Bring your own database (external)
18+
1. Default database (internal)
19+
20+
During cluster creation, the default configuration uses the internal database. After the cluster is created, you can't change the database type, so we recommend that you create and use an external database. You can create custom databases for Ambari, Hive, and Ranger.
21+
22+
For more information, see [Set up HDInsight clusters with a custom Ambari DB](/azure/hdinsight/hdinsight-custom-ambari-db.md).
23+
24+
## Keep your clusters up to date
25+
26+
To take advantage of the latest HDInsight features, we recommend regularly migrating your HDInsight clusters to the latest version. HDInsight doesn't support in-place upgrades where existing clusters are upgraded to new component versions. You need to create a new cluster with the desired components and platform version and migrate your application to use the new cluster.
27+
28+
As a best practice, we recommend that you keep your clusters updated on a regular basis.
29+
30+
HDInsight releases happen every 30 to 60 days. Move to the latest release as early as possible. The recommended maximum interval between cluster upgrades is less than six months.
31+
32+
For more information, see [Migrate HDInsight cluster to a newer version](/azure/hdinsight/hdinsight-upgrade-cluster.md).
33+
34+
## Next steps
35+
36+
* [Create Apache Hadoop cluster in HDInsight](./hadoop/apache-hadoop-linux-create-cluster-get-started-portal.md)
37+
* [Create Apache Spark cluster - Portal](./spark/apache-spark-jupyter-spark-sql-use-portal.md)
38+
* [Enterprise security in Azure HDInsight](./domain-joined/hdinsight-security-overview.md)

articles/hdinsight/hdinsight-overview.md

Lines changed: 4 additions & 6 deletions
@@ -4,13 +4,13 @@ description: An introduction to HDInsight, and the Apache Hadoop and Apache Spar
44
ms.service: hdinsight
55
ms.topic: overview
66
ms.custom: contperf-fy21q1
7-
ms.date: 07/28/2022
7+
ms.date: 09/20/2022
88
#Customer intent: As a data analyst, I want understand what is Hadoop and how it is offered in Azure HDInsight so that I can decide on using HDInsight instead of on premises clusters.
99
---
1010

1111
# What is Azure HDInsight?
1212

13-
Azure HDInsight is a managed, full-spectrum, open-source analytics service in the cloud for enterprises. With HDInsight, you can use open-source frameworks such as Hadoop, Apache Spark, Apache Hive, LLAP, Apache Kafka, and more, in your Azure environment.
13+
Azure HDInsight is a managed, full-spectrum, open-source analytics service in the cloud for enterprises. With HDInsight, you can use open-source frameworks such as Apache Spark, Apache Hive, LLAP, Apache Kafka, Hadoop, and more, in your Azure environment.
1414

1515
## What is HDInsight and the Hadoop technology stack?
1616

@@ -109,13 +109,11 @@ Familiar business intelligence (BI) tools retrieve, analyze, and report data tha
109109

110110
* [Connect Excel to Apache Hadoop with the Microsoft Hive ODBC Driver](./hadoop/apache-hadoop-connect-excel-hive-odbc-driver.md) (requires Windows)
111111

112-
113112
## In-region data residency
114113

115-
Spark, Hadoop, and LLAP don't store customer data, so these services automatically satisfy in-region data residency requirements including those specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
116-
117-
Kafka and HBase do store customer data. This data is automatically stored by Kafka and HBase in a single region, so this service satisfies in-region data residency requirements including those specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
114+
Spark, Hadoop, and LLAP don't store customer data, so these services automatically satisfy in-region data residency requirements specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
118115

116+
Kafka and HBase do store customer data. This data is automatically stored by Kafka and HBase in a single region, so this service satisfies in-region data residency requirements specified in the [Trust Center](https://azuredatacentermap.azurewebsites.net/).
119117

120118
Familiar business intelligence (BI) tools retrieve, analyze, and report data that is integrated with HDInsight by using either the Power Query add-in or the Microsoft Hive ODBC Driver.
121119
