Commit 70e992e

Merge pull request #90450 from hrasheed-msft/hdi_custom_ambaridb
HdInsight custom ambaridb
2 parents d39a632 + 59ddb84 commit 70e992e

4 files changed

+75
-2
lines changed

articles/hdinsight/TOC.yml

Lines changed: 3 additions & 1 deletion
@@ -136,7 +136,9 @@
136136
- name: Autoscale clusters
137137
href: ./hdinsight-autoscale-clusters.md
138138
- name: Use external metadata stores
139-
href: ./hdinsight-use-external-metadata-stores.md
139+
href: ./hdinsight-use-external-metadata-stores.md
140+
- name: Custom Ambari DB
141+
href: ./hdinsight-custom-ambari-db.md
140142
- name: Manage logs for an HDInsight cluster
141143
href: ./hdinsight-log-management.md
142144
- name: Add storage accounts
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
---
2+
title: Custom Apache Ambari database on Azure HDInsight
3+
description: Learn how to create HDInsight clusters with your own custom Apache Ambari database.
4+
author: hrasheed-msft
5+
ms.reviewer: jasonh
6+
ms.service: hdinsight
7+
ms.custom: hdinsightactive
8+
ms.topic: conceptual
9+
ms.date: 10/29/2019
10+
ms.author: hrasheed
11+
---
12+
# Set up HDInsight clusters with a custom Ambari DB
13+
14+
Apache Ambari simplifies the management and monitoring of an Apache Hadoop cluster. Ambari provides an easy-to-use web UI and REST API. Ambari is included on HDInsight clusters and is used to monitor the cluster and make configuration changes.
15+
16+
In normal cluster creation, as described in other articles such as [Set up clusters in HDInsight](hdinsight-hadoop-provision-linux-clusters.md), Ambari is deployed in an [S0 Azure SQL database](../sql-database/sql-database-dtu-resource-limits-single-databases.md#standard-service-tier) that is managed by HDInsight and is not accessible to users.
17+
18+
The custom Ambari DB feature allows you to deploy a new cluster and set up Ambari in an external database that you manage. The deployment is done with an Azure Resource Manager template. This feature has the following benefits:
19+
20+
- Customization - you choose the size and processing capacity of the database. If you have large clusters processing intensive workloads, an Ambari database with lower specifications could become a bottleneck for management operations.
21+
- Flexibility - you can scale the database as needed to suit your requirements.
22+
- Control - you can manage backups and security for your database in a way that fits with your organization's requirements.
23+
24+
The remainder of this article discusses the following points:
25+
26+
- requirements to use the custom Ambari DB feature
27+
- the steps necessary to provision HDInsight clusters using your own external database for Apache Ambari
28+
29+
## Custom Ambari DB requirements
30+
31+
You can deploy a custom Ambari DB with all cluster types and versions. Multiple clusters cannot use the same Ambari DB.
32+
33+
The custom Ambari DB has the following other requirements:
34+
35+
- You must have an existing Azure SQL DB server and database.
36+
- The database that you provide for Ambari setup must be empty. There should be no tables in the default dbo schema.
37+
- The user used to connect to the database should have SELECT, CREATE TABLE, and INSERT permissions on the database.
38+
- Turn on the option to [Allow access to Azure services](../sql-database/sql-database-vnet-service-endpoint-rule-overview.md#azure-portal-steps) on the Azure SQL server where you will host Ambari.
39+
- The HDInsight management IP addresses must be allowed through the SQL server firewall. See [HDInsight management IP addresses](hdinsight-management-ip-addresses.md) for the list of IP addresses to add.
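The two firewall requirements above can be sketched with the Azure CLI. This is a minimal sketch, not a definitive procedure: `<RESOURCEGROUPNAME>`, `<SQLSERVERNAME>`, and `<MANAGEMENT_IP>` are placeholders, and the rule names and the `run` and `allow_range` helpers are hypothetical. It relies on the Azure SQL convention that a server-level rule spanning `0.0.0.0` to `0.0.0.0` corresponds to the Allow access to Azure services option.

```shell
#!/usr/bin/env bash
# Sketch only: placeholder values and hypothetical rule names throughout.
RESOURCE_GROUP="${RESOURCE_GROUP:-<RESOURCEGROUPNAME>}"
SQL_SERVER="${SQL_SERVER:-<SQLSERVERNAME>}"

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Add a server-level firewall rule: allow_range <rule-name> <start-ip> <end-ip>
allow_range() {
  run az sql server firewall-rule create \
    --resource-group "$RESOURCE_GROUP" \
    --server "$SQL_SERVER" \
    --name "$1" \
    --start-ip-address "$2" \
    --end-ip-address "$3"
}

# 0.0.0.0 - 0.0.0.0 is the special range that allows Azure services.
allow_range AllowAzureServices 0.0.0.0 0.0.0.0

# Repeat once for each HDInsight management IP address that applies to you.
allow_range HDInsightManagement1 "<MANAGEMENT_IP>" "<MANAGEMENT_IP>"
```

With `DRY_RUN` left unset, the script only prints the commands so that you can review them. Run it with `DRY_RUN=0` to apply the rules.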
40+
41+
When you host your Apache Ambari DB in an external database, remember the following points:
42+
43+
- You're responsible for the additional costs of the Azure SQL DB that holds Ambari.
44+
- Back up your custom Ambari DB periodically. Azure SQL Database generates backups automatically, but the backup retention period varies. For more information, see [Learn about automatic SQL Database backups](../sql-database/sql-database-automated-backups.md).
45+
46+
## Deploy clusters with a custom Ambari DB
47+
48+
To create an HDInsight cluster that uses your own external Ambari database, use the [custom Ambari DB Quickstart template](https://github.com/Azure/azure-quickstart-templates/tree/master/101-hdinsight-custom-ambari-db).
49+
50+
Edit the parameters in the `azuredeploy.parameters.json` file to specify information about your new cluster and the database that will hold Ambari.
51+
52+
You can begin the deployment using the Azure CLI. Replace `<RESOURCEGROUPNAME>` with the resource group where you want to deploy your cluster.
53+
54+
```azure-cli
55+
az group deployment create --name HDInsightAmbariDBDeployment \
56+
--resource-group <RESOURCEGROUPNAME> \
57+
--template-file azuredeploy.json \
58+
--parameters azuredeploy.parameters.json
59+
```
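Newer Azure CLI releases expose resource-group deployments under the `az deployment group` command group. The following sketch assumes such a release: it validates the template against the parameter file, then starts the same deployment only if validation succeeds. The `run`, `validate_deployment`, and `create_deployment` helpers are hypothetical names introduced for illustration, and `<RESOURCEGROUPNAME>` remains a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: assumes an Azure CLI release that includes `az deployment group`.
RESOURCE_GROUP="${RESOURCE_GROUP:-<RESOURCEGROUPNAME>}"

# Print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Check the template and parameters before any resources are created.
validate_deployment() {
  run az deployment group validate \
    --resource-group "$RESOURCE_GROUP" \
    --template-file azuredeploy.json \
    --parameters @azuredeploy.parameters.json
}

# Start the deployment using the same template and parameter file.
create_deployment() {
  run az deployment group create \
    --name HDInsightAmbariDBDeployment \
    --resource-group "$RESOURCE_GROUP" \
    --template-file azuredeploy.json \
    --parameters @azuredeploy.parameters.json
}

# Deploy only when validation succeeds.
validate_deployment && create_deployment
```

Validating first surfaces problems in the parameter file, such as a missing database setting, before the cluster deployment begins.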
60+
61+
## Next steps
62+
63+
- [Use external metadata stores in Azure HDInsight](hdinsight-use-external-metadata-stores.md)

articles/hdinsight/hdinsight-use-external-metadata-stores.md

Lines changed: 7 additions & 1 deletion
@@ -7,11 +7,13 @@ ms.reviewer: jasonh
77
ms.service: hdinsight
88
ms.custom: hdinsightactive
99
ms.topic: conceptual
10-
ms.date: 10/17/2019
10+
ms.date: 10/29/2019
1111
---
1212

1313
# Use external metadata stores in Azure HDInsight
1414

15+
HDInsight allows you to take control of your data and metadata by deploying key metadata solutions and management databases to external data stores. This feature is currently available for [Apache Hive metastore](#custom-metastore), [Apache Oozie metastore](#apache-oozie-metastore) and [Apache Ambari database](#custom-ambari-db).
16+
1517
The Apache Hive metastore in HDInsight is an essential part of the Apache Hadoop architecture. A metastore is the central schema repository that can be used by other big data access tools such as Apache Spark, Interactive Query (LLAP), Presto, or Apache Pig. HDInsight uses an Azure SQL Database as the Hive metastore.
1618

1719
![HDInsight Hive Metadata Store Architecture](./media/hdinsight-use-external-metadata-stores/metadata-store-architecture.png)
@@ -88,6 +90,10 @@ Apache Oozie is a workflow coordination system that manages Hadoop jobs. Oozie
8890

8991
For instructions on creating an Oozie metastore with Azure SQL Database, see [Use Apache Oozie for workflows](hdinsight-use-oozie-linux-mac.md).
9092

93+
## Custom Ambari DB
94+
95+
To use your own external database with Apache Ambari on HDInsight, see [Custom Apache Ambari database](hdinsight-custom-ambari-db.md).
96+
9197
## Next steps
9298

9399
* [Set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more](./hdinsight-hadoop-provision-linux-clusters.md)

articles/hdinsight/index.yml

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,8 @@ landingContent:
4242
url: ./hdinsight-hadoop-use-data-lake-storage-gen2.md
4343
- linkListType: whats-new
4444
links:
45+
- text: Custom Ambari DB
46+
url: ./hdinsight-custom-ambari-db.md
4547
- text: 'Autoscale: automatically scale hadoop clusters'
4648
url: ./hdinsight-autoscale-clusters.md
4749
- text: Monitoring with HDInsight
