Commit a3c790e

Merge pull request #107319 from dagiro/freshness25
freshness25
2 parents 55adc68 + e78f608 commit a3c790e

9 files changed: +28 -30 lines changed

articles/hdinsight/hadoop/apache-hadoop-connect-hive-power-bi.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ The information also applies to the new [Interactive Query](../interactive-query
2727

2828
Before going through this article, you must have the following items:
2929

30-
* HDInsight cluster. The cluster can be either a HDInsight cluster with Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](apache-hadoop-linux-tutorial-get-started.md#create-cluster).
30+
* HDInsight cluster. The cluster can be either a HDInsight cluster with Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](apache-hadoop-linux-tutorial-get-started.md).
3131

3232
* [Microsoft Power BI Desktop](https://powerbi.microsoft.com/desktop/). You can download a copy from the [Microsoft Download Center](https://www.microsoft.com/download/details.aspx?id=45331).
3333

articles/hdinsight/hadoop/apache-hadoop-linux-tutorial-get-started.md

Lines changed: 17 additions & 19 deletions
@@ -1,14 +1,13 @@
11
---
22
title: 'Quickstart: Apache Hadoop & Resource Manager - Azure HDInsight'
33
description: In this quickstart, you create Apache Hadoop cluster in Azure HDInsight using Resource Manager template
4-
keywords: hadoop getting started,hadoop linux,hadoop quickstart,hive getting started,hive quickstart
5-
ms.service: hdinsight
64
author: hrasheed-msft
75
ms.author: hrasheed
86
ms.reviewer: jasonh
9-
ms.custom: hdinsightactive,hdiseo17may2017,mvc,seodec18
7+
ms.service: hdinsight
108
ms.topic: quickstart
11-
ms.date: 06/12/2019
9+
ms.custom: hdinsightactive,hdiseo17may2017,mvc,seodec18
10+
ms.date: 03/11/2020
1211
#Customer intent: As a data analyst, I need to create a Hadoop cluster in Azure HDInsight using Resource Manager template
1312
---
1413

@@ -20,26 +19,25 @@ Similar templates can be viewed at [Azure quickstart templates](https://azure.mi
2019

2120
Currently HDInsight comes with [seven different cluster types](../hdinsight-overview.md#cluster-types-in-hdinsight). Each cluster type supports a different set of components. All cluster types support Hive. For a list of supported components in HDInsight, see [What's new in the Hadoop cluster versions provided by HDInsight?](../hdinsight-component-versioning.md)
2221

23-
If you don't have an Azure subscription, [create a free account](https://azure.microsoft.com/free/) before you begin.
22+
If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
2423

25-
<a name="create-cluster"></a>
2624
## Create a Hadoop cluster
2725

2826
1. Select the **Deploy to Azure** button below to sign in to Azure and open the Resource Manager template in the Azure portal.
29-
27+
3028
<a href="https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fazure-quickstart-templates%2Fmaster%2F101-hdinsight-linux-ssh-password%2Fazuredeploy.json" target="_blank"><img src="./media/apache-hadoop-linux-tutorial-get-started/hdi-deploy-to-azure1.png" alt="Deploy to Azure button for new cluster"></a>
3129

3230
2. Enter or select the following values:
3331

3432
|Property |Description |
3533
|---------|---------|
36-
|**Subscription** | Select your Azure subscription. |
37-
|**Resource group** | Create a resource group or select an existing resource group. A resource group is a container of Azure components. In this case, the resource group contains the HDInsight cluster and the dependent Azure Storage account. |
38-
|**Location** | Select an Azure location where you want to create your cluster. Choose a location closer to you for better performance. |
39-
|**Cluster Name** | Enter a name for the Hadoop cluster. Because all clusters in HDInsight share the same DNS namespace this name needs to be unique. The name may only contain lowercase letters, numbers, and hyphens, and must begin with a letter. Each hyphen must be preceded and followed by a non-hyphen character. The name must also be between 3 and 59 characters long. |
40-
|**Cluster Type** | Select **hadoop**. |
41-
|**Cluster login name and password** | The default login name is **admin**. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one non-alphanumeric character (except characters ' " ` \). Make sure you **do not provide** common passwords such as "Pass@word1".|
42-
|**SSH username and password** | The default username is **sshuser**. You can rename the SSH username. The SSH user password has the same requirements as the cluster login password.|
34+
|Subscription| Select your Azure subscription. |
35+
|Resource group | Create a resource group or select an existing resource group. A resource group is a container of Azure components. In this case, the resource group contains the HDInsight cluster and the dependent Azure Storage account. |
36+
|Location | Select an Azure location where you want to create your cluster. Choose a location closer to you for better performance. |
37+
|Cluster Name | Enter a name for the Hadoop cluster. Because all clusters in HDInsight share the same DNS namespace this name needs to be unique. The name may only contain lowercase letters, numbers, and hyphens, and must begin with a letter. Each hyphen must be preceded and followed by a non-hyphen character. The name must also be between 3 and 59 characters long. |
38+
|Cluster Type | Select **hadoop**. |
39+
|Cluster login name and password | The default login name is **admin**. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one non-alphanumeric character (except characters ' " ` \). Make sure you **do not provide** common passwords such as "Pass@word1".|
40+
|SSH username and password| The default username is **sshuser**. You can rename the SSH username. The SSH user password has the same requirements as the cluster login password.|
4341

4442
Some properties have been hardcoded in the template. You can configure these values from the template. For more explanation of these properties, see [Create Apache Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
4543

@@ -48,21 +46,21 @@ If you don't have an Azure subscription, [create a free account](https://azure.m
4846
4947
![HDInsight Linux gets started Resource Manager template on portal](./media/apache-hadoop-linux-tutorial-get-started/hdinsight-linux-get-started-arm-template-on-portal.png "Deploy Hadoop cluster in HDInsight using the Azure portal and a resource group manager template")
5048

51-
3. Select **I agree to the terms and conditions stated above**, and then select **Purchase**. You will receive a notification that your deployment is in progress. It takes about 20 minutes to create a cluster.
49+
3. Select **I agree to the terms and conditions stated above**, then select **Purchase**. You'll receive a notification that your deployment is in progress. It takes about 20 minutes to create a cluster.
5250

53-
4. Once the cluster is created, you will receive a **Deployment succeeded** notification with a **Go to resource group** link. Your **Resource group** page will list your new HDInsight cluster and the default storage associated with the cluster. Each cluster has an [Azure Storage account](../hdinsight-hadoop-use-blob-storage.md) or an [Azure Data Lake Storage account](../hdinsight-hadoop-use-data-lake-store.md) dependency. It is referred as the default storage account. The HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters does not delete the storage account.
51+
4. Once the cluster is created, you'll receive a **Deployment succeeded** notification with a **Go to resource group** link. Your **Resource group** page will list your new HDInsight cluster and the default storage associated with the cluster. Each cluster has an [Azure Storage account](../hdinsight-hadoop-use-blob-storage.md) or an [Azure Data Lake Storage account](../hdinsight-hadoop-use-data-lake-store.md) dependency. It's referred as the default storage account. The HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters doesn't delete the storage account.
5452

5553
> [!NOTE]
5654
> For other cluster creation methods and understanding the properties used in this quickstart, see [Create HDInsight clusters](../hdinsight-hadoop-provision-linux-clusters.md).
5755
5856
## Clean up resources
5957

60-
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster, even when it is not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use.
58+
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
6159

6260
> [!NOTE]
6361
> If you are *immediately* proceeding to the next tutorial to learn how to run ETL operations using Hadoop on HDInsight, you may want to keep the cluster running. This is because in the tutorial you have to create a Hadoop cluster again. However, if you are not going through the next tutorial right away, you must delete the cluster now.
6462
65-
**To delete the cluster and/or the default storage account**
63+
To delete the cluster and/or the default storage account:
6664

6765
1. Go back to the browser tab where you have the Azure portal. You shall be on the cluster overview page. If you only want to delete the cluster but retain the default storage account, select **Delete**.
6866

@@ -77,4 +75,4 @@ After you complete the quickstart, you may want to delete the cluster. With HDIn
7775
In this quickstart, you learned how to create an Apache Hadoop cluster in HDInsight using a Resource Manager template. In the next article, you learn how to perform an extract, transform, and load (ETL) operation using Hadoop on HDInsight.
7876

7977
> [!div class="nextstepaction"]
80-
>[Extract, transform, and load data using Interactive Query on HDInsight](../interactive-query/interactive-query-tutorial-analyze-flight-data.md)
78+
> [Extract, transform, and load data using Interactive Query on HDInsight](../interactive-query/interactive-query-tutorial-analyze-flight-data.md)
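The cluster name and password rules quoted in the property table of this quickstart can be sketched as a small pre-check before deploying the template. This is only an illustration in Python: the function names are hypothetical, the one-entry common-password list contains just the example from the table, and the sketch assumes the four excepted characters (' " ` \) are rejected outright, which is one possible reading of the original wording. The service itself may enforce additional rules.

```python
import re

# Lowercase letters, digits, and hyphens; starts with a letter;
# every hyphen sits between two non-hyphen characters.
CLUSTER_NAME_RE = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")

# Assumption: these characters are rejected anywhere in the password.
EXCLUDED_CHARS = set("'\"`\\")

# Hypothetical denylist; only the example named in the table is included.
COMMON_PASSWORDS = {"Pass@word1"}


def is_valid_cluster_name(name: str) -> bool:
    """Check the 3-59 length bound and the character rules."""
    return 3 <= len(name) <= 59 and CLUSTER_NAME_RE.fullmatch(name) is not None


def is_valid_password(password: str) -> bool:
    """Check length, character classes, excluded characters, and the denylist."""
    if len(password) < 10 or password in COMMON_PASSWORDS:
        return False
    if any(ch in EXCLUDED_CHARS for ch in password):
        return False
    return (
        any(ch.isdigit() for ch in password)
        and any(ch.isupper() for ch in password)
        and any(ch.islower() for ch in password)
        and any(not ch.isalnum() for ch in password)
    )
```

Because the same password requirements apply to the SSH user, `is_valid_password` could be reused for both values under these assumptions.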

articles/hdinsight/hbase/apache-hbase-query-with-phoenix.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
2020

2121
## Prerequisites
2222

23-
* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster) to create an HDInsight cluster. Ensure you choose the **HBase** cluster type.
23+
* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md) to create an HDInsight cluster. Ensure you choose the **HBase** cluster type.
2424

2525
* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
2626

articles/hdinsight/hbase/query-hbase-with-hbase-shell.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
2121

2222
## Prerequisites
2323

24-
* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster) to create an HDInsight cluster. Ensure you choose the **HBase** cluster type.
24+
* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md) to create an HDInsight cluster. Ensure you choose the **HBase** cluster type.
2525

2626
* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
2727

articles/hdinsight/hdinsight-apps-install-applications.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@ The following list shows the published applications:
2222
|---|---|---|
2323
|[AtScale Intelligence Platform](https://azuremarketplace.microsoft.com/marketplace/apps/atscaleinc.atscale) |Hadoop |AtScale turns your HDInsight cluster into a scale-out OLAP server, allowing you to query billions of rows of data interactively using the BI tools you already know, own, and love – from Microsoft Excel, Power BI, Tableau Software to QlikView. |
2424
|[CDAP for HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/cask.cdap-for-hdinsight) |HBase |CDAP is the first unified integration platform for big data that accelerates time to value for Hadoop and enables IT to provide self-service data. Open source and extensible, CDAP removes barriers to innovation. Requirements: 4 Region nodes, min D3 v2. |
25-
|[Datameer](https://azuremarketplace.microsoft.com/marketplace/apps/datameer.datameer) |Hadoop |Datameers self-service scalable platform for preparing, exploring, and governing your data for analytics accelerates turning complex multisource data into valuable business-ready information, delivering faster, smarter insights at an enterprise-scale. |
25+
|[Datameer](https://azuremarketplace.microsoft.com/marketplace/apps/datameer.datameer) |Hadoop |Datameer's self-service scalable platform for preparing, exploring, and governing your data for analytics accelerates turning complex multisource data into valuable business-ready information, delivering faster, smarter insights at an enterprise-scale. |
2626
|[Dataiku DSS on HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/dataiku.dss-on-hdi) |Hadoop, Spark |Dataiku DSS in an enterprise data science platform that lets data scientists and data analysts collaborate to design and run new data products and services more efficiently, turning raw data into impactful predictions. |
2727
|[WANdisco Fusion HDI App](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.fusion-hdi-app) |Hadoop, Spark,HBase,Storm,Kafka |Keeping data consistent in a distributed environment is a massive data operations challenge. WANdisco Fusion, an enterprise-class software platform, solves this problem by enabling unstructured data consistency across any environment. |
2828
|[H2O SparklingWater for HDInsight](https://azuremarketplace.microsoft.com/marketplace/apps/h2o-ai.h2o-sparklingwater) |Spark |H2O Sparkling Water supports the following distributed algorithms: GLM, Naïve Bayes, Distributed Random Forest, Gradient Boosting Machine, Deep Neural Networks, Deep learning, K-means, PCA, Generalized Low Rank Models, Anomaly Detection, Autoencoders. |
@@ -34,12 +34,12 @@ The following list shows the published applications:
3434
|[Trifacta Wrangler Enterprise](https://azuremarketplace.microsoft.com/marketplace/apps/trifacta.tr01) |Hadoop, Spark,HBase |Trifacta Wrangler Enterprise for HDInsight supports enterprise-wide data wrangling for any scale of data. The cost of running Trifacta on Azure is a combination of Trifacta subscription costs plus the Azure infrastructure costs for the virtual machines. |
3535
|[Unifi Data Platform](https://unifisoftware.com/platform/) |Hadoop,HBase,Storm,Spark |The Unifi Data Platform is a seamlessly integrated suite of self-service data tools designed to empower the business user to tackle data challenges that drive incremental revenue, reduce costs or operational complexity. |
3636
|[Unraveldata APM](https://azuremarketplace.microsoft.com/marketplace/apps/unravel-data.unravel-app) |Spark |Unravel Data app for HDInsight Spark cluster. |
37-
|[Waterline AI-Driven Data Catalog](https://azuremarketplace.microsoft.com/marketplace/apps/waterline_data.waterline_data) |Spark |Waterline catalogs, organizes, and governs data using AI to auto-tag data with business terms. Waterlines business literate catalog is a critical, success component for self-service analytics, compliance and governance, and IT management initiatives. |
37+
|[Waterline AI-Driven Data Catalog](https://azuremarketplace.microsoft.com/marketplace/apps/waterline_data.waterline_data) |Spark |Waterline catalogs, organizes, and governs data using AI to auto-tag data with business terms. Waterline's business literate catalog is a critical, success component for self-service analytics, compliance and governance, and IT management initiatives. |
3838

3939
The instructions provided in this article use Azure portal. You can also export the Azure Resource Manager template from the portal or obtain a copy of the Resource Manager template from vendors, and use Azure PowerShell and Azure Classic CLI to deploy the template. See [Create Apache Hadoop clusters on HDInsight using Resource Manager templates](hdinsight-hadoop-create-linux-clusters-arm-templates.md).
4040

4141
## Prerequisites
42-
If you want to install HDInsight applications on an existing HDInsight cluster, you must have an HDInsight cluster. To create one, see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster). You can also install HDInsight applications when you create an HDInsight cluster.
42+
If you want to install HDInsight applications on an existing HDInsight cluster, you must have an HDInsight cluster. To create one, see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md). You can also install HDInsight applications when you create an HDInsight cluster.
4343

4444
## Install applications to existing clusters
4545
The following procedure shows you how to install HDInsight applications to an existing HDInsight cluster.
@@ -48,7 +48,7 @@ The following procedure shows you how to install HDInsight applications to an ex
4848

4949
1. Sign in to the [Azure portal](https://portal.azure.com).
5050
2. From the left menu, navigate to **All services** > **Analytics** > **HDInsight clusters**.
51-
3. Select an HDInsight cluster from the list. If you don't have one, you must create one first. see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster).
51+
3. Select an HDInsight cluster from the list. If you don't have one, you must create one first. see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md).
5252
4. Under the **Settings** category, select **Applications**. You can see a list of installed applications in the main window.
5353

5454
![HDInsight applications portal menu](./media/hdinsight-apps-install-applications/hdinsight-apps-portal-menu.png)

articles/hdinsight/hdinsight-apps-install-custom-applications.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ An HDInsight application is an application that users can install on an HDInsigh
1818

1919
## Prerequisites
2020

21-
If you want to install HDInsight applications on an existing HDInsight cluster, you must have an HDInsight cluster. To create one, see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster). You can also install HDInsight applications when you create an HDInsight cluster.
21+
If you want to install HDInsight applications on an existing HDInsight cluster, you must have an HDInsight cluster. To create one, see [Create clusters](hadoop/apache-hadoop-linux-tutorial-get-started.md). You can also install HDInsight applications when you create an HDInsight cluster.
2222

2323
## Install HDInsight applications
2424

articles/hdinsight/hdinsight-create-non-interactive-authentication-dotnet-applications.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@ From your non-interactive .NET application, you need:
2222

2323
## Prerequisites
2424

25-
An HDInsight cluster. See the [getting started tutorial](hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster).
25+
An HDInsight cluster. See the [getting started tutorial](hadoop/apache-hadoop-linux-tutorial-get-started.md).
2626

2727
## Assign a role to the Azure AD application
2828

articles/hdinsight/interactive-query/apache-hadoop-connect-hive-power-bi-directquery.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ You can leverage the [Apache Hive ODBC driver](../hadoop/apache-hadoop-connect-h
2121
## Prerequisites
2222
Before going through this article, you must have the following items:
2323

24-
* **HDInsight cluster**. The cluster can be either an HDInsight cluster with Apache Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md#create-cluster).
24+
* **HDInsight cluster**. The cluster can be either an HDInsight cluster with Apache Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md).
2525
* **[Microsoft Power BI Desktop](https://powerbi.microsoft.com/desktop/)**. You can download a copy from the [Microsoft Download Center](https://www.microsoft.com/download/details.aspx?id=45331).
2626

2727
## Load data from HDInsight
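Since this commit removes the `<a name="create-cluster"></a>` anchor from the quickstart and updates the links that pointed at it, a reviewer might want to confirm that no other article still uses the old fragment. The following is a minimal Python sketch of such a check, not part of the commit; the function names and the `articles/hdinsight` root in the usage note are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches links that still target the removed #create-cluster anchor.
STALE_LINK_RE = re.compile(
    r"apache-hadoop-linux-tutorial-get-started\.md#create-cluster"
)


def find_stale_links(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) pairs that contain the stale link."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE_LINK_RE.search(line)
    ]


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every Markdown file under root and collect stale links per file."""
    hits = {}
    for path in sorted(Path(root).rglob("*.md")):
        found = find_stale_links(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_tree("articles/hdinsight")` from a checkout of the docs repository would return an empty dictionary if every reference was updated.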
