Commit 8432444

author
Sreekanth Iyer (Ushta Te Consultancy Services)
committed
Improved correctness score
1 parent 5c0f6aa commit 8432444

8 files changed

+23
-23
lines changed

articles/hdinsight/hadoop/apache-hadoop-linux-create-cluster-get-started-portal.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
11
---
22
title: 'Quickstart: Apache Hadoop, Apache Hive & Azure HDInsight portal'
3-
description: In this quickstart, you use the Azure portal to create an HDInsight Hadoop cluster
3+
description: In this quickstart, you use the Azure portal to create a HDInsight Hadoop cluster
44
keywords: hadoop getting started,hadoop linux,hadoop quickstart,hive getting started,hive quickstart
55
ms.service: azure-hdinsight
66
ms.topic: quickstart
@@ -13,7 +13,7 @@ ms.date: 11/25/2024
1313

1414
In this article, you learn how to create Apache Hadoop clusters in HDInsight using Azure portal, and then run Apache Hive jobs in HDInsight. Most of Hadoop jobs are batch jobs. You create a cluster, run some jobs, and then delete the cluster. In this article, you perform all the three tasks. For in-depth explanations of available configurations, see [Set up clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md). For more information regarding the use of the portal to create clusters, see [Create clusters in the portal](../hdinsight-hadoop-create-linux-clusters-portal.md).
1515

16-
In this quickstart, you use the Azure portal to create an HDInsight Hadoop cluster. You can also create a cluster using the [Azure Resource Manager template](apache-hadoop-linux-tutorial-get-started.md).
16+
In this quickstart, you use the Azure portal to create a HDInsight Hadoop cluster. You can also create a cluster using the [Azure Resource Manager template](apache-hadoop-linux-tutorial-get-started.md).
1717

1818
Currently, HDInsight comes with [seven different cluster types](../hdinsight-overview.md#cluster-types-in-hdinsight). Each cluster type supports a different set of components. All cluster types support Hive. For a list of supported components in HDInsight, see [What's new in the Apache Hadoop cluster versions provided by HDInsight?](../hdinsight-component-versioning.md)
1919

@@ -119,7 +119,7 @@ After you've completed a Hive job, you can [export the results to Azure SQL Data
119119

120120
## Clean up resources
121121

122-
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
122+
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
123123

124124
> [!NOTE]
125125
> If you are *immediately* proceeding to the next article to learn how to run ETL operations using Hadoop on HDInsight, you may want to keep the cluster running. This is because in the tutorial you have to create a Hadoop cluster again. However, if you are not going through the next article right away, you must delete the cluster now.

articles/hdinsight/hdinsight-hadoop-create-linux-clusters-azure-cli.md

Lines changed: 2 additions & 2 deletions
@@ -124,7 +124,7 @@ The steps in this document walk-through creating a HDInsight 4.0 cluster using t
124124
125125
## Clean up resources
126126
127-
After you complete the article, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it's not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
127+
After you complete the article, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it's not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
128128
129129
Enter all or some of the following commands to remove resources:
130130
@@ -155,7 +155,7 @@ If you run into issues with creating HDInsight clusters, see [access control req
155155

156156
## Next steps
157157

158-
Now that you've successfully created an HDInsight cluster using the Azure CLI, use the following to learn how to work with your cluster:
158+
Now that you've successfully created a HDInsight cluster using the Azure CLI, use the following to learn how to work with your cluster:
159159

160160
### Apache Hadoop clusters
161161

articles/hdinsight/hdinsight-hadoop-create-linux-clusters-portal.md

Lines changed: 3 additions & 3 deletions
@@ -51,7 +51,7 @@ From the **Basics** tab, provide the following information:
5151
|Cluster login password|Provide the password.|
5252
|Confirm cluster login password|Reenter the password|
5353
|Secure Shell (SSH) username|Provide the username, default is **sshuser**|
54-
|Use cluster login password for SSH|If you want the same SSH password as the admin password you specified earlier, select the **Use cluster login password for SSH** check box. If not, provide either a **PASSWORD** or **PUBLIC KEY** to authenticate the SSH user. A public key is the approach we recommend. Choose **Select** at the bottom to save the credentials configuration. For more information, see [Connect to HDInsight (Apache Hadoop) by using SSH](hdinsight-hadoop-linux-use-ssh-unix.md).|
54+
|Use cluster login password for SSH|If you want the same SSH password as the admin password you specified earlier, select the **`Use cluster login password for SSH`** check box. If not, provide either a **PASSWORD** or **PUBLIC KEY** to authenticate the SSH user. A public key is the approach we recommend. Choose **Select** at the bottom to save the credentials configuration. For more information, see [Connect to HDInsight (Apache Hadoop) by using SSH](hdinsight-hadoop-linux-use-ssh-unix.md).|
5555

5656
Select **Next: Storage >>** to advance to the next tab.
5757

@@ -155,15 +155,15 @@ Some of the icons in the window are explained as follows:
155155

156156
## Delete the cluster
157157

158-
See [Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI](./hdinsight-delete-cluster.md).
158+
See [Delete a HDInsight cluster using your browser, PowerShell, or the Azure CLI](./hdinsight-delete-cluster.md).
159159

160160
## Troubleshoot
161161

162162
If you run into issues with creating HDInsight clusters, see [access control requirements](./hdinsight-hadoop-customize-cluster-linux.md#access-control).
163163

164164
## Next steps
165165

166-
You've successfully created an HDInsight cluster. Now learn how to work with your cluster.
166+
You've successfully created a HDInsight cluster. Now learn how to work with your cluster.
167167

168168
* [Use Apache Hive with HDInsight](hadoop/hdinsight-use-hive.md)
169169
* [Get started with Apache HBase on HDInsight](hbase/apache-hbase-tutorial-get-started-linux.md)

articles/hdinsight/kafka/apache-kafka-quickstart-powershell.md

Lines changed: 1 addition & 1 deletion
@@ -245,7 +245,7 @@ Kafka stores streams of data in *topics*. You can use the `kafka-topics.sh` util
245245

246246
For information on the number of fault domains in a region, see the [Availability of Linux virtual machines](/azure/virtual-machines/availability) document.
247247

248-
Kafka is not aware of Azure fault domains. When creating partition replicas for topics, it may not distribute replicas properly for high availability.
248+
Kafka is not aware of Azure fault domains. When yo create partition replicas for topics, it may not distribute replicas properly for high availability.
249249

250250
To ensure high availability, use the [Apache Kafka partition rebalance tool](https://github.com/hdinsight/hdinsight-kafka-tools). This tool must be ran from an SSH connection to the head node of your Kafka cluster.
251251

articles/hdinsight/spark/apache-spark-create-cluster-cli.md

Lines changed: 2 additions & 2 deletions
@@ -106,7 +106,7 @@ If you're using multiple clusters together, you can create a virtual network, an
106106
107107
## Clean up resources
108108
109-
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
109+
After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
110110
111111
Enter all or some of the following commands to remove resources:
112112
@@ -133,7 +133,7 @@ az group delete \
133133

134134
## Next steps
135135

136-
In this quickstart, you learned how to create an Apache Spark cluster in Azure HDInsight using Azure CLI. Advance to the next tutorial to learn how to use an HDInsight cluster to run interactive queries on sample data.
136+
In this quickstart, you learned how to create an Apache Spark cluster in Azure HDInsight using Azure CLI. Advance to the next tutorial to learn how to use a HDInsight cluster to run interactive queries on sample data.
137137

138138
> [!div class="nextstepaction"]
139139
> [Run interactive queries on Apache Spark](./apache-spark-load-data-run-query.md)

articles/hdinsight/spark/apache-spark-jupyter-spark-sql-use-portal.md

Lines changed: 4 additions & 4 deletions
@@ -25,7 +25,7 @@ An Azure account with an active subscription. [Create an account for free](https
2525

2626
## Create an Apache Spark cluster in HDInsight
2727

28-
You use the Azure portal to create an HDInsight cluster that uses Azure Storage Blobs as the cluster storage. For more information on using Data Lake Storage Gen2, see [Quickstart: Set up clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
28+
You use the Azure portal to create a HDInsight cluster that uses Azure Storage Blobs as the cluster storage. For more information on using Data Lake Storage Gen2, see [Quickstart: Set up clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
2929

3030
1. Sign in to the [Azure portal](https://portal.azure.com/).
3131

@@ -121,17 +121,17 @@ SQL (Structured Query Language) is the most common and widely used language for
121121
122122
## Clean up resources
123123
124-
HDInsight saves your data in Azure Storage or Azure Data Lake Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use. If you plan to work on the tutorial listed in [Next steps](#next-steps) immediately, you might want to keep the cluster.
124+
HDInsight saves your data in Azure Storage or Azure Data Lake Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use. If you plan to work on the tutorial listed in [Next steps](#next-steps) immediately, you might want to keep the cluster.
125125
126126
Switch back to the Azure portal, and select **Delete**.
127127
128-
:::image type="content" source="./media/apache-spark-jupyter-spark-sql-use-portal/hdinsight-azure-portal-delete-cluster.png " alt-text="Azure portal delete an HDInsight cluster." border="true":::sight cluster" border="true":::
128+
:::image type="content" source="./media/apache-spark-jupyter-spark-sql-use-portal/hdinsight-azure-portal-delete-cluster.png " alt-text="Azure portal delete a HDInsight cluster." border="true":::sight cluster" border="true":::
129129
130130
You can also select the resource group name to open the resource group page, and then select **Delete resource group**. By deleting the resource group, you delete both the HDInsight cluster, and the default storage account.
131131
132132
## Next steps
133133
134-
In this quickstart, you learned how to create an Apache Spark cluster in HDInsight and run a basic Spark SQL query. Advance to the next tutorial to learn how to use an HDInsight cluster to run interactive queries on sample data.
134+
In this quickstart, you learned how to create an Apache Spark cluster in HDInsight and run a basic Spark SQL query. Advance to the next tutorial to learn how to use a HDInsight cluster to run interactive queries on sample data.
135135
136136
> [!div class="nextstepaction"]
137137
> [Run interactive queries on Apache Spark](./apache-spark-load-data-run-query.md)

articles/hdinsight/spark/apache-spark-jupyter-spark-sql-use-powershell.md

Lines changed: 6 additions & 6 deletions
@@ -24,9 +24,9 @@ If you're using multiple clusters together, you can create a virtual network, an
2424
## Create an Apache Spark cluster in HDInsight
2525

2626
> [!IMPORTANT]
27-
> Billing for HDInsight clusters is prorated per minute, whether you are using them or not. Be sure to delete your cluster after you have finished using it. For more information, see the [Clean up resources](#clean-up-resources) section of this article.
27+
> Billing for HDInsight clusters is prorated per minute, whether you're using them or not. Be sure to delete your cluster after you have finished using it. For more information, see the [Clean up resources](#clean-up-resources) section of this article.
2828
29-
Creating an HDInsight cluster includes creating the following Azure objects and resources:
29+
Creating a HDInsight cluster includes creating the following Azure objects and resources:
3030

3131
- An Azure resource group. An Azure resource group is a container for Azure resources.
3232
- An Azure storage account or Azure Data Lake Storage. Each HDInsight cluster requires a dependent data storage. In this quickstart, you create a cluster that uses Azure Storage Blobs as the cluster storage. For more information on using Data Lake Storage Gen2, see [Quickstart: Set up clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
@@ -36,7 +36,7 @@ You use a PowerShell script to create the resources.
3636

3737
[!INCLUDE [updated-for-az](~/reusable-content/ce-skilling/azure/includes/updated-for-az.md)]
3838

39-
When you run the PowerShell script, you are prompted to enter the following values:
39+
When you run the PowerShell script, you're prompted to enter the following values:
4040

4141
|Parameter|Value|
4242
|------|------|
@@ -188,11 +188,11 @@ SQL (Structured Query Language) is the most common and widely used language for
188188
189189
## Clean up resources
190190
191-
HDInsight saves your data in Azure Storage or Azure Data Lake Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster, even when it is not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use. If you plan to work on the tutorial listed in [Next steps](#next-steps) immediately, you might want to keep the cluster.
191+
HDInsight saves your data in Azure Storage or Azure Data Lake Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use. If you plan to work on the tutorial listed in [Next steps](#next-steps) immediately, you might want to keep the cluster.
192192
193193
Switch back to the Azure portal, and select **Delete**.
194194
195-
:::image type="content" source="./media/apache-spark-jupyter-spark-sql-use-powershell/hdinsight-azure-portal-delete-cluster.png " alt-text="Azure portal delete an HDInsight cluster." border="true":::
195+
:::image type="content" source="./media/apache-spark-jupyter-spark-sql-use-powershell/hdinsight-azure-portal-delete-cluster.png " alt-text="Azure portal delete a HDInsight cluster." border="true":::
196196
197197
You can also select the resource group name to open the resource group page, and then select **Delete resource group**. By deleting the resource group, you delete both the HDInsight cluster, and the default storage account.
198198
@@ -221,7 +221,7 @@ Remove-AzResourceGroup `
221221

222222
## Next steps
223223

224-
In this quickstart, you learned how to create an Apache Spark cluster in HDInsight and run a basic Spark SQL query. Advance to the next tutorial to learn how to use an HDInsight cluster to run interactive queries on sample data.
224+
In this quickstart, you learned how to create an Apache Spark cluster in HDInsight and run a basic Spark SQL query. Advance to the next tutorial to learn how to use a HDInsight cluster to run interactive queries on sample data.
225225

226226
> [!div class="nextstepaction"]
227227
> [Run interactive queries on Apache Spark](./apache-spark-load-data-run-query.md)

articles/hdinsight/spark/apache-spark-manage-dependencies.md

Lines changed: 2 additions & 2 deletions
@@ -57,14 +57,14 @@ Use comma-separated list of jar paths for multiple jar files, Globs are allowed.
5757
%%configure { "conf": {"spark.jars": "wasb://[email protected]/libs/azure-cosmosdb-spark_2.3.0_2.11-1.3.3.jar" }}
5858
```
5959

60-
After configuring external packages, you can run import in code cell to verify if the packages has been placed correctly.
60+
After configuring external packages, you can run import in code cell to verify if the packages have been placed correctly.
6161

6262
```scala
6363
import com.microsoft.azure.cosmosdb.spark._
6464
```
6565

6666
### Use Azure Toolkit for IntelliJ
67-
[Azure Toolkit for IntelliJ plug-in](./apache-spark-intellij-tool-plugin.md) provides UI experience to submit Spark Scala application to an HDInsight cluster. It provides `Referenced Jars` and `Referenced Files` properties to configure jar libs paths when submitting the Spark application. See more details about [How to use Azure Toolkit for IntelliJ plug-in for HDInsight](./apache-spark-intellij-tool-plugin.md#run-a-spark-scala-application-on-an-hdinsight-spark-cluster).
67+
[Azure Toolkit for IntelliJ plug-in](./apache-spark-intellij-tool-plugin.md) provides UI experience to submit Spark Scala application to HDInsight cluster. It provides `Referenced Jars` and `Referenced Files` properties to configure jar libs paths when submitting the Spark application. See more details about [How to use Azure Toolkit for IntelliJ plug-in for HDInsight](./apache-spark-intellij-tool-plugin.md#run-a-spark-scala-application-on-an-hdinsight-spark-cluster).
6868

6969
:::image type="content" source="./media/apache-spark-intellij-tool-plugin/hdi-submit-spark-app-02.png" alt-text="The Spark Submission dialog box." border="true":::
7070
