Commit 400fb4b

Author: Sreekanth Iyer (Ushta Te Consultancy Services)
Commit message: Improved Correctness Score
1 parent: 8f59847

7 files changed, 23 insertions(+), 23 deletions(-)

articles/hdinsight/hdinsight-hadoop-create-linux-clusters-azure-powershell.md

Lines changed: 7 additions & 7 deletions

@@ -26,23 +26,23 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
 [!INCLUDE [delete-cluster-warning](includes/hdinsight-delete-cluster-warning.md)]
-To create an HDInsight cluster by using Azure PowerShell, you must complete the following procedures:
+To create a HDInsight cluster by using Azure PowerShell, you must complete the following procedures:
 * Create an Azure resource group
 * Create an Azure Storage account
 * Create an Azure Blob container
-* Create an HDInsight cluster
+* Create a HDInsight cluster
 > [!NOTE]
-> Using PowerShell to create an HDInsight cluster with Azure Data Lake Storage Gen2 is not currently supported.
+> Using PowerShell to create a HDInsight cluster with Azure Data Lake Storage Gen2 is not currently supported.
 The following script demonstrates how to create a new cluster:
-[!code-powershell[main](../../azure_powershell_scripts/hdinsight/create-cluster/create-cluster.ps1?range=5-74)]
+[!Code-powershell[main](../../azure_powershell_scripts/hdinsight/create-cluster/create-cluster.ps1?range=5-74)]
 The values you specify for the cluster login are used to create the Hadoop user account for the cluster. Use this account to connect to services hosted on the cluster such as web UIs or REST APIs.
-The values you specify for the SSH user are used to create the SSH user for the cluster. Use this account to start a remote SSH session on the cluster and run jobs. For more information, see the [Use SSH with HDInsight](hdinsight-hadoop-linux-use-ssh-unix.md) document.
+The values you specify for the SSH user are used to create the SSH user for the cluster. Use this account to start a remote SSH session on the cluster and run jobs. For more information, see [Use SSH with HDInsight](hdinsight-hadoop-linux-use-ssh-unix.md) document.
 > [!IMPORTANT]
 > If you plan to use more than 32 worker nodes (either at cluster creation or by scaling the cluster after creation), you must also specify a head node size with at least 8 cores and 14 GB of RAM.
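The head-node sizing rule quoted in the IMPORTANT note above can be captured as a small validation helper. This is an illustrative sketch only; the function name and signature are hypothetical and not part of the article or the Az PowerShell module:

```python
def head_node_size_ok(worker_nodes: int, head_cores: int, head_ram_gb: float) -> bool:
    """Check the sizing rule from the note above: with more than 32 worker
    nodes, the head node needs at least 8 cores and 14 GB of RAM."""
    if worker_nodes > 32:
        return head_cores >= 8 and head_ram_gb >= 14
    # The note imposes no extra head-node constraint for 32 or fewer workers.
    return True

print(head_node_size_ok(40, 8, 14))  # → True: large cluster, adequately sized head node
print(head_node_size_ok(40, 4, 7))   # → False: large cluster, undersized head node
```

A check like this could run before invoking the creation script, failing fast instead of waiting out the up-to-20-minute cluster creation the article mentions.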
@@ -53,7 +53,7 @@ It can take up to 20 minutes to create a cluster.
 ## Create cluster: Configuration object
-You can also create an HDInsight configuration object using [`New-AzHDInsightClusterConfig`](/powershell/module/az.hdinsight/new-azhdinsightclusterconfig) cmdlet. You can then modify this configuration object to enable additional configuration options for your cluster. Finally, use the `-Config` parameter of the [`New-AzHDInsightCluster`](/powershell/module/az.hdinsight/new-azhdinsightcluster) cmdlet to use the configuration.
+You can also create a HDInsight configuration object using [`New-AzHDInsightClusterConfig`](/powershell/module/az.hdinsight/new-azhdinsightclusterconfig) cmdlet. You can then modify this configuration object to enable additional configuration options for your cluster. Finally, use the `-Config` parameter of the [`New-AzHDInsightCluster`](/powershell/module/az.hdinsight/new-azhdinsightcluster) cmdlet to use the configuration.
 ## Customize clusters
@@ -70,7 +70,7 @@ If you run into issues with creating HDInsight clusters, see [access control req
 ## Next steps
-Now that you've successfully created an HDInsight cluster, use the following resources to learn how to work with your cluster.
+Now that you've successfully created a HDInsight cluster, use the following resources to learn how to work with your cluster.
 ### Apache Hadoop clusters

articles/hdinsight/hdinsight-log-management.md

Lines changed: 4 additions & 4 deletions

@@ -1,15 +1,15 @@
 ---
-title: Manage logs for an HDInsight cluster - Azure HDInsight
+title: Manage logs for a HDInsight cluster - Azure HDInsight
 description: Determine the types, sizes, and retention policies for HDInsight activity log files.
 ms.service: azure-hdinsight
 ms.topic: how-to
 ms.custom: hdinsightactive
 ms.date: 01/02/2025
 ---
-# Manage logs for an HDInsight cluster
+# Manage logs for a HDInsight cluster
-An HDInsight cluster produces various log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
+a HDInsight cluster produces various log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
 Managing HDInsight cluster logs includes retaining information about all aspects of the cluster environment. This information includes all associated Azure Service logs, cluster configuration, job execution information, any error states, and other data as needed.
@@ -157,7 +157,7 @@ Alternatively, you can script log archiving with PowerShell.
 ### Accessing Azure Storage metrics
 Azure Storage can be configured to log storage operations and access. You can use these detailed logs for capacity monitoring and planning, and for auditing requests to storage. The logged information includes latency details, enabling you to monitor and fine-tune the performance of your solutions.
-You can use the .NET SDK for Hadoop to examine the log files generated for the Azure Storage that holds the data for an HDInsight cluster.
+You can use the .NET SDK for Hadoop to examine the log files generated for the Azure Storage that holds the data for a HDInsight cluster.
 ### Control the size and number of backup indexes for old log files

articles/hdinsight/hdinsight-operationalize-data-pipeline.md

Lines changed: 3 additions & 3 deletions

@@ -29,7 +29,7 @@ The following diagram illustrates the example pipeline.
 ## Apache Oozie solution overview
-This pipeline uses Apache Oozie running on an HDInsight Hadoop cluster.
+This pipeline uses Apache Oozie running on a HDInsight Hadoop cluster.
 Oozie describes its pipelines in terms of *actions*, *workflows*, and *coordinators*. Actions determine the actual work to perform, such as running a Hive query. Workflows define the sequence of actions. Coordinators define the schedule for when the workflow is run. Coordinators can also wait on the availability of new data before launching an instance of the workflow.
@@ -39,7 +39,7 @@ The following diagram shows the high-level design of this example Oozie pipeline
 ## Provision Azure resources
-This pipeline requires an Azure SQL Database and an HDInsight Hadoop cluster in the same location. The Azure SQL Database stores both the summary data produced by the pipeline and the Oozie Metadata Store.
+This pipeline requires an Azure SQL Database and a HDInsight Hadoop cluster in the same location. The Azure SQL Database stores both the summary data produced by the pipeline and the Oozie Metadata Store.
 ### Provision Azure SQL Database
@@ -70,7 +70,7 @@ Your Azure SQL Database is now ready.
 ### Provision an Apache Hadoop Cluster
-Create an Apache Hadoop cluster with a custom metastore. During cluster creation from the portal, from the **Storage** tab, ensure you select your SQL Database under **Metastore settings**. For more information on selecting a metastore, see [Select a custom metastore during cluster creation](./hdinsight-use-external-metadata-stores.md#select-a-custom-metastore-during-cluster-creation). For more information on cluster creation, see [Get Started with HDInsight on Linux](hadoop/apache-hadoop-linux-tutorial-get-started.md).
+Create an Apache Hadoop cluster with a custom metastore. During cluster creation from the portal, from the **Storage** tab, ensures you select your SQL Database under **Metastore settings**. For more information on selecting a metastore, see [Select a custom metastore during cluster creation](./hdinsight-use-external-metadata-stores.md#select-a-custom-metastore-during-cluster-creation). For more information on cluster creation, see [Get Started with HDInsight on Linux](hadoop/apache-hadoop-linux-tutorial-get-started.md).
 ## Verify SSH tunneling set up

articles/hdinsight/interactive-query/apache-hive-warehouse-connector.md

Lines changed: 3 additions & 3 deletions

@@ -60,9 +60,9 @@ Hive Warehouse Connector needs separate clusters for Spark and Interactive Query
 ### Create clusters
-1. Create an HDInsight Spark **4.0** cluster with a storage account and a custom Azure virtual network. For information on creating a cluster in an Azure virtual network, see [Add HDInsight to an existing virtual network](../../hdinsight/hdinsight-plan-virtual-network-deployment.md#existingvnet).
+1. Create a HDInsight Spark **4.0** cluster with a storage account and a custom Azure virtual network. For information on creating a cluster in an Azure virtual network, see [Add HDInsight to an existing virtual network](../../hdinsight/hdinsight-plan-virtual-network-deployment.md#existingvnet).
-1. Create an HDInsight Interactive Query (LLAP) **4.0** cluster with the same storage account and Azure virtual network as the Spark cluster.
+1. Create a HDInsight Interactive Query (LLAP) **4.0** cluster with the same storage account and Azure virtual network as the Spark cluster.
 ### Configure HWC settings
@@ -172,7 +172,7 @@ This is a way to run Spark interactively through a modified version of the Scala
 Spark-submit is a utility to submit any Spark program (or job) to Spark clusters.
-The spark-submit job will set up and configure Spark and Hive Warehouse Connector as per our instructions, execute the program we pass to it, then cleanly release the resources that were being used.
+The spark-submited job will set up and configure Spark and Hive Warehouse Connector as per our instructions, execute the program we pass to it, then cleanly release the resources that were being used.
 Once you build the scala/java code along with the dependencies into an assembly jar, use the below command to launch a Spark application. Replace `<VERSION>`, and `<APP_JAR_PATH>` with the actual values.
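As a sketch of the launch step described in the hunk above, the following assembles a spark-submit argument list. The assembly-jar naming pattern, class name, and paths are hypothetical placeholders standing in for `<VERSION>` and `<APP_JAR_PATH>`, not values taken from the article:

```python
def build_spark_submit(version: str, app_jar_path: str, main_class: str) -> list:
    """Assemble a spark-submit command line for a job that uses the
    Hive Warehouse Connector assembly jar (illustrative names only)."""
    return [
        "spark-submit",
        "--class", main_class,
        # Hypothetical HWC assembly-jar name built from the version placeholder.
        "--jars", f"hive-warehouse-connector-assembly-{version}.jar",
        app_jar_path,
    ]

cmd = build_spark_submit("1.0.0", "app.jar", "com.example.Main")
print(" ".join(cmd))
```

Building the argument list programmatically keeps the placeholder substitution in one place instead of editing a shell one-liner by hand.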

articles/hdinsight/interactive-query/quickstart-resource-manager-template.md

Lines changed: 3 additions & 3 deletions

@@ -31,7 +31,7 @@ The template used in this quickstart is from [Azure Quickstart Templates](https:
 Two Azure resources are defined in the template:
 * [Microsoft.Storage/storageAccounts](/azure/templates/microsoft.storage/storageaccounts): create an Azure Storage Account.
-* [Microsoft.HDInsight/cluster](/azure/templates/microsoft.hdinsight/clusters): create an HDInsight cluster.
+* [Microsoft.HDInsight/cluster](/azure/templates/microsoft.hdinsight/clusters): create a HDInsight cluster.
 ### Deploy the template
@@ -48,7 +48,7 @@ Two Azure resources are defined in the template:
 |Location|The value will autopopulate with the location used for the resource group.|
 |Cluster Name|Enter a globally unique name. For this template, use only lowercase letters, and numbers.|
 |Cluster Login User Name|Provide the username, default is `admin`.|
-|Cluster Login Password|Provide a password. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one non-alphanumeric character (except characters ```' ` "``` ). |
+|Cluster Login Password|Provide a password. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one nonalphanumeric character (except characters ```' ` "``` ). |
 |Ssh User Name|Provide the username, default is sshuser|
 |Ssh Password|Provide the password.|
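The password rule quoted in the table row above can be sketched as a validator. This is a hypothetical illustration that reads the parenthetical as meaning the characters ' ` " may not appear at all; the real portal validation may differ:

```python
def password_meets_policy(pw: str) -> bool:
    """Sketch of the cluster login password rule quoted above: at least 10
    characters, with at least one digit, one uppercase letter, one lowercase
    letter, and one non-alphanumeric character, excluding ' ` and "
    (the reading of the parenthetical is an assumption)."""
    if any(c in "'`\"" for c in pw):
        return False
    return (
        len(pw) >= 10
        and any(c.isdigit() for c in pw)
        and any(c.isupper() for c in pw)
        and any(c.islower() for c in pw)
        and any(not c.isalnum() for c in pw)
    )

print(password_meets_policy("Passw0rd!x"))  # → True
print(password_meets_policy("short1!A"))    # → False: fewer than 10 characters
```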

@@ -62,7 +62,7 @@ Once the cluster is created, you'll receive a **Deployment succeeded** notificat
 ## Clean up resources
-After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
+After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for a HDInsight cluster, even when it isn't in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
 From the Azure portal, navigate to your cluster, and select **Delete**.

articles/hdinsight/kafka/kafka-troubleshoot-insufficient-domains.md

Lines changed: 2 additions & 2 deletions

@@ -16,15 +16,15 @@ Receive error message similar to `not sufficient fault domains in region` when a
 ## Cause
-A fault domain is a logical grouping of underlying hardware in an Azure data center. Each fault domain shares a common power source and network switch. The virtual machines and managed disks that implement the nodes within an HDInsight cluster are distributed across these fault domains. This architecture limits the potential impact of physical hardware failures.
+A fault domain is a logical grouping of underlying hardware in an Azure data center. Each fault domain shares a common power source and network switch. The virtual machines and managed disks that implement the nodes within a HDInsight cluster are distributed across these fault domains. This architecture limits the potential impact of physical hardware failures.
 Each Azure region has a specific number of fault domains. For a list of domains and the number of fault domains they contain, refer to documentation on [Availability Sets](/azure/virtual-machines/availability).
 In HDInsight, Kafka clusters are required to be provisioned in a region with at least three Fault domains.
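The three-fault-domain requirement stated above can be expressed as a simple eligibility check. The region names and fault-domain counts below are made-up illustrations; consult the Availability Sets documentation for real per-region values:

```python
def kafka_region_ok(fault_domains: int) -> bool:
    """Kafka on HDInsight requires a region with at least three fault domains."""
    return fault_domains >= 3

# Hypothetical per-region fault-domain counts, for illustration only.
regions = {"region-a": 3, "region-b": 2}
eligible = [name for name, fd in regions.items() if kafka_region_ok(fd)]
print(eligible)  # → ['region-a']
```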

 ## Resolution
-If the region you wish to create the cluster does not have sufficient fault domains, reach out to product team to allow provisioning of the cluster even if there are not three fault domains.
+If the region you wish to create the cluster doesn't have sufficient fault domains, reach out to product team to allow provisioning of the cluster even if there aren't three fault domains.
 ## Next steps

articles/hdinsight/spark/apache-spark-microsoft-cognitive-toolkit.md

Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@ In this article, you do the following steps.
 This solution is divided between this article and a Jupyter Notebook that you upload as part of this article. In this article, you complete the following steps:
-* Run a script action on an HDInsight Spark cluster to install Microsoft Cognitive Toolkit and Python packages.
+* Run a script action on a HDInsight Spark cluster to install Microsoft Cognitive Toolkit and Python packages.
 * Upload the Jupyter Notebook that runs the solution to the HDInsight Spark cluster.
 The following remaining steps are covered in the Jupyter Notebook.
