articles/hdinsight/hbase/apache-hbase-backup-replication.md (1 addition, 1 deletion)
@@ -238,7 +238,7 @@ The general steps to set up replication are:
5. Copy existing data from the source tables to the destination tables.
6. Replication automatically copies new data modifications to the source tables into the destination tables.
-To enable replication on HDInsight, apply a Script Action to your running source HDInsight cluster. For a walkthrough of enabling replication in your cluster, or to experiment with replication on sample clusters created in virtual networks using Azure Resource Manager templates, see [Configure Apache HBase replication](apache-hbase-replication.md). That article also includes instructions for enabling replication of Phoenix metadata.
+To enable replication on HDInsight, apply a Script Action to the running source HDInsight cluster. For a walkthrough of enabling replication in your cluster, or to experiment with replication on sample clusters created in virtual networks using Azure Resource Manager templates, see [Configure Apache HBase replication](apache-hbase-replication.md). That article also includes instructions for enabling replication of Phoenix metadata.

articles/hdinsight/hbase/apache-hbase-phoenix-performance.md (4 additions, 4 deletions)
@@ -13,13 +13,13 @@ The most important aspect of [Apache Phoenix](https://phoenix.apache.org/) perfo
## Table schema design
-When you create a table in Phoenix, that table is stored in an HBase table. The HBase table contains groups of columns (column families) that are accessed together. A row in the Phoenix table is a row in the HBase table, where each row consists of versioned cells associated with one or more columns. Logically, a single HBase row is a collection of key-value pairs, each having the same rowkey value. That is, each key-value pair has a rowkey attribute, and the value of that rowkey attribute is the same for a particular row.
+When you create a table in Phoenix, that table is stored in a HBase table. The HBase table contains groups of columns (column families) that are accessed together. A row in the Phoenix table is a row in the HBase table, where each row consists of versioned cells associated with one or more columns. Logically, a single HBase row is a collection of key-value pairs, each having the same rowkey value. That is, each key-value pair has a rowkey attribute, and the value of that rowkey attribute is the same for a particular row.
The schema design of a Phoenix table includes the primary key design, column family design, individual column design, and how the data is partitioned.
### Primary key design
-The primary key defined on a table in Phoenix determines how data is stored within the rowkey of the underlying HBase table. In HBase, the only way to access a particular row is with the rowkey. In addition, data stored in an HBase table is sorted by the rowkey. Phoenix builds the rowkey value by concatenating the values of each of the columns in the row, in the order they're defined in the primary key.
+The primary key defined on a table in Phoenix determines how data is stored within the rowkey of the underlying HBase table. In HBase, the only way to access a particular row is with the rowkey. In addition, data stored in a HBase table is sorted by the rowkey. Phoenix builds the rowkey value by concatenating the values of each of the columns in the row, in the order they're defined in the primary key.
For example, a table for contacts has the first name, last name, phone number, and address, all in the same column family. You could define a primary key based on an increasing sequence number:
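(The snippet itself is elided from the diff. As an illustration only — all names hypothetical, not the article's elided code — a sequence-keyed Phoenix table along the lines this paragraph describes might look like the following; the primary key column becomes the HBase rowkey, so rows sort by the sequence number.)

```sql
-- Hypothetical sketch, not the article's elided snippet.
-- The primary key column becomes the HBase rowkey, so rows sort by ID_NUM.
CREATE TABLE CONTACTS (
    ID_NUM     BIGINT NOT NULL,   -- increasing sequence number
    FIRST_NAME VARCHAR,
    LAST_NAME  VARCHAR,
    PHONE      VARCHAR,
    ADDRESS    VARCHAR
    CONSTRAINT PK PRIMARY KEY (ID_NUM)
);
```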
@@ -64,7 +64,7 @@ Also, if certain columns tend to be accessed together, put those columns in the
### Column design
-* Keep VARCHAR columns under about 1 MB because of the I/O costs of large columns. When processing queries, HBase materializes cells in full before sending them over to the client, and the client receives them in full before handing them off to the application code.
+* Keep VARCHAR columns under about 1 MB because of the I/O costs of large columns. When you process queries, HBase materializes cells in full before sending them over to the client, and the client receives them in full before handing them off to the application code.
* Store column values using a compact format such as protobuf, Avro, msgpack, or BSON. JSON isn't recommended, as it's larger.
* Consider compressing data before storage to cut latency and I/O costs.
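(A minimal sketch of the compact-format advice above, under hypothetical names: encode the record with protobuf or Avro on the client and store the bytes in a single VARBINARY column rather than as a large JSON VARCHAR.)

```sql
-- Hypothetical sketch: keep bulky values as compact client-encoded bytes.
CREATE TABLE EVENTS (
    EVENT_ID   BIGINT NOT NULL,
    EVENT_TIME TIMESTAMP,
    PAYLOAD    VARBINARY           -- protobuf/Avro-encoded, optionally compressed
    CONSTRAINT PK PRIMARY KEY (EVENT_ID)
);
```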

-A Phoenix index is an HBase table that stores a copy of some or all of the data from the indexed table. An index improves performance for specific types of queries.
+A Phoenix index is a HBase table that stores a copy of some or all of the data from the indexed table. An index improves performance for specific types of queries.
When you have multiple indexes defined and then query a table, Phoenix automatically selects the best index for the query. The primary index is created automatically based on the primary keys you select.
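(As a hedged sketch of such a secondary index, reusing the hypothetical contacts table from earlier: a covered index lets Phoenix answer a phone-number lookup from the index table alone.)

```sql
-- Hypothetical sketch: secondary index on PHONE, covering the name columns.
CREATE INDEX IDX_CONTACTS_PHONE ON CONTACTS (PHONE)
    INCLUDE (FIRST_NAME, LAST_NAME);

-- Phoenix can serve this query entirely from the index table:
SELECT FIRST_NAME, LAST_NAME FROM CONTACTS WHERE PHONE = '555-0100';
```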

articles/hdinsight/hbase/query-hbase-with-hbase-shell.md (3 additions, 3 deletions)
@@ -17,7 +17,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
## Prerequisites
-* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md) to create an HDInsight cluster. Ensure you choose the **HBase** cluster type.
+* An Apache HBase cluster. See [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md) to create a HDInsight cluster. Ensure you choose the **HBase** cluster type.
* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
@@ -108,9 +108,9 @@ For more information about the HBase table schema, see [Introduction to Apache H
## Clean up resources
-After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster, even when it is not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use.
+After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for a HDInsight cluster, even when it is not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use.
-To delete a cluster, see [Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI](../hdinsight-delete-cluster.md).
+To delete a cluster, see [Delete a HDInsight cluster using your browser, PowerShell, or the Azure CLI](../hdinsight-delete-cluster.md).

articles/hdinsight/hdinsight-apps-install-custom-applications.md (1 addition, 1 deletion)
@@ -11,7 +11,7 @@ ms.date: 01/02/2025
In this article, you'll learn how to install an [Apache Hadoop](https://hadoop.apache.org/) application on Azure HDInsight, which hasn't been published to the Azure portal. The application you'll install in this article is [Hue](https://gethue.com/).
-An HDInsight application is an application that users can install on an HDInsight cluster. These applications can be developed by Microsoft, independent software vendors (ISV) or by yourself.
+An HDInsight application is an application that users can install on a HDInsight cluster. These applications can be developed by Microsoft, independent software vendors (ISV) or by yourself.

articles/hdinsight/hdinsight-changing-configs-via-ambari.md (1 addition, 1 deletion)
@@ -17,7 +17,7 @@ Log in to Ambari at `https://CLUSTERNAME.azurehdidnsight.net` with your cluster
:::image type="content" source="./media/hdinsight-changing-configs-via-ambari/apache-ambari-dashboard.png" alt-text="Apache Ambari user dashboard displayed.":::
-The Ambari web UI is used to manage hosts, services, alerts, configurations, and views. Ambari can't be used to create an HDInsight cluster, or upgrade services. Also can't manage stacks and versions, decommission or recommission hosts, or add services to the cluster.
+The Ambari web UI is used to manage hosts, services, alerts, configurations, and views. Ambari can't be used to create a HDInsight cluster, or upgrade services. Also can't manage stacks and versions, decommission or recommission hosts, or add services to the cluster.
@@ -36,7 +36,7 @@ Assign your Microsoft Entra application a [role](../role-based-access-control/bu
1. At the top of the page, select **+ Add**.
1. Follow the instructions to add the Owner role to your Microsoft Entra application. After you successfully add the role, the application is listed under the Owner role.
-## Develop an HDInsight client application
+## Develop a HDInsight client application
1. Create a C# console application.
2. Add the following [NuGet](https://www.nuget.org/) packages:

articles/hdinsight/hdinsight-log-management.md (1 addition, 1 deletion)
@@ -9,7 +9,7 @@ ms.date: 01/02/2025
# Manage logs for a HDInsight cluster
-a HDInsight cluster produces various log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
+HDInsight cluster produces various log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
Managing HDInsight cluster logs includes retaining information about all aspects of the cluster environment. This information includes all associated Azure Service logs, cluster configuration, job execution information, any error states, and other data as needed.

6. To create the `flights` table, replace the text in the query text area with the following statements. The `flights` table is a Hive-managed table that partitions data loaded into it by year, month, and day of month. This table will contain all historical flight data, with the lowest granularity present in the source data of one row per flight.
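(The statements themselves are elided from the diff. As a hedged illustration — column names hypothetical, not the tutorial's elided code — a Hive-managed table partitioned by year, month, and day of month might be declared like this:)

```sql
-- Hypothetical HiveQL sketch of a managed, date-partitioned flights table.
CREATE TABLE flights (
    flight_num STRING,
    carrier    STRING,
    origin     STRING,
    dest       STRING,
    dep_delay  INT,
    arr_delay  INT
)
PARTITIONED BY (year INT, month INT, day_of_month INT)
STORED AS ORC;
```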
@@ -499,7 +499,7 @@ As you can see, the majority of the coordinator is just passing configuration in
-A coordinator is responsible for scheduling actions within the `start` and `end` date range, according to the interval specified by the `frequency` attribute. Each scheduled action in turn runs the workflow as configured. In the coordinator definition above, the coordinator is configured to run actions from January 1, 2017 to January 5, 2017. The frequency is set to one day by the [Oozie Expression Language](https://oozie.apache.org/docs/4.2.0/CoordinatorFunctionalSpec.html#a4.4._Frequency_and_Time-Period_Representation) frequency expression `${coord:days(1)}`. This results in the coordinator scheduling an action (and hence the workflow) once per day. For date ranges that are in the past, as in this example, the action will be scheduled to run without delay. The start of the date from which an action is scheduled to run is called the *nominal time*. For example, to process the data for January 1, 2017 the coordinator will schedule action with a nominal time of 2017-01-01T00:00:00 GMT.
+A coordinator is responsible for scheduling actions within the `start` and `end` date range, according to the interval specified by the `frequency` attribute. Each scheduled action in turn runs the workflow as configured. In the coordinator definition above, the coordinator is configured to run actions from January 1, 2017 to January 5, 2017. The frequency is set to one day by the [Oozie Expression Language](https://oozie.apache.org/docs/4.2.0/CoordinatorFunctionalSpec.html#a4.4._Frequency_and_Time-Period_Representation) frequency expression `${coord:days(1)}`. This results in the coordinator scheduling an action (and hence the workflow) once per day. For date ranges that are in the past, as in this example, the action will be scheduled to run without delay. The start of the date from which an action is scheduled to run is call the *nominal time*. For example, to process the data for January 1, 2017 the coordinator will schedule action with a nominal time of 2017-01-01T00:00:00 GMT.
* Point 2: Within the date range of the workflow, the `dataset` element specifies where to look in HDFS for the data for a particular date range, and configures how Oozie determines whether the data is available yet for processing.