Commit 3f3caad: Improved Correctness Score

Author: Sreekanth Iyer (Ushta Te Consultancy Services)
Parent: 87d35c9

5 files changed, +10 −10 lines

articles/hdinsight/hdinsight-hadoop-hive-out-of-memory-error-oom.md

Lines changed: 1 addition & 1 deletion

@@ -97,7 +97,7 @@ The **hive.auto.convert.join.noconditionaltask** in the hive-site.xml file was s
 </property>
 ```

-It's likely map join was the cause of the Java Heap Space out of memory error. As explained in the blog post [Hadoop Yarn memory settings in HDInsight](/archive/blogs/shanyu/hadoop-yarn-memory-settings-in-hdinsight), when Tez execution engine is used the heap space used actually belongs to the Tez container. See the following image describing the Tez container memory.
+It's likely map join was the cause of the Java Heap Space out of memory error. As explained in the blog post [Hadoop Yarn memory settings in HDInsight](/archive/blogs/shanyu/hadoop-yarn-memory-settings-in-hdinsight), when Tez execution engine used the heap space used actually belongs to the Tez container. See the following image describing the Tez container memory.

 :::image type="content" source="./media/hdinsight-hadoop-hive-out-of-memory-error-oom/hive-out-of-memory-error-oom-tez-container-memory.png" alt-text="Tez container memory diagram: Hive out of memory error." border="false":::
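The hunk above is part of a discussion of **hive.auto.convert.join.noconditionaltask** and map-join heap exhaustion under Tez. As a sketch (not part of this commit), the same behavior can also be controlled per session instead of in hive-site.xml; the property names below are standard Hive settings, and the threshold value is illustrative:

```sql
-- Disable the conditional-task map-join conversion for this session only,
-- so joins fall back to reduce-side joins instead of building hash tables
-- inside the Tez container heap.
SET hive.auto.convert.join.noconditionaltask = false;

-- Alternatively, keep map joins but lower the combined small-table
-- size threshold (in bytes); illustrative value of roughly 50 MB.
SET hive.auto.convert.join.noconditionaltask.size = 52428800;
```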

articles/hdinsight/hdinsight-streaming-at-scale-overview.md

Lines changed: 1 addition & 1 deletion

@@ -31,7 +31,7 @@ For more information, see [What is Apache Spark Streaming?](./spark/apache-spark

 Although you can specify the number of nodes in your cluster during creation, you may want to grow or shrink the cluster to match the workload. All HDInsight clusters allow you to [change the number of nodes in the cluster](hdinsight-administer-use-portal-linux.md#scale-clusters). Spark clusters can be dropped with no loss of data, as all data is stored in Azure Storage or Data Lake Storage.

-There are advantages to decoupling technologies. For instance, Kafka is an event buffering technology, so its very IO intensive and doesn't need much processing power. In comparison, stream processors such as Spark Streaming are compute-intensive, requiring more powerful VMs. By having these technologies decoupled into different clusters, you can scale them independently while best utilizing the VMs.
+There are advantages to decoupling technologies. For instance, Kafka is an event buffering technology, so it's very IO intensive and doesn't need much processing power. In comparison, stream processors such as Spark Streaming are compute-intensive, requiring more powerful VMs. By having these technologies decoupled into different clusters, you can scale them independently while best utilizing the VMs.

 ### Scale the stream buffering layer

articles/hdinsight/hdinsight-use-oozie-linux-mac.md

Lines changed: 3 additions & 3 deletions

@@ -37,7 +37,7 @@ The workflow used in this document contains two actions. Actions are definitions

 :::image type="content" source="./media/hdinsight-use-oozie-linux-mac/oozie-workflow-diagram.png" alt-text="HDInsight oozie workflow diagram." border="false":::

-1. A Hive action runs an HiveQL script to extract records from the `hivesampletable` that's included with HDInsight. Each row of data describes a visit from a specific mobile device. The record format appears like the following text:
+1. A Hive action runs a HiveQL script to extract records from the `hivesampletable` that's included with HDInsight. Each row of data describes a visit from a specific mobile device. The record format appears like the following text:

 ```output
 8 18:54:20 en-US Android Samsung SCH-i500 California United States 13.9204007 0 0

@@ -201,7 +201,7 @@ Oozie workflow definitions are written in Hadoop Process Definition Language (hP

 * `RunHiveScript`: This action is the start action and runs the `useooziewf.hql` Hive script.

-* `RunSqoopExport`: This action exports the data created from the Hive script to a SQL database by using Sqoop. This action only runs if the `RunHiveScript` action is successful.
+* `RunSqoopExport`: This action exports the data created from the Hive script to an SQL database by using Sqoop. This action only runs if the `RunHiveScript` action is successful.

 The workflow has several entries, such as `${jobTracker}`. You'll replace these entries with the values you use in the job definition. You'll create the job definition later in this document.

@@ -523,7 +523,7 @@ To access the Oozie web UI, complete the following steps:

 6. From the **Job Info** tab, you can see the basic job information and the individual actions within the job. You can use the tabs at the top to view the **Job Definition**, **Job Configuration**, access the **Job Log**, or view a directed acyclic graph (DAG) of the job under **Job DAG**.

-* **Job Log**: Select the **Get Logs** button to get all logs for the job, or use the **Enter Search Filter** field to filter the logs.
+* **Job Log**: Select the **Get Logs** button to get all logs for the job, or use the `Enter Search Filter` field to filter the logs.

 :::image type="content" source="./media/hdinsight-use-oozie-linux-mac/hdinsight-oozie-job-log.png" alt-text="HDInsight Apache Oozie job log." border="true":::

articles/hdinsight/interactive-query/apache-hive-replication.md

Lines changed: 4 additions & 4 deletions

@@ -10,11 +10,11 @@ ms.date: 06/14/2024

 In the context of databases and warehouses, replication is the process of duplicating entities from one warehouse to another. Duplication can apply to an entire database or to a smaller level, such as a table or partition. The objective is to have a replica that changes whenever the base entity changes. Replication on Apache Hive focuses on disaster recovery and offers unidirectional primary-copy replication. In HDInsight clusters, Hive Replication can be used to unidirectionally replicate the Hive metastore and the associated underlying data lake on Azure Data Lake Storage Gen2.

-Hive Replication has evolved over the years with newer versions providing better functionality and being faster and less resource intensive. In this article, we discuss Hive Replication (Replv2) which is supported in both HDInsight 3.6 and HDInsight 4.0 cluster types.
+Hive Replication has evolved over the years with newer versions providing better functionality and being faster and less resource intensive. In this article, we discuss Hive Replication `(Replv2)` which is supported in both HDInsight 3.6 and HDInsight 4.0 cluster types.

-## Advantages of replv2
+## Advantages of `replv2`

-[Hive ReplicationV2](https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development) (also called Replv2) has the following advantages over the first version of Hive replication that used Hive [IMPORT-EXPORT](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport):
+[Hive ReplicationV2](https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development) (also called `Replv2`) has the following advantages over the first version of Hive replication that used Hive [IMPORT-EXPORT](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport):

 - Event-based incremental replication
 - Point-in-time replication

@@ -74,7 +74,7 @@ repl load tpcds_orc from '/tmp/hive/repl/38896729-67d5-41b2-90dc-46eeed4c5dd0';

 ### Output the last replicated event ID

-The `REPL STATUS [database name]` command is executed on target clusters and outputs the last replicated `event_id`. The command also enables users to know what state their target cluster is been replicated to. You can use the output of this command to construct the next `REPL DUMP` command for incremental replication.
+The `REPL STATUS [database name]` command is executed on target clusters and outputs the last replicated `event_id`. The command also enables users to know what state their target cluster replicated to. You can use the output of this command to construct the next `REPL DUMP` command for incremental replication.

 ```sql
 repl status tpcds_orc;
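The hunk above notes that the `REPL STATUS` output feeds the next incremental `REPL DUMP`. A hedged sketch of one bootstrap-plus-incremental round trip, using the `tpcds_orc` database and dump path that appear in the article's own examples (the event ID `2925` is illustrative):

```sql
-- On the source cluster: bootstrap dump of the whole database.
-- Returns a dump directory and the last dumped event ID.
repl dump tpcds_orc;

-- On the target cluster: load the bootstrap dump from that directory.
repl load tpcds_orc from '/tmp/hive/repl/38896729-67d5-41b2-90dc-46eeed4c5dd0';

-- On the target cluster: report the last replicated event_id.
repl status tpcds_orc;

-- On the source cluster: dump only events after that ID (incremental),
-- then repeat the load/status cycle on the target.
repl dump tpcds_orc from 2925;
```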

articles/hdinsight/interactive-query/llap-schedule-based-autoscale-best-practices.md

Lines changed: 1 addition & 1 deletion

@@ -8,7 +8,7 @@ ms.author: sairamyeturi
 ms.date: 06/14/2024
 ---

-# Azure HDInsight interactive query cluster (Hive LLAP) schedule based autoscale
+# Azure HDInsight interactive query cluster (Hive LLAP) `schedule based autoscale`

 This document provides the onboarding steps to enable schedule-based autoscale for Interactive Query (LLAP) Cluster type in Azure HDInsight. It includes some of the best practices to operate Autoscale in Hive-LLAP.
