articles/hdinsight/hadoop/apache-hadoop-on-premises-migration-best-practices-architecture.md (1 addition, 1 deletion)
@@ -99,7 +99,7 @@ Some HDInsight Hive metastore best practices are as follows:
 ## Best practices for different workloads
 
-- Consider using LLAP cluster for interactive Hive queries with improved response time [LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) is a new feature in Hive 2.0 that allows in-memory caching of queries. LLAP makes Hive queries much faster, up to [26x faster than Hive 1.x in some cases](https://hortonworks.com/blog/announcing-apache-hive-2-1-25x-faster-queries-much/).
+- Consider using LLAP cluster for interactive Hive queries with improved response time. [LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) is a new feature in Hive 2.0 that allows in-memory caching of queries.
 - Consider using Spark jobs in place of Hive jobs.
 - Consider replacing impala-based queries with LLAP queries.
 - Consider replacing MapReduce jobs with Spark jobs.
@@ -66,7 +66,7 @@ There are two types of tables that you can create with Hive:
 *__Internal__*: Data is stored in the Hive data warehouse. The data warehouse is located at `/hive/warehouse/` on the default storage for the cluster.
 
-Use internal tables when one of the following conditions apply:
+Use internal tables when one of the following conditions applies:
 
 * Data is temporary.
 * You want Hive to manage the lifecycle of the table and data.
@@ -173,7 +173,7 @@ These statements perform the following actions:
 ### Low Latency Analytical Processing (LLAP)
 
-[LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) (sometimes known as Live Long and Process) is a new feature in Hive 2.0 that allows in-memory caching of queries. LLAP makes Hive queries much faster, up to [26x faster than Hive 1.x in some cases](https://hortonworks.com/blog/announcing-apache-hive-2-1-25x-faster-queries-much/).
+[LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) (sometimes known as Live Long and Process) is a new feature in Hive 2.0 that allows in-memory caching of queries.
 
 HDInsight provides LLAP in the Interactive Query cluster type. For more information, see the [Start with Interactive Query](../interactive-query/apache-interactive-query-get-started.md) document.
articles/hdinsight/hdinsight-hadoop-templeton-webhcat-debug-errors.md (5 additions, 7 deletions)
@@ -4,7 +4,7 @@ description: Learn how to about common errors returned by WebHCat on HDInsight a
 ms.service: hdinsight
 ms.topic: troubleshooting
 ms.custom: hdinsightactive
-ms.date: 04/14/2020
+ms.date: 12/07/2022
 ---
 
 # Understand and resolve errors received from WebHCat on HDInsight
@@ -27,7 +27,7 @@ If the following default values are exceeded, it can degrade WebHCat performance
 | --- | --- | --- |
 |[yarn.scheduler.capacity.maximum-applications][maximum-applications]|The maximum number of jobs that can be active concurrently (pending or running) |10,000 |
 |[templeton.exec.max-procs][max-procs]|The maximum number of requests that can be served concurrently |20 |
-|[mapreduce.jobhistory.max-age-ms][max-age-ms]|The number of days that job history are retained |7 days |
+|[mapreduce.jobhistory.max-age-ms][max-age-ms]|The number of days that job history is retained |seven days |
 
 ## Too many requests
@@ -45,13 +45,13 @@ If the following default values are exceeded, it can degrade WebHCat performance
 | --- | --- |
 | This status code usually occurs during failover between the primary and secondary HeadNode for the cluster |Wait two minutes, then retry the operation |
 
-## Bad request Content: Could not find job
+## Bad request Content: Couldn't find job
 
 **HTTP Status code**: 400
 
 | Cause | Resolution |
 | --- | --- |
-| Job details have been cleaned up by the job history cleaner |The default retention period for job history is 7 days. The default retention period can be changed by modifying `mapreduce.jobhistory.max-age-ms`. For more information, see [Modifying configuration](#modifying-configuration)|
+| Job details have been cleaned up by the job history cleaner |The default retention period for job history is seven days. The default retention period can be changed by modifying `mapreduce.jobhistory.max-age-ms`. For more information, see [Modifying configuration](#modifying-configuration)|
 | Job has been killed because of a failover |Retry job submission for up to two minutes |
 | An invalid job ID was used |Check if the job ID is correct |
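Because `mapreduce.jobhistory.max-age-ms` takes a value in milliseconds, the seven-day default has to be converted when you change the retention period. A quick sketch of the arithmetic (the property name comes from the table above; the day count is illustrative):

```shell
# Convert a retention period in days to the milliseconds value
# expected by mapreduce.jobhistory.max-age-ms.
DAYS=7
MAX_AGE_MS=$((DAYS * 24 * 60 * 60 * 1000))
echo "mapreduce.jobhistory.max-age-ms=${MAX_AGE_MS}"
# prints mapreduce.jobhistory.max-age-ms=604800000
```

Set the resulting value through Ambari as described in the Modifying configuration section.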
57
57
@@ -62,7 +62,7 @@ If the following default values are exceeded, it can degrade WebHCat performance
 | Cause | Resolution |
 | --- | --- |
 | Internal garbage collection is occurring within the WebHCat process |Wait for garbage collection to finish or restart the WebHCat service |
-| Time out waiting on a response from the ResourceManager service. This error can occur when the number of active applications goes the configured maximum (default 10,000) |Wait for currently running jobs to complete or increase the concurrent job limit by modifying `yarn.scheduler.capacity.maximum-applications`. For more information, see the [Modifying configuration](#modifying-configuration) section. |
+| Time out waiting on a response from the Resource Manager service. This error can occur when the number of active applications exceeds the configured maximum (default 10,000) |Wait for currently running jobs to complete or increase the concurrent job limit by modifying `yarn.scheduler.capacity.maximum-applications`. For more information, see the [Modifying configuration](#modifying-configuration) section. |
 | Attempting to retrieve all jobs through the [GET /jobs](https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Jobs) call while `Fields` is set to `*`|Don't retrieve *all* job details. Instead use `jobid` to retrieve details only for jobs greater than a certain job ID. Or, don't use `Fields`|
 | The WebHCat service is down during HeadNode failover |Wait two minutes and retry the operation |
 | There are more than 500 pending jobs submitted through WebHCat |Wait until currently pending jobs have completed before submitting more jobs |
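The `GET /jobs` guidance in the table can be followed from any shell. A sketch of the request URL, assuming a hypothetical cluster name and job ID; the `jobid` and `numrecords` query parameters are described in the WebHCat reference linked above:

```shell
# Build a WebHCat (Templeton) jobs URL that pages results instead of
# retrieving every job's details. CLUSTERNAME and the job ID below are
# hypothetical placeholders, not real values.
CLUSTERNAME="mycluster"
AFTER_JOBID="job_1415651640909_0001"
URL="https://${CLUSTERNAME}.azurehdinsight.net/templeton/v1/jobs?user.name=admin&jobid=${AFTER_JOBID}&numrecords=10"
echo "${URL}"
# Against a live cluster: curl -u admin "$URL"  (prompts for the cluster password)
```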
@@ -71,6 +71,4 @@ If the following default values are exceeded, it can degrade WebHCat performance
 [!INCLUDE [troubleshooting next steps](includes/hdinsight-troubleshooting-next-steps.md)]
articles/hdinsight/hdinsight-log-management.md (11 additions, 12 deletions)
@@ -4,12 +4,12 @@ description: Determine the types, sizes, and retention policies for HDInsight ac
 ms.service: hdinsight
 ms.topic: how-to
 ms.custom: hdinsightactive
-ms.date: 04/28/2022
+ms.date: 12/07/2022
 ---
 
 # Manage logs for an HDInsight cluster
 
-An HDInsight cluster produces a variety of log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
+An HDInsight cluster produces various log files. For example, Apache Hadoop and related services, such as Apache Spark, produce detailed job execution logs. Log file management is part of maintaining a healthy HDInsight cluster. There can also be regulatory requirements for log archiving. Due to the number and size of log files, optimizing log storage and archiving helps with service cost management.
 
 Managing HDInsight cluster logs includes retaining information about all aspects of the cluster environment. This information includes all associated Azure Service logs, cluster configuration, job execution information, any error states, and other data as needed.
@@ -57,7 +57,7 @@ It's important to understand the workload types running on your HDInsight cluste
 * Consider maintaining data lineage tracking by adding an identifier to each log entry, or through other techniques. This allows you to trace back the original source of the data and the operation, and follow the data through each stage to understand its consistency and validity.
 
-* Consider how you can collect logs from the cluster, or from more than one cluster, and collate them for purposes such as auditing, monitoring, planning, and alerting. You might use a custom solution to access and download the log files on a regular basis, and combine and analyze them to provide a dashboard display. You can also add additional capabilities for alerting for security or failure detection. You can build these utilities using PowerShell, the HDInsight SDKs, or code that accesses the Azure classic deployment model.
+* Consider how you can collect logs from the cluster, or from more than one cluster, and collate them for purposes such as auditing, monitoring, planning, and alerting. You might use a custom solution to access and download the log files regularly, and combine and analyze them to provide a dashboard display. You can also add other capabilities for alerting for security or failure detection. You can build these utilities using PowerShell, the HDInsight SDKs, or code that accesses the Azure classic deployment model.
 
 * Consider whether a monitoring solution or service would be a useful benefit. The Microsoft System Center provides an [HDInsight management pack](https://systemcenter.wiki/?Get_ManagementPackBundle=Microsoft.HDInsight.mpb&FileMD5=10C7D975C6096FFAA22C84626D211259). You can also use third-party tools such as Apache Chukwa and Ganglia to collect and centralize logs. Many companies offer services to monitor Hadoop-based big data solutions, for example: Centerity, Compuware APM, Sematext SPM, and Zettaset Orchestrator.
@@ -79,7 +79,7 @@ Using the Ambari UI, you can download the configuration for any (or all) service
 ### View the script action logs
 
-HDInsight [script actions](hdinsight-hadoop-customize-cluster-linux.md) run scripts on a cluster, either manually or when specified. For example, script actions can be used to install additional software on the cluster or to alter configuration settings from the default values. Script action logs can provide insight into errors that occurred during setup of the cluster, and also configuration settings' changes that could affect cluster performance and availability. To see the status of a script action, select the **ops** button on your Ambari UI, or access the status logs in the default storage account. The storage logs are available at `/STORAGE_ACCOUNT_NAME/DEFAULT_CONTAINER_NAME/custom-scriptaction-logs/CLUSTER_NAME/DATE`.
+HDInsight [script actions](hdinsight-hadoop-customize-cluster-linux.md) run scripts on a cluster, either manually or when specified. For example, script actions can be used to install other software on the cluster or to alter configuration settings from the default values. Script action logs can provide insight into errors that occurred during setup of the cluster, and also changes to configuration settings that could affect cluster performance and availability. To see the status of a script action, select the **ops** button on your Ambari UI, or access the status logs in the default storage account. The storage logs are available at `/STORAGE_ACCOUNT_NAME/DEFAULT_CONTAINER_NAME/custom-scriptaction-logs/CLUSTER_NAME/DATE`.
-The YARN ResourceManager UI runs on the cluster head node, and is accessed through the Ambari web UI. Use the following steps to view the YARN logs:
+The YARN Resource Manager UI runs on the cluster head node, and is accessed through the Ambari web UI. Use the following steps to view the YARN logs:
 
 1. In a web browser, navigate to `https://CLUSTERNAME.azurehdinsight.net`. Replace CLUSTERNAME with the name of your HDInsight cluster.
 2. From the list of services on the left, select YARN.
-3. From the Quick Links dropdown, select one of the cluster head nodes and then select **ResourceManager logs**. You're presented with a list of links to YARN logs.
+3. From the Quick Links dropdown, select one of the cluster head nodes and then select **Resource Manager logs**. You're presented with a list of links to YARN logs.
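As an alternative to the Ambari steps above, aggregated YARN logs can be fetched with the YARN CLI from an SSH session on a head node. A sketch (the application ID is a hypothetical placeholder):

```shell
# Print the YARN CLI command that fetches the aggregated logs for one
# application. APP_ID is a hypothetical placeholder, not a real job.
APP_ID="application_1555957767526_0001"
CMD="yarn logs -applicationId ${APP_ID}"
echo "${CMD}"
# Run the echoed command in an SSH session on a cluster head node.
```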
 
 ## Step 4: Forecast log volume storage sizes and costs
@@ -158,26 +158,25 @@ Alternatively, you can script log archiving with PowerShell. For an example Pow
 ### Accessing Azure Storage metrics
 
-Azure Storage can be configured to log storage operations and access. You can use these very detailed logs for capacity monitoring and planning, and for auditing requests to storage. The logged information includes latency details, enabling you to monitor and fine-tune the performance of your solutions.
+Azure Storage can be configured to log storage operations and access. You can use these detailed logs for capacity monitoring and planning, and for auditing requests to storage. The logged information includes latency details, enabling you to monitor and fine-tune the performance of your solutions.
 
 You can use the .NET SDK for Hadoop to examine the log files generated for the Azure Storage that holds the data for an HDInsight cluster.
 
 ### Control the size and number of backup indexes for old log files
 To control the size and number of log files retained, set the following properties of the `RollingFileAppender`:
 
 * `maxFileSize` is the critical size of the file, above which the file is rolled. The default value is 10 MB.
 * `maxBackupIndex` specifies the number of backup files to be created. The default is 1.
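These two `RollingFileAppender` properties are typically set in a service's log4j configuration. A minimal sketch in log4j 1.x properties style; the appender name `RFA`, the file path, and the sizes are illustrative, not values taken from an HDInsight cluster:

```properties
# Hypothetical rolling-file appender: roll at 10 MB, keep 5 backups.
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=/var/log/hadoop/hdfs/hadoop-hdfs-namenode.log
log4j.appender.RFA.MaxFileSize=10MB
log4j.appender.RFA.MaxBackupIndex=5
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

On HDInsight, make changes like this through Ambari rather than by editing files on individual nodes, so the configuration survives node reimaging.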
 ### Other log management techniques
 
 To avoid running out of disk space, you can use OS tools such as [logrotate](https://linux.die.net/man/8/logrotate) to manage handling of log files. You can configure `logrotate` to run on a daily basis, compressing log files and removing old ones. Your approach depends on your requirements, such as how long to keep the logfiles on local nodes.
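A daily rotation policy like the one described above can be sketched as a `logrotate` rule. The log path and retention count are hypothetical examples, not HDInsight defaults:

```text
# /etc/logrotate.d/hadoop-logs (hypothetical example)
/var/log/hadoop/*/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```

`copytruncate` is used here because Hadoop services keep their log files open; it copies and truncates in place instead of moving the file out from under the running process.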
 You can also check whether DEBUG logging is enabled for one or more services, which greatly increases the output log size.
 
 To collect the logs from all the nodes to one central location, you can create a data flow, such as ingesting all log entries into Solr.
 ## Next steps
 
 * [Monitoring and Logging Practice for HDInsight](/previous-versions/msp-n-p/dn749790(v=pandp.10))
 * [Access Apache Hadoop YARN application logs in Linux-based HDInsight](hdinsight-hadoop-access-yarn-app-logs-linux.md)
-* [How to control size of log files for various Apache Hadoop components](https://community.hortonworks.com/articles/8882/how-to-control-size-of-log-files-for-various-hdp-c.html)