Merge pull request #85432 from dagiro/ts_hbase12

Jak-MS · web-flow · commit 6163c2f7797e · 2019-08-15T12:24:59.000-05:00
ts_hbase12
diff --git a/articles/hdinsight/hbase/apache-troubleshoot-hbase.md b/articles/hdinsight/hbase/apache-troubleshoot-hbase.md
@@ -96,89 +96,6 @@ It can take up to five minutes for the HBase Master service to stabilize and fin
 
 When the SYSTEM.CATALOG table is back to normal, the connectivity issue to Phoenix should be automatically resolved.
 
-
-## What causes a master server to fail to start?
-
-### Error 
-
-An atomic renaming failure occurs.
-
-### Detailed description
-
-During the startup process, HMaster completes many initialization steps. These include moving data from the scratch (.tmp) folder to the data folder. HMaster also looks at the write-ahead logs (WALs) folder to see if there are any unresponsive region servers, and so on. 
-
-During startup, HMaster does a basic `list` command on these folders. If at any time, HMaster sees an unexpected file in any of these folders, it throws an exception and doesn't start.  
-
-### Probable cause
-
-In the region server logs, try to identify the timeline of the file creation, and then see if there was a process crash around the time the file was created. (Contact HBase support to assist you in doing this.) This helps us provide more robust mechanisms, so that you can avoid hitting this bug, and ensure graceful process shutdowns.
-
-### Resolution steps
-
-Check the call stack and try to determine which folder might be causing the problem (for instance, it might be the WALs folder or the .tmp folder). Then, in Cloud Explorer or by using HDFS commands, try to locate the problem file. Usually, this is a \*-renamePending.json file. (The \*-renamePending.json file is a journal file that's used to implement the atomic rename operation in the WASB driver. Due to bugs in this implementation, these files can be left over after process crashes, and so on.) Force-delete this file either in Cloud Explorer or by using HDFS commands. 
-
-Sometimes, there might also be a temporary file named something like *$$$.$$$* at this location. You have to use HDFS `ls` command to see this file; you cannot see the file in Cloud Explorer. To delete this file, use the HDFS command `hdfs dfs -rm /\<path>\/\$\$\$.\$\$\$`.  
-
-After you've run these commands, HMaster should start immediately. 
-
-### Error
-
-No server address is listed in *hbase: meta* for region xxx.
-
-### Detailed description
-
-You might see a message on your Linux cluster that indicates that the *hbase: meta* table is not online. Running `hbck` might report that "hbase: meta table replicaId 0 is not found on any region." The problem might be that HMaster could not initialize after you restarted HBase. In the HMaster logs, you might see the message: "No server address listed in hbase: meta for region hbase: backup \<region name\>".  
-
-### Resolution steps
-
-1. In the HBase shell, enter the following commands (change actual values as applicable):  
-
-   ```apache
-   > scan 'hbase:meta'  
-   ```
-
-   ```apache
-   > delete 'hbase:meta','hbase:backup <region name>','<column name>'  
-   ```
-
-2. Delete the *hbase: namespace* entry. This entry might be the same error that's being reported when the *hbase: namespace* table is scanned.
-
-3. To bring up HBase in a running state, in the Ambari UI, restart the Active HMaster service.  
-
-4. In the HBase shell, to bring up all offline tables, run the following command:
-
-   ```apache 
-   hbase hbck -ignorePreCheckPermission -fixAssignments 
-   ```
-
-### Additional reading
-
-[Unable to process the HBase table](https://stackoverflow.com/questions/4794092/unable-to-access-hbase-table)
-
-
-### Error
-
-HMaster times out with a fatal exception similar to "java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned."
-
-### Detailed description
-
-You might experience this issue if you have many tables and regions that have not been flushed when you restart your HMaster services. Restart might fail, and you'll see the preceding error message.  
-
-### Probable cause
-
-This is a known issue with the HMaster service. General cluster startup tasks can take a long time. HMaster shuts down because the namespace table isn’t yet assigned. This occurs only in scenarios in which large amount of unflushed data exists, and a timeout of five minutes is not sufficient.
-  
-### Resolution steps
-
-1. In the Apache Ambari UI, go to **HBase** > **Configs**. In the custom hbase-site.xml file, add the following setting: 
-
-   ```apache
-   Key: hbase.master.namespace.init.timeout Value: 2400000  
-   ```
-
-2. Restart the required services (HMaster, and possibly other HBase services).  
-
-
 ## What causes a restart failure on a region server?
 
 ### Issue
@@ -256,5 +173,12 @@ Here's what's happening behind the scenes:
    sudo su - hbase -c "/usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh start regionserver"   
    ```
 
-### See also
-[Troubleshoot by using Azure HDInsight](../../hdinsight/hdinsight-troubleshoot-guide.md)
+## Next steps
+
+If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
+
+* Get answers from Azure experts through [Azure Community Support](https://azure.microsoft.com/support/community/).
+
+* Connect with [@AzureSupport](https://twitter.com/azuresupport), the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.
+
+* If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, review [How to create an Azure support request](https://docs.microsoft.com/azure/azure-supportability/how-to-create-azure-support-request). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
diff --git a/articles/hdinsight/hbase/hbase-troubleshoot-start-fails.md b/articles/hdinsight/hbase/hbase-troubleshoot-start-fails.md
@@ -5,7 +5,7 @@ ms.service: hdinsight
 ms.topic: troubleshooting
 author: hrasheed-msft
 ms.author: hrasheed
-ms.date: 08/06/2019
+ms.date: 08/14/2019
 ---
 
 # Apache HBase Master (HMaster) fails to start in Azure HDInsight
@@ -20,42 +20,46 @@ Unexpected files identified during startup process.
 
 ### Cause
 
-During the startup process, HMaster performs many initialization steps, including moving data from scratch (.tmp) folder to data folder. HMaster also looks at WALs (Write Ahead Logs) folder to see if there are any dead region servers. During all these situations, it does a basic `list` command on these folders. If at any time it sees an unexpected file in any of these folders, it will throw an exception and hence not start.
+During the startup process, HMaster performs many initialization steps, including moving data from the scratch (.tmp) folder to the data folder. HMaster also looks at the write-ahead logs (WALs) folder to see if there are any unresponsive region servers.
+
+HMaster does a basic `list` command on these folders. If at any time HMaster sees an unexpected file in any of these folders, it throws an exception and doesn't start.
 
 ### Resolution
 
-In such a situation, check the call stack to see which folder might be causing problem (for instance is it WALs folder or .tmp folder). Then via Cloud Explorer or via hdfs commands to locate the problem file. The problem file is usually a `*-renamePending.json` file (a journal file used to implement Atomic Rename operation in WASB driver). Due to bugs in this implementation, such files can be left over in cases of process crash. Force delete this file via Cloud Explorer. In addition, there might be a temporary file of the nature $ in this location. The file cannot be seen via cloud explorer and only via hdfs `ls` command. You can use hdfs command `hdfs dfs -rm //\$\$\$.\$\$\$` to delete this file.
+Check the call stack and try to determine which folder might be causing the problem (for instance, it might be the WALs folder or the .tmp folder). Then, in Cloud Explorer or by using HDFS commands, try to locate the problem file. Usually, this is a `*-renamePending.json` file. (The `*-renamePending.json` file is a journal file that's used to implement the atomic rename operation in the WASB driver. Due to bugs in this implementation, these files can be left over after process crashes.) Force-delete this file either in Cloud Explorer or by using HDFS commands.
+
+Sometimes, there might also be a temporary file named something like `$$$.$$$` at this location. You have to use the HDFS `ls` command to see this file; you cannot see it in Cloud Explorer. To delete this file, use the HDFS command `hdfs dfs -rm /<path>/\$\$\$.\$\$\$`.
 
-Once the problem file has been removed, HMaster should start up immediately.
+After you've run these commands, HMaster should start immediately.
 
 ---
 
 ## Scenario: No server address listed
 
 ### Issue
 
-HMaster log shows an error message similar to "No server address listed in hbase: meta for region xxx."
+You might see a message that indicates that the `hbase:meta` table is not online. Running `hbck` might report that `hbase:meta table replicaId 0 is not found on any region.` In the HMaster logs, you might see the message: `No server address listed in hbase:meta for region hbase:backup <region name>`.
 
 ### Cause
 
 HMaster could not initialize after restarting HBase.
 
 ### Resolution
 
-1. Execute the following commands on HBase shell (change actual values as applicable):
+1. In the HBase shell, enter the following commands (change actual values as applicable):
 
-    ```
+    ```hbase
     scan 'hbase:meta'
-    delete 'hbase:meta','hbase:backup <region name>','<column name>' 
+    delete 'hbase:meta','hbase:backup <region name>','<column name>'
     ```
 
-1. Delete the entry of hbase: namespace as the same error may be reported while scan hbase: namespace table.
+1. Delete the `hbase:namespace` entry as well, because the same error might be reported when the `hbase:namespace` table is scanned.
 
 1. Restart the active HMaster from Ambari UI to bring up HBase in running state.
 
-1. Run the following command on HBase shell to bring up all offline tables:
+1. From an SSH session on the cluster (the operating system command line, not the HBase shell), run the following command to bring up all offline tables:
 
-    ```
+    ```hbase
     hbase hbck -ignorePreCheckPermission -fixAssignments
     ```
 
@@ -65,29 +69,29 @@ HMaster could not initialize after restarting HBase.
 
 ### Issue
 
-HMaster times out with fatal exception like `java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned`.
+HMaster times out with a fatal exception similar to: `java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned`.
 
 ### Cause
 
-The time-out is a known defect with HMaster. General cluster startup tasks can take a long time. HMaster shuts down if the namespace table isn’t yet assigned. The lengthy startup tasks happen where large amount of unflushed data exists and a timeout of five minutes is not sufficient.
+You might experience this issue if you have many tables and regions that have not been flushed when you restart your HMaster services. The timeout is a known defect with HMaster. General cluster startup tasks can take a long time, and HMaster shuts down if the namespace table isn’t assigned before the timeout expires. This happens when a large amount of unflushed data exists and the five-minute timeout (the 300000 ms shown in the exception) is not sufficient.
 
 ### Resolution
 
-1. Access Ambari UI, go to HBase -> Configs, in custom `hbase-site.xml` add the following setting:
+1. From the Apache Ambari UI, go to **HBase** > **Configs**. In the custom `hbase-site.xml` file, add the following setting:
 
     ```
     Key: hbase.master.namespace.init.timeout Value: 2400000  
     ```
 
-1. Restart required services (Mainly HMaster and possibly other HBase services).
+1. Restart the required services (HMaster, and possibly other HBase services).
 
 ---
 
-## Scenario: Frequent regionserver restarts
+## Scenario: Frequent region server restarts
 
 ### Issue
 
-Nodes reboot periodically. From the regionserver logs you may see entries similar to:
+Nodes reboot periodically. In the region server logs, you may see entries similar to:
 
 ```
 2017-05-09 17:45:07,683 WARN  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 31000ms
@@ -97,15 +101,15 @@ Nodes reboot periodically. From the regionserver logs you may see entries simila
 
 ### Cause
 
-Long regionserver JVM GC pause. The pause will cause regionserver to be unresponsive and not able to send heart beat to HMaster within the zk session timeout 40s. HMaster will believe regionserver is dead and will abort the regionserver and restart.
+A long region server JVM garbage collection (GC) pause. The pause causes the region server to become unresponsive, so it cannot send a heartbeat to HMaster within the 40-second ZooKeeper session timeout. HMaster then considers the region server dead, aborts it, and restarts it.
 
 ### Resolution
 
-Change the zookeeper session timeout, not only hbase-site setting `zookeeper.session.timeout` but also zookeeper zoo.cfg setting `maxSessionTimeout` need to be changed.
+Change the ZooKeeper session timeout. Both the `hbase-site` setting `zookeeper.session.timeout` and the ZooKeeper `zoo.cfg` setting `maxSessionTimeout` need to be changed.
 
 1. Access Ambari UI, go to **HBase -> Configs -> Settings**, in Timeouts section, change the value of Zookeeper Session Timeout.
 
-1. Access Ambari UI, go to **Zookeeper -> Configs -> Custom** zoo.cfg, add/change the following setting. Make sure the value is the same as hbase `zookeeper.session.timeout`.
+1. Access Ambari UI, go to **Zookeeper -> Configs -> Custom** `zoo.cfg`, add/change the following setting. Make sure the value is the same as HBase `zookeeper.session.timeout`.
 
     ```
     Key: maxSessionTimeout Value: 120000  
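
Note on the "Unexpected files identified during startup process" resolution in the file above: the step of locating the problem file can be sketched as a small Python helper. This is only an illustration, not part of the article or of any HDInsight tooling. It assumes you have already captured the text output of `hdfs dfs -ls -R <path>` and that each entry has the usual eight columns with the path last. The function name `find_suspect_files` and the sample paths are hypothetical.

```python
import fnmatch
import posixpath
from typing import List

# File names the article calls out: WASB rename journals and the "$$$.$$$" temp file.
SUSPECT_PATTERNS = ("*-renamePending.json", "$$$.$$$")


def find_suspect_files(ls_output: str) -> List[str]:
    """Return paths from captured `hdfs dfs -ls -R` output whose file name
    matches one of the suspect patterns."""
    suspects = []
    for line in ls_output.splitlines():
        # Assumed columns: perms, replication, owner, group, size, date, time, path.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skips summary lines such as "Found N items" and blanks
        path = fields[7]
        name = posixpath.basename(path)
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in SUSPECT_PATTERNS):
            suspects.append(path)
    return suspects
```

The helper only reports candidates. Deleting them is still a manual step, done in Cloud Explorer or with `hdfs dfs -rm` as the article describes.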
diff --git a/articles/hdinsight/hdinsight-troubleshoot-guide.md b/articles/hdinsight/hdinsight-troubleshoot-guide.md
@@ -4,16 +4,16 @@ description: Troubleshoot Apache Hadoop workloads by using Azure HDInsight. Step
 author: hrasheed-msft
 ms.author: hrasheed
 ms.service: hdinsight
-ms.topic: conceptual
-ms.date: 05/29/2019
+ms.topic: troubleshooting
+ms.date: 08/14/2019
 ---
 
 
 # Troubleshoot by using Azure HDInsight
 
 | Apache workload | Top questions |
 |---|---|
-|![HBase](./media/hdinsight-troubleshoot-guide/HBASE.png)<br>[Troubleshoot Apache HBase](hbase/apache-troubleshoot-hbase.md)|<br>[How do I run hbck command reports with multiple unassigned regions?](hbase/apache-troubleshoot-hbase.md#how-do-i-run-hbck-command-reports-with-multiple-unassigned-regions)<br><br>[How do I fix timeout issues when using hbck commands for region assignments?](hbase/apache-troubleshoot-hbase.md#how-do-i-fix-timeout-issues-with-hbck-commands-for-region-assignments)<br><br>[How do I fix JDBC or SQLLine connectivity issues with Apache Phoenix?](hbase/apache-troubleshoot-hbase.md#how-do-i-fix-jdbc-or-sqlline-connectivity-issues-with-apache-phoenix)<br><br>[What causes a master server to fail to start?](hbase/apache-troubleshoot-hbase.md#what-causes-a-master-server-to-fail-to-start)<br><br>[What causes a restart failure on a region server?](hbase/apache-troubleshoot-hbase.md#what-causes-a-restart-failure-on-a-region-server)|
+|![HBase](./media/hdinsight-troubleshoot-guide/HBASE.png)<br>[Troubleshoot Apache HBase](hbase/apache-troubleshoot-hbase.md)|<br>[How do I run hbck command reports with multiple unassigned regions?](hbase/apache-troubleshoot-hbase.md#how-do-i-run-hbck-command-reports-with-multiple-unassigned-regions)<br><br>[How do I fix timeout issues when using hbck commands for region assignments?](hbase/apache-troubleshoot-hbase.md#how-do-i-fix-timeout-issues-with-hbck-commands-for-region-assignments)<br><br>[How do I fix JDBC or SQLLine connectivity issues with Apache Phoenix?](hbase/apache-troubleshoot-hbase.md#how-do-i-fix-jdbc-or-sqlline-connectivity-issues-with-apache-phoenix)<br><br>[What causes a master server to fail to start?](hbase/hbase-troubleshoot-start-fails.md)<br><br>[What causes a restart failure on a region server?](hbase/apache-troubleshoot-hbase.md#what-causes-a-restart-failure-on-a-region-server)|
 |![HDFS](./media/hdinsight-troubleshoot-guide/HDFS.png)<br>[Troubleshoot Apache Hadoop HDFS](hdinsight-troubleshoot-hdfs.md)|<br>[How do I access a local HDFS from inside a cluster?](hdinsight-troubleshoot-hdfs.md#how-do-i-access-local-hdfs-from-inside-a-cluster)<br><br>[Local HDFS stuck in safe mode on Azure HDInsight cluster](hadoop/hdinsight-hdfs-troubleshoot-safe-mode.md)|
 |![Hive](./media/hdinsight-troubleshoot-guide/HIVE.png)<br>[Troubleshoot Apache Hive](hdinsight-troubleshoot-hive.md)|<br>[How do I export a Hive metastore and import it on another cluster?](hdinsight-troubleshoot-hive.md#how-do-i-export-a-hive-metastore-and-import-it-on-another-cluster)<br><br>[How do I locate Apache Hive logs on a cluster?](hdinsight-troubleshoot-hive.md#how-do-i-locate-hive-logs-on-a-cluster)<br><br>[How do I launch the Apache Hive shell with specific configurations on a cluster?](hdinsight-troubleshoot-hive.md#how-do-i-launch-the-hive-shell-with-specific-configurations-on-a-cluster)<br><br>[How do I analyze Apache Tez DAG data on a cluster-critical path?](hdinsight-troubleshoot-hive.md#how-do-i-analyze-tez-dag-data-on-a-cluster-critical-path)<br><br>[How do I download Apache Tez DAG data from a cluster?](hdinsight-troubleshoot-hive.md#how-do-i-download-tez-dag-data-from-a-cluster)|
 |![Spark](./media/hdinsight-troubleshoot-guide/SPARK.png)<br>[Troubleshoot Apache Spark](hdinsight-troubleshoot-SPARK.md)|<br>[How do I configure an Apache Spark application by using Apache Ambari on clusters?](spark/apache-troubleshoot-spark.md#how-do-i-configure-an-apache-spark-application-by-using-apache-ambari-on-clusters)<br><br>[How do I configure an Apache Spark application by using a Jupyter notebook on clusters?](spark/apache-troubleshoot-spark.md#how-do-i-configure-an-apache-spark-application-by-using-a-jupyter-notebook-on-clusters)<br><br>[How do I configure an Apache Spark application by using Apache Livy on clusters?](spark/apache-troubleshoot-spark.md#how-do-i-configure-an-apache-spark-application-by-using-apache-livy-on-clusters)<br><br>[How do I configure an Apache Spark application by using spark-submit on clusters?](spark/apache-troubleshoot-spark.md#how-do-i-configure-an-apache-spark-application-by-using-spark-submit-on-clusters)<br><br>[How do I configure an Apache Spark application by using IntelliJ?](spark/apache-spark-intellij-tool-plugin.md)<br><br>[How do I configure an Apache Spark application by using Eclipse?](spark/apache-spark-eclipse-tool-plugin.md)<br><br>[How do I configure an Apache Spark application by using VSCode?](hdinsight-for-vscode.md)<br><br>[What causes an Apache Spark application OutOfMemoryError exception?](spark/apache-troubleshoot-spark.md#what-causes-an-apache-spark-application-outofmemoryerror-exception)|