# Troubleshoot Apache HBase by using Azure HDInsight
13
13
14
14
Learn about the top issues and their resolutions when working with Apache HBase payloads in Apache Ambari.
15
15
16
-
## How do I run hbck command reports with multiple unassigned regions?
16
+
## How do I fix JDBC or SQLLine connectivity issues with Apache Phoenix?
17
17
18
-
A common error message that you might see when you run the `hbase hbck` command is "multiple regions being unassigned or holes in the chain of regions."
18
+
### Resolution steps
19
+
20
+
To connect with Apache Phoenix, you must provide the IP address of an active Apache ZooKeeper node. Ensure that the ZooKeeper service to which sqlline.py is trying to connect is up and running.
21
+
1. Sign in to the HDInsight cluster by using SSH.
22
+
2. Enter the following command:
23
+
24
+
```apache
25
+
/usr/hdp/current/phoenix-client/bin/sqlline.py <IP of machine where Active Zookeeper is running>
26
+
```
27
+
28
+
> [!Note]
29
+
> You can get the IP address of the active ZooKeeper node from the Ambari UI. Go to **HBase** > **Quick Links** > **ZK\* (Active)** > **Zookeeper Info**.
30
+
31
+
3. If sqlline.py connects to Phoenix and does not time out, run the following commands to validate the availability and health of Phoenix:
32
+
33
+
```apache
34
+
!tables
35
+
!quit
36
+
```
37
+
4. If these commands work, Phoenix is healthy, and the earlier failure was probably caused by an incorrect IP address. However, if the command pauses for an extended time and then displays the following error, continue to step 5.
19
38
20
-
In the HBase Master UI, you can see the number of regions that are unbalanced across all region servers. Then, you can run `hbase hbck` command to see holes in the region chain.
39
+
```apache
40
+
Error while connecting to sqlline.py (Hbase - phoenix) Setting property: [isolation, TRANSACTION_READ_COMMITTED] issuing: !connect jdbc:phoenix:10.2.0.7 none none org.apache.phoenix.jdbc.PhoenixDriver Connecting to jdbc:phoenix:10.2.0.7 SLF4J: Class path contains multiple SLF4J bindings.
41
+
```
21
42
22
-
Holes might be caused by the offline regions, so fix the assignments first.
43
+
5. Run the following commands from the head node (hn0) to diagnose the condition of the Phoenix SYSTEM.CATALOG table:
23
44
24
-
To bring the unassigned regions back to a normal state, complete the following steps:
45
+
```apache
46
+
hbase shell
47
+
48
+
count 'SYSTEM.CATALOG'
49
+
```
25
50
26
-
1. Sign in to the HDInsight HBase cluster by using SSH.
27
-
2. To connect with the Apache ZooKeeper shell, run the `hbase zkcli` command.
28
-
3. Run the `rmr /hbase/regions-in-transition` command or the `rmr /hbase-unsecure/regions-in-transition` command.
29
-
4. To exit from the `hbase zkcli` shell, use the `exit` command.
30
-
5. Open the Apache Ambari UI, and then restart the Active HBase Master service.
31
-
6. Run the `hbase hbck` command again (without any options). Check the output of this command to ensure that all regions are being assigned.
51
+
The command should return an error similar to the following:
32
52
53
+
```apache
54
+
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region SYSTEM.CATALOG,,1485464083256.c0568c94033870c517ed36c45da98129. is not online on 10.2.0.5,16020,1489466172189)
55
+
```
56
+
6. In the Apache Ambari UI, complete the following steps to restart the HMaster service on all ZooKeeper nodes:
33
57
34
-
## <a name="how-do-i-fix-timeout-issues-with-hbck-commands-for-region-assignments"></a>How do I fix timeout issues when using hbck commands for region assignments?
58
+
1. In the **Summary** section of HBase, go to **HBase** > **Active HBase Master**.
59
+
2. In the **Components** section, restart the HBase Master service.
60
+
3. Repeat these steps for all remaining **Standby HBase Master** services.
61
+
62
+
It can take up to five minutes for the HBase Master service to stabilize and finish the recovery process. After a few minutes, repeat the sqlline.py commands to confirm that the SYSTEM.CATALOG table is up, and that it can be queried.
63
+
64
+
When the SYSTEM.CATALOG table is back to normal, the connectivity issue to Phoenix should be automatically resolved.
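The "wait a few minutes, then retry" advice can be scripted as a polling loop with a deadline. The following is a minimal sketch, not part of the original article. The helper names `tcp_port_open` and `wait_until` are hypothetical, the five-minute default mirrors the recovery time mentioned above, and port 2181 (ZooKeeper's default client port) is an assumption about the cluster configuration.

```python
import socket
import time
from typing import Callable


def tcp_port_open(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def wait_until(check: Callable[[], bool],
               timeout_s: float = 300.0,
               interval_s: float = 15.0) -> bool:
    """Call check() until it returns True or timeout_s has elapsed.

    Returns True as soon as check() succeeds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_until(lambda: tcp_port_open("10.2.0.7", 2181))` polls the ZooKeeper address that appears in the sample error output for up to five minutes. A reachable port only shows that ZooKeeper accepts connections; the `!tables` check in sqlline.py is still needed to confirm that SYSTEM.CATALOG can be queried.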
65
+
66
+
## What causes a restart failure on a region server?
35
67
36
68
### Issue
37
69
38
-
A potential cause for timeout issues when you use the `hbck` command might be that several regions are in the "in transition" state for a long time. You can see those regions as offline in the HBase Master UI. Because a high number of regions are attempting to transition, HBase Master might time out and be unable to bring those regions back online.
70
+
A restart failure on a region server might be prevented by following best practices. We recommend that you pause heavy workload activity when you are planning to restart HBase region servers. If an application continues to connect with region servers when shutdown is in progress, the region server restart operation will be slower by several minutes. Also, it's a good idea to first flush all the tables. For a reference on how to flush tables, see [HDInsight HBase: How to improve the Apache HBase cluster restart time by flushing tables](https://web.archive.org/web/20190112153155/https://blogs.msdn.microsoft.com/azuredatalake/2016/09/19/hdinsight-hbase-how-to-improve-hbase-cluster-restart-time-by-flushing-tables/).
71
+
72
+
If you initiate the restart operation on HBase region servers from the Apache Ambari UI, you immediately see that the region servers went down, but they don't restart right away.
73
+
74
+
Here's what's happening behind the scenes:
75
+
76
+
1. The Ambari agent sends a stop request to the region server.
77
+
2. The Ambari agent waits for 30 seconds for the region server to shut down gracefully.
78
+
3. If your application continues to connect with the region server, the server won't shut down immediately. The 30-second timeout expires before shutdown occurs.
79
+
4. After 30 seconds, the Ambari agent sends a force-kill (`kill -9`) command to the region server. You can see this in the ambari-agent log (in the /var/log/ directory of the respective worker node):
Because of the abrupt shutdown, the port associated with the process might not be released, even though the region server process is stopped. This situation can lead to an AddressBindException when the region server is starting, as shown in the following logs. You can verify this in the region-server.log in the /var/log/hbase directory on the worker nodes where the region server fails to start.
94
+
95
+
```apache
96
+
97
+
2017-03-21 13:25:47,061 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
98
+
java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
99
+
at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2636)
100
+
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:64)
101
+
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
102
+
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
103
+
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
104
+
at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2651)
articles/hdinsight/hbase/hbase-troubleshoot-timeouts-hbase-hbck.md (10 additions & 8 deletions)
@@ -5,7 +5,7 @@ ms.service: hdinsight
5
5
ms.topic: troubleshooting
6
6
author: hrasheed-msft
7
7
ms.author: hrasheed
8
-
ms.date: 08/01/2019
8
+
ms.date: 08/16/2019
9
9
---
10
10
11
11
# Scenario: Timeouts with 'hbase hbck' command in Azure HDInsight
@@ -18,28 +18,30 @@ Encounter timeouts with `hbase hbck` command when fixing region assignments.
18
18
19
19
## Cause
20
20
21
-
The potential cause here could be several regions under "in transition" state for a long time. Those regions can be seen as offline from Apache HBase Master UI. Due to high number of regions that are attempting to transition, HBase Master could time out and will be unable to bring those regions back to online state.
21
+
A potential cause for timeout issues when you use the `hbck` command might be that several regions are in the "in transition" state for a long time. You can see those regions as offline in the HBase Master UI. Because a high number of regions are attempting to transition, HBase Master might time out and be unable to bring those regions back online.
22
22
23
23
## Resolution
24
24
25
-
1. Sign in to HDInsight HBase cluster using SSH.
25
+
1. Sign in to the HDInsight HBase cluster using SSH.
26
26
27
-
1. Run `hbase zkcli` command to connect with zookeeper shell.
27
+
1. Run the `hbase zkcli` command to connect to the Apache ZooKeeper shell.
28
28
29
29
1. Run `rmr /hbase/regions-in-transition` or `rmr /hbase-unsecure/regions-in-transition` command.
30
30
31
31
1. Exit from `hbase zkcli` shell by using `exit` command.
32
32
33
-
1. Open Ambari UI and restart Active HBase Master service from Ambari.
33
+
1. From the Apache Ambari UI, restart the Active HBase Master service.
34
+
35
+
1. Run the `hbase hbck -fixAssignments` command.
34
36
35
37
1. Monitor the "region in transition" section of the HBase Master UI to make sure no region gets stuck.
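The resolution steps list two alternative znode paths, `/hbase` and `/hbase-unsecure`. Which one applies depends on the `zookeeper.znode.parent` property in `hbase-site.xml` (the HBase default is `/hbase`). The following is a minimal sketch, not part of the original article, that reads the property and builds the matching `rmr` command. The function names are hypothetical, and the file location on a given cluster (commonly under `/etc/hbase/conf/`) is an assumption.

```python
import xml.etree.ElementTree as ET

# HBase falls back to this value when zookeeper.znode.parent is not set.
DEFAULT_ZNODE_PARENT = "/hbase"


def znode_parent(hbase_site_xml: str) -> str:
    """Return the zookeeper.znode.parent value from hbase-site.xml content."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "zookeeper.znode.parent":
            value = (prop.findtext("value") or "").strip()
            return value or DEFAULT_ZNODE_PARENT
    return DEFAULT_ZNODE_PARENT


def rit_cleanup_command(hbase_site_xml: str) -> str:
    """Build the zkcli command that removes the regions-in-transition znode."""
    parent = znode_parent(hbase_site_xml).rstrip("/")
    return f"rmr {parent}/regions-in-transition"
```

The function only builds the command string. Running it inside `hbase zkcli` stays a manual step, which keeps a destructive operation under operator control.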
36
38
37
39
## Next steps
38
40
39
41
If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
40
42
41
-
* Get answers from Azure experts through [Azure Community Support](https://azure.microsoft.com/support/community/).
43
+
- Get answers from Azure experts through [Azure Community Support](https://azure.microsoft.com/support/community/).
42
44
43
-
* Connect with [@AzureSupport](https://twitter.com/azuresupport) - the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.
45
+
- Connect with [@AzureSupport](https://twitter.com/azuresupport), the official Microsoft Azure account for improving customer experience. It connects the Azure community to the right resources: answers, support, and experts.
44
46
45
-
* If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, please review [How to create an Azure support request](https://docs.microsoft.com/azure/azure-supportability/how-to-create-azure-support-request). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
47
+
- If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, review [How to create an Azure support request](https://docs.microsoft.com/azure/azure-supportability/how-to-create-azure-support-request). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
articles/hdinsight/hbase/hbase-troubleshoot-unassigned-regions.md (6 additions & 6 deletions)
@@ -5,7 +5,7 @@ ms.service: hdinsight
5
5
ms.topic: troubleshooting
6
6
author: hrasheed-msft
7
7
ms.author: hrasheed
8
-
ms.date: 08/07/2019
8
+
ms.date: 08/16/2019
9
9
---
10
10
11
11
# Issues with region servers in Azure HDInsight
@@ -22,7 +22,7 @@ When running `hbase hbck` command, you see an error message similar to:
22
22
multiple regions being unassigned or holes in the chain of regions
23
23
```
24
24
25
-
From the Apache HBase Master UI, it can be seen that the count of regions being unbalanced across all the region servers.
25
+
From the Apache HBase Master UI, you can see the number of regions that are unbalanced across all region servers. Then, you can run `hbase hbck` command to see holes in the region chain.
26
26
27
27
### Cause
28
28
@@ -32,15 +32,15 @@ Holes may be the result of offline regions.
32
32
33
33
Fix the assignments. Follow the steps below to bring the unassigned regions back to normal state:
34
34
35
-
1. Sign in to HDInsight HBase cluster using SSH.
35
+
1. Sign in to the HDInsight HBase cluster using SSH.
36
36
37
-
1. Run `hbase zkcli` command to connect with zookeeper shell.
37
+
1. Run `hbase zkcli` command to connect with ZooKeeper shell.
38
38
39
39
1. Run `rmr /hbase/regions-in-transition` or `rmr /hbase-unsecure/regions-in-transition` command.
40
40
41
41
1. Exit zookeeper shell by using `exit` command.
42
42
43
-
1. Open Ambari UI and restart Active HBase Master service from Ambari.
43
+
1. Open the Apache Ambari UI, and then restart the Active HBase Master service.
44
44
45
45
1. Run `hbase hbck` command again (without any further options). Check the output and ensure that all regions are being assigned.
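Checking the `hbase hbck` output by eye is error-prone when the report is long. The following is a minimal sketch, not part of the original article, that extracts the summary from captured output. It assumes the report ends with lines of the form `N inconsistencies detected.` and `Status: OK` or `Status: INCONSISTENT`; verify this against the output of your HBase version. The names `HbckSummary`, `parse_hbck_summary`, and `is_clean` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class HbckSummary(NamedTuple):
    inconsistencies: Optional[int]  # None if the count line was not found
    status: Optional[str]           # None if the status line was not found


def parse_hbck_summary(output: str) -> HbckSummary:
    """Extract the inconsistency count and final status from hbck output."""
    count = re.search(r"^\s*(\d+) inconsistencies detected", output, re.MULTILINE)
    status = re.search(r"^\s*Status:\s*(\S+)", output, re.MULTILINE)
    return HbckSummary(
        inconsistencies=int(count.group(1)) if count else None,
        status=status.group(1) if status else None,
    )


def is_clean(summary: HbckSummary) -> bool:
    """True only when hbck reported zero inconsistencies and status OK."""
    return summary.inconsistencies == 0 and summary.status == "OK"
```

Returning `None` for a missing line, instead of guessing, makes an unexpected output format show up as "not clean" and not as a false success.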
46
46
@@ -56,7 +56,7 @@ Region servers fail to start.
56
56
57
57
Multiple splitting WAL directories.
58
58
59
-
1. Get list of current wals: `hadoop fs -ls -R /hbase/WALs/ > /tmp/wals.out`.
59
+
1. Get list of current WALs: `hadoop fs -ls -R /hbase/WALs/ > /tmp/wals.out`.
60
60
61
61
1. Inspect the `wals.out` file. If there are too many splitting directories (names ending in `-splitting`), the region server is probably failing because of these directories.
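Inspecting `wals.out` can be partly automated. The following is a minimal sketch, not part of the original article, that lists the splitting directories in the captured `hadoop fs -ls -R` output. It assumes the usual listing layout, in which the permission string is the first field and the path is the last, and that WAL paths contain no spaces. The function name `splitting_dirs` is hypothetical.

```python
from typing import List


def splitting_dirs(ls_output: str) -> List[str]:
    """Return directory paths from `hadoop fs -ls -R` output that end in -splitting."""
    found = []
    for line in ls_output.splitlines():
        fields = line.split()
        # A listing line has 8 fields: permissions, replication, owner, group,
        # size, date, time, path. Skip anything shorter, such as "Found N items".
        if len(fields) < 8:
            continue
        # Directories are marked by a leading "d" in the permission string.
        if not fields[0].startswith("d"):
            continue
        path = fields[-1].rstrip("/")
        if path.endswith("-splitting"):
            found.append(path)
    return found
```

With the file from the previous step, `len(splitting_dirs(open("/tmp/wals.out").read()))` gives the number of splitting directories. What counts as "too many" is a judgment call that the article does not quantify.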