articles/hdinsight/hbase/apache-hbase-tutorial-get-started-linux.md (20 additions, 16 deletions)
@@ -4,7 +4,7 @@ description: Follow this Apache HBase tutorial to start using hadoop on HDInsigh
ms.service: azure-hdinsight
ms.topic: tutorial
ms.custom: hdinsightactive, linux-related-content
-ms.date: 05/10/2024
+ms.date: 12/23/2024
---

# Tutorial: Use Apache HBase in Azure HDInsight
@@ -24,13 +24,13 @@ In this tutorial, you learn how to:

* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).

-* Bash. The examples in this article use the Bash shell on Windows 10 for the curl commands. See [Windows Subsystem for Linux Installation Guide for Windows 10](/windows/wsl/install-win10) for installation steps. Other [Unix shells](https://www.gnu.org/software/bash/)will work as well. The curl examples, with some slight modifications, can work on a Windows Command prompt. Or you can use the Windows PowerShell cmdlet [Invoke-RestMethod](/powershell/module/microsoft.powershell.utility/invoke-restmethod).
+* Bash. The examples in this article use the Bash shell on Windows 10 for the curl commands. See [Windows Subsystem for Linux Installation Guide for Windows 10](/windows/wsl/install-win10) for installation steps. Other [Unix shells](https://www.gnu.org/software/bash/) work as well. The curl examples, with some slight modifications, can work on a Windows Command prompt. Or you can use the Windows PowerShell cmdlet [Invoke-RestMethod](/powershell/module/microsoft.powershell.utility/invoke-restmethod).

## Create Apache HBase cluster

-The following procedure uses an Azure Resource Manager template to create an HBase cluster. The template also creates the dependent default Azure Storage account. To understand the parameters used in the procedure and other cluster creation methods, see [Create Linux-based Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
+The following procedure uses an Azure Resource Manager template to create a HBase cluster. The template also creates the dependent default Azure Storage account. To understand the parameters used in the procedure and other cluster creation methods, see [Create Linux-based Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).

-1. Select the following image to open the template in the Azure portal. The template is located in [Azure quickstart templates](https://azure.microsoft.com/resources/templates/).
+1. Select the following image to open the template in the Azure portal. The template is located in [Azure Quickstart Templates](https://azure.microsoft.com/resources/templates/).

<a href="https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fazure-quickstart-templates%2Fmaster%2Fquickstarts%2Fmicrosoft.hdinsight%2Fhdinsight-hbase-linux%2Fazuredeploy.json" target="_blank"><img src="./media/apache-hbase-tutorial-get-started-linux/hdi-deploy-to-azure1.png" alt="Deploy to Azure button for new cluster"></a>

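For readers who prefer a command line over the portal button, a minimal sketch of deploying the same quickstart template with the Azure CLI follows. This is not the procedure the article describes; the resource group name and location are placeholders, and the CLI prompts for the template's required parameters (cluster name, login, and SSH credentials) when they aren't supplied.

```bash
# Assumptions: Azure CLI is installed and signed in (az login); the resource
# group name and location below are placeholders, not values from the article.
az group create --name hbase-tutorial-rg --location eastus2

# Deploy the same hdinsight-hbase-linux quickstart template that the portal
# button points to; the CLI prompts for any required parameters not supplied.
az deployment group create \
  --resource-group hbase-tutorial-rg \
  --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.hdinsight/hdinsight-hbase-linux/azuredeploy.json"
```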
@@ -51,7 +51,7 @@ The following procedure uses an Azure Resource Manager template to create an HBa

3. Select **I agree to the terms and conditions stated above**, and then select **Purchase**. It takes about 20 minutes to create a cluster.

-After an HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, we recommend that you disable the HBase tables before you delete the cluster.
+After a HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, we recommend that you disable the HBase tables before you delete the cluster.

## Create tables and insert data

@@ -67,7 +67,7 @@ In HBase (an implementation of [Cloud BigTable](https://cloud.google.com/bigtabl

**To use the HBase shell**

-1. Use `ssh` command to connect to your HBase cluster. Edit the command below by replacing `CLUSTERNAME` with the name of your cluster, and then enter the command:
+1. Use `ssh` command to connect to your HBase cluster. Edit the following command by replacing `CLUSTERNAME` with the name of your cluster, and then enter the command:
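As a concrete sketch of that connection step (assuming `sshuser` is the SSH account chosen at cluster creation; replace `CLUSTERNAME` with your cluster name):

```bash
# Replace CLUSTERNAME with the name of your HBase cluster; sshuser is assumed
# to be the SSH user chosen when the cluster was created.
ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.net
```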
@@ -79,7 +79,7 @@ In HBase (an implementation of [Cloud BigTable](https://cloud.google.com/bigtabl
hbase shell
```

-1. Use `create` command to create an HBase table with two-column families. The table and column names are case-sensitive. Enter the following command:
+1. Use `create` command to create a HBase table with two-column families. The table and column names are case-sensitive. Enter the following command:

```hbaseshell
create 'Contacts', 'Personal', 'Office'
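The same table can also be created and populated non-interactively by piping commands into `hbase shell` on the cluster headnode; a minimal sketch (the row key and cell values below are illustrative placeholders, not the tutorial's exact sample data):

```bash
# Run on an HBase cluster headnode. Pipes a small batch of shell commands
# into hbase shell; the create fails harmlessly if 'Contacts' already exists.
hbase shell <<'EOF'
create 'Contacts', 'Personal', 'Office'
put 'Contacts', '1000', 'Personal:Name', 'John Dole'
put 'Contacts', '1000', 'Office:Phone', '425-555-0101'
scan 'Contacts'
EOF
```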
@@ -204,23 +204,24 @@ You can query data in HBase tables by using [Apache Hive](https://hive.apache.or
The Hive query to access HBase data need not be executed from the HBase cluster. Any cluster that comes with Hive (including Spark, Hadoop, HBase, or Interactive Query) can be used to query HBase data, provided the following steps are completed:

1. Both clusters must be attached to the same Virtual Network and Subnet
-2. Copy `/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml` from the HBase cluster headnodes to the Hive cluster headnodes and workernodes.
+2. Copy `/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml` from the HBase cluster headnodes to the Hive cluster headnodes and worker nodes.

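A hedged sketch of that copy step, run from an HBase cluster headnode. `HIVENODE` and `sshuser` are placeholders, and the destination path is an assumption that the Hive cluster uses the same layout; repeat for every Hive headnode and worker node.

```bash
# Run from an HBase cluster headnode. HIVENODE and sshuser are placeholders;
# repeat for each headnode and worker node of the Hive cluster.
scp "/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml" \
    sshuser@HIVENODE:/tmp/hbase-site.xml

# On the Hive node, move the file into the matching conf directory
# (assumed location; requires sudo).
ssh sshuser@HIVENODE \
    'sudo mv /tmp/hbase-site.xml "/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml"'
```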
### Secure Clusters

HBase data can also be queried from Hive using ESP-enabled HBase:

1. When following a multi-cluster pattern, both clusters must be ESP-enabled.
2. To allow Hive to query HBase data, make sure that the `hive` user is granted permissions to access the HBase data via the Hbase Apache Ranger plugin
-3. When using separate, ESP-enabled clusters, the contents of `/etc/hosts` from the HBase cluster headnodes must be appended to `/etc/hosts` of the Hive cluster headnodes and workernodes.
+3. When you use separate, ESP-enabled clusters, the contents of `/etc/hosts` from the HBase cluster headnodes must be appended to `/etc/hosts` of the Hive cluster headnodes and worker nodes.
> [!NOTE]
> After scaling either clusters, `/etc/hosts` must be appended again

## Use the HBase REST API via Curl

The HBase REST API is secured via [basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication). You shall always make requests by using Secure HTTP (HTTPS) to help ensure that your credentials are securely sent to the server.

-1. To enable the HBase REST API in the HDInsight cluster, add the following custom startup script to the **Script Action** section. You can add the startup script when you create the cluster or after the cluster has been created. For **Node Type**, select **Region Servers** to ensure that the script executes only in HBase Region Servers.
+1. To enable the HBase REST API in the HDInsight cluster, add the following custom startup script to the **Script Action** section. You can add the startup script when you create the cluster or after the cluster has been created. For **Node Type**, select **Region Servers** to ensure that the script executes only in HBase Region Servers. Script starts HBase REST proxy on 8090 port on Region servers.
+

```bash
#! /bin/bash
@@ -244,7 +245,7 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
fi
```

-1. Set environment variable for ease of use. Edit the commands below by replacing `MYPASSWORD` with the cluster login password. Replace `MYCLUSTERNAME` with the name of your HBase cluster. Then enter the commands.
+1. Set environment variable for ease of use. Edit the following commands by replacing `MYPASSWORD` with the cluster login password. Replace `MYCLUSTERNAME` with the name of your HBase cluster. Then enter the commands.

```bash
export PASSWORD='MYPASSWORD'
@@ -269,7 +270,7 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
-v
```

-The schema is provided in the JSon format.
+The schema is provided in the JSON format.
1. Use the following command to insert some data:

```bash
@@ -298,12 +299,15 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
-v
```

+> [!NOTE]
+> Scan through the cluster endpoint is not supported yet.
+
For more information about HBase Rest, see [Apache HBase Reference Guide](https://hbase.apache.org/book.html#_rest).

-> [!NOTE]
+> [!NOTE]
> Thrift is not supported by HBase in HDInsight.
>
-> When using Curl or any other REST communication with WebHCat, you must authenticate the requests by providing the user name and password for the HDInsight cluster administrator. You must also use the cluster name as part of the Uniform Resource Identifier (URI) used to send the requests to the server:
+> When you use Curl or any other REST communication with WebHCat, you must authenticate the requests by providing the user name and password for the HDInsight cluster administrator. You must also use the cluster name as part of the Uniform Resource Identifier (URI) used to send the requests to the server:
>
> `curl -u <UserName>:<Password> \`
>
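For illustration, once the REST proxy script action is in place and the `CLUSTERNAME` and `PASSWORD` variables from the earlier step are set, a table-listing request looks roughly like this. The `admin` login name and the `/hbaserest` gateway path are assumptions based on the defaults this article uses; adjust them if your cluster exposes the REST proxy elsewhere.

```bash
# Assumes the CLUSTERNAME and PASSWORD environment variables set earlier in
# this section; admin is the assumed cluster login name, and /hbaserest is
# the assumed gateway route to the REST proxy. A plain GET on the API root
# returns the list of tables.
curl -u admin:$PASSWORD \
    -G "https://$CLUSTERNAME.azurehdinsight.net/hbaserest/" \
    -H "Accept: application/json" \
    -v
```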
@@ -337,7 +341,7 @@ HBase in HDInsight ships with a Web UI for monitoring clusters. Using the Web UI

## Cluster recreation

-After an HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, however, we recommend that you disable the HBase tables before you delete the cluster.
+After a HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, however, we recommend that you disable the HBase tables before you delete the cluster.

You can use the HBase command `disable 'Contacts'`.

@@ -353,7 +357,7 @@ If you're not going to continue to use this application, delete the HBase cluste

## Next steps

-In this tutorial, you learned how to create an Apache HBase cluster. And how to create tables and view the data in those tables from the HBase shell. You also learned how to use a Hive query on data in HBase tables. And how to use the HBase C# REST API to create an HBase table and retrieve data from the table. To learn more, see:
+In this tutorial, you learned how to create an Apache HBase cluster. And how to create tables and view the data in those tables from the HBase shell. You also learned how to use a Hive query on data in HBase tables. And how to use the HBase C# REST API to create a HBase table and retrieve data from the table. To learn more, see: