Commit e466608

Merge pull request #292313 from sreekzz/patch-268753
Added a new Notes for HBase API Scanner
2 parents ea24909 + 8f991bd commit e466608

1 file changed: +20 -16 lines changed

articles/hdinsight/hbase/apache-hbase-tutorial-get-started-linux.md

Lines changed: 20 additions & 16 deletions
@@ -4,7 +4,7 @@ description: Follow this Apache HBase tutorial to start using hadoop on HDInsigh
44
ms.service: azure-hdinsight
55
ms.topic: tutorial
66
ms.custom: hdinsightactive, linux-related-content
7-
ms.date: 05/10/2024
7+
ms.date: 12/23/2024
88
---
99

1010
# Tutorial: Use Apache HBase in Azure HDInsight
@@ -24,13 +24,13 @@ In this tutorial, you learn how to:
2424

2525
* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
2626

27-
* Bash. The examples in this article use the Bash shell on Windows 10 for the curl commands. See [Windows Subsystem for Linux Installation Guide for Windows 10](/windows/wsl/install-win10) for installation steps. Other [Unix shells](https://www.gnu.org/software/bash/) will work as well. The curl examples, with some slight modifications, can work on a Windows Command prompt. Or you can use the Windows PowerShell cmdlet [Invoke-RestMethod](/powershell/module/microsoft.powershell.utility/invoke-restmethod).
27+
* Bash. The examples in this article use the Bash shell on Windows 10 for the curl commands. See [Windows Subsystem for Linux Installation Guide for Windows 10](/windows/wsl/install-win10) for installation steps. Other [Unix shells](https://www.gnu.org/software/bash/) work as well. The curl examples, with some slight modifications, can work on a Windows Command prompt. Or you can use the Windows PowerShell cmdlet [Invoke-RestMethod](/powershell/module/microsoft.powershell.utility/invoke-restmethod).
2828

2929
## Create Apache HBase cluster
3030

31-
The following procedure uses an Azure Resource Manager template to create an HBase cluster. The template also creates the dependent default Azure Storage account. To understand the parameters used in the procedure and other cluster creation methods, see [Create Linux-based Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
31+
The following procedure uses an Azure Resource Manager template to create a HBase cluster. The template also creates the dependent default Azure Storage account. To understand the parameters used in the procedure and other cluster creation methods, see [Create Linux-based Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
3232

33-
1. Select the following image to open the template in the Azure portal. The template is located in [Azure quickstart templates](https://azure.microsoft.com/resources/templates/).
33+
1. Select the following image to open the template in the Azure portal. The template is located in [Azure Quickstart Templates](https://azure.microsoft.com/resources/templates/).
3434

3535
<a href="https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fazure-quickstart-templates%2Fmaster%2Fquickstarts%2Fmicrosoft.hdinsight%2Fhdinsight-hbase-linux%2Fazuredeploy.json" target="_blank"><img src="./media/apache-hbase-tutorial-get-started-linux/hdi-deploy-to-azure1.png" alt="Deploy to Azure button for new cluster"></a>
3636

@@ -51,7 +51,7 @@ The following procedure uses an Azure Resource Manager template to create an HBa
5151

5252
3. Select **I agree to the terms and conditions stated above**, and then select **Purchase**. It takes about 20 minutes to create a cluster.
5353

54-
After an HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, we recommend that you disable the HBase tables before you delete the cluster.
54+
After a HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, we recommend that you disable the HBase tables before you delete the cluster.
5555

5656
## Create tables and insert data
5757

@@ -67,7 +67,7 @@ In HBase (an implementation of [Cloud BigTable](https://cloud.google.com/bigtabl
6767

6868
**To use the HBase shell**
6969

70-
1. Use `ssh` command to connect to your HBase cluster. Edit the command below by replacing `CLUSTERNAME` with the name of your cluster, and then enter the command:
70+
1. Use `ssh` command to connect to your HBase cluster. Edit the following command by replacing `CLUSTERNAME` with the name of your cluster, and then enter the command:
7171

7272
```cmd
7373
@@ -79,7 +79,7 @@ In HBase (an implementation of [Cloud BigTable](https://cloud.google.com/bigtabl
7979
hbase shell
8080
```
8181
82-
1. Use `create` command to create an HBase table with two-column families. The table and column names are case-sensitive. Enter the following command:
82+
1. Use `create` command to create a HBase table with two-column families. The table and column names are case-sensitive. Enter the following command:
8383
8484
```hbaseshell
8585
create 'Contacts', 'Personal', 'Office'
@@ -204,23 +204,24 @@ You can query data in HBase tables by using [Apache Hive](https://hive.apache.or
204204
The Hive query to access HBase data need not be executed from the HBase cluster. Any cluster that comes with Hive (including Spark, Hadoop, HBase, or Interactive Query) can be used to query HBase data, provided the following steps are completed:
205205
206206
1. Both clusters must be attached to the same Virtual Network and Subnet
207-
2. Copy `/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml` from the HBase cluster headnodes to the Hive cluster headnodes and workernodes.
207+
2. Copy `/usr/hdp/$(hdp-select --version)/hbase/conf/hbase-site.xml` from the HBase cluster headnodes to the Hive cluster headnodes and worker nodes.
208208
209209
### Secure Clusters
210210
211211
HBase data can also be queried from Hive using ESP-enabled HBase:
212212
213213
1. When following a multi-cluster pattern, both clusters must be ESP-enabled.
214214
2. To allow Hive to query HBase data, make sure that the `hive` user is granted permissions to access the HBase data via the Hbase Apache Ranger plugin
215-
3. When using separate, ESP-enabled clusters, the contents of `/etc/hosts` from the HBase cluster headnodes must be appended to `/etc/hosts` of the Hive cluster headnodes and workernodes.
215+
3. When you use separate, ESP-enabled clusters, the contents of `/etc/hosts` from the HBase cluster headnodes must be appended to `/etc/hosts` of the Hive cluster headnodes and worker nodes.
216216
> [!NOTE]
217217
> After scaling either clusters, `/etc/hosts` must be appended again
218218
219219
## Use the HBase REST API via Curl
220220
221221
The HBase REST API is secured via [basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication). You shall always make requests by using Secure HTTP (HTTPS) to help ensure that your credentials are securely sent to the server.
222222
223-
1. To enable the HBase REST API in the HDInsight cluster, add the following custom startup script to the **Script Action** section. You can add the startup script when you create the cluster or after the cluster has been created. For **Node Type**, select **Region Servers** to ensure that the script executes only in HBase Region Servers.
223+
1. To enable the HBase REST API in the HDInsight cluster, add the following custom startup script to the **Script Action** section. You can add the startup script when you create the cluster or after the cluster has been created. For **Node Type**, select **Region Servers** to ensure that the script executes only in HBase Region Servers. Script starts HBase REST proxy on 8090 port on Region servers.
224+
224225
225226
```bash
226227
#! /bin/bash
@@ -244,7 +245,7 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
244245
fi
245246
```
246247
247-
1. Set environment variable for ease of use. Edit the commands below by replacing `MYPASSWORD` with the cluster login password. Replace `MYCLUSTERNAME` with the name of your HBase cluster. Then enter the commands.
248+
1. Set environment variable for ease of use. Edit the following commands by replacing `MYPASSWORD` with the cluster login password. Replace `MYCLUSTERNAME` with the name of your HBase cluster. Then enter the commands.
248249
249250
```bash
250251
export PASSWORD='MYPASSWORD'
@@ -269,7 +270,7 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
269270
-v
270271
```
271272
272-
The schema is provided in the JSon format.
273+
The schema is provided in the JSON format.
273274
1. Use the following command to insert some data:
274275
275276
```bash
@@ -298,12 +299,15 @@ The HBase REST API is secured via [basic authentication](https://en.wikipedia.or
298299
-v
299300
```
300301
302+
> [!NOTE]
303+
> Scan through the cluster endpoint is not supported yet.
304+
301305
For more information about HBase Rest, see [Apache HBase Reference Guide](https://hbase.apache.org/book.html#_rest).
302306
303-
> [!NOTE]
307+
> [!NOTE]
304308
> Thrift is not supported by HBase in HDInsight.
305309
>
306-
> When using Curl or any other REST communication with WebHCat, you must authenticate the requests by providing the user name and password for the HDInsight cluster administrator. You must also use the cluster name as part of the Uniform Resource Identifier (URI) used to send the requests to the server:
310+
> When you use Curl or any other REST communication with WebHCat, you must authenticate the requests by providing the user name and password for the HDInsight cluster administrator. You must also use the cluster name as part of the Uniform Resource Identifier (URI) used to send the requests to the server:
307311
>
308312
> `curl -u <UserName>:<Password> \`
309313
>
@@ -337,7 +341,7 @@ HBase in HDInsight ships with a Web UI for monitoring clusters. Using the Web UI
337341
338342
## Cluster recreation
339343
340-
After an HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, however, we recommend that you disable the HBase tables before you delete the cluster.
344+
After a HBase cluster is deleted, you can create another HBase cluster by using the same default blob container. The new cluster picks up the HBase tables you created in the original cluster. To avoid inconsistencies, however, we recommend that you disable the HBase tables before you delete the cluster.
341345
342346
You can use the HBase command `disable 'Contacts'`.
343347
@@ -353,7 +357,7 @@ If you're not going to continue to use this application, delete the HBase cluste
353357
354358
## Next steps
355359
356-
In this tutorial, you learned how to create an Apache HBase cluster. And how to create tables and view the data in those tables from the HBase shell. You also learned how to use a Hive query on data in HBase tables. And how to use the HBase C# REST API to create an HBase table and retrieve data from the table. To learn more, see:
360+
In this tutorial, you learned how to create an Apache HBase cluster. And how to create tables and view the data in those tables from the HBase shell. You also learned how to use a Hive query on data in HBase tables. And how to use the HBase C# REST API to create a HBase table and retrieve data from the table. To learn more, see:
357361
358362
> [!div class="nextstepaction"]
359363
> [HDInsight HBase overview](./apache-hbase-overview.md)
