Commit c8b6b2a

Merge pull request #112034 from dagiro/freshness_c14
freshness_c14
2 parents 3662c7c + 52daeaf commit c8b6b2a

1 file changed, +4 -6 lines changed
articles/hdinsight/hdinsight-using-spark-query-hbase.md

Lines changed: 4 additions & 6 deletions
@@ -7,20 +7,18 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
 ms.custom: hdinsightactive
-ms.date: 02/24/2020
+ms.date: 04/20/2020
 ---
 
 # Use Apache Spark to read and write Apache HBase data
 
-Apache HBase is typically queried either with its low-level API (scans, gets, and puts) or with a SQL syntax using Apache Phoenix. Apache also provides the Apache Spark HBase Connector, which is a convenient and performant alternative to query and modify data stored by HBase.
+Apache HBase is typically queried either with its low-level API (scans, gets, and puts) or with a SQL syntax using Apache Phoenix. Apache also provides the Apache Spark HBase Connector. The Connector is a convenient and performant alternative to query and modify data stored by HBase.
 
 ## Prerequisites
 
 * Two separate HDInsight clusters deployed in the same [virtual network](./hdinsight-plan-virtual-network-deployment.md). One HBase, and one Spark with at least Spark 2.1 (HDInsight 3.6) installed. For more information, see [Create Linux-based clusters in HDInsight using the Azure portal](hdinsight-hadoop-create-linux-clusters-portal.md).
 
-* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](hdinsight-hadoop-linux-use-ssh-unix.md).
-
-* The [URI scheme](hdinsight-hadoop-linux-information.md#URI-and-scheme) for your clusters primary storage. This scheme would be wasb:// for Azure Blob Storage, abfs:// for Azure Data Lake Storage Gen2 or adl:// for Azure Data Lake Storage Gen1. If secure transfer is enabled for Blob Storage, the URI would be `wasbs://`. See also, [secure transfer](../storage/common/storage-require-secure-transfer.md).
+* The URI scheme for your clusters primary storage. This scheme would be wasb:// for Azure Blob Storage, `abfs://` for Azure Data Lake Storage Gen2 or adl:// for Azure Data Lake Storage Gen1. If secure transfer is enabled for Blob Storage, the URI would be `wasbs://`. See also, [secure transfer](../storage/common/storage-require-secure-transfer.md).
 
 ## Overall process
 
@@ -147,7 +145,7 @@ In this step, you define a catalog object that maps the schema from Apache Spark
 |}""".stripMargin
 ```
 
-The code does the following:
+The code does the following acts:
 
 a. Define a catalog schema for the HBase table named `Contacts`.
 b. Identify the rowkey as `key`, and map the column names used in Spark to the column family, column name, and column type as used in HBase.
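
For context: the catalog that steps a and b describe sits above the second hunk, and only its closing line (`|}""".stripMargin`) appears in the diff. The following is a minimal sketch of what such a catalog might look like, assuming the JSON catalog layout used by the Apache Spark HBase Connector. The object name `ContactsCatalog`, the column families (`Office`, `Personal`), and the Spark-side column names are illustrative assumptions, not content taken from this commit.

```scala
// Sketch only: the object name, column families, and Spark column names
// are assumed for illustration and do not come from the diff.
object ContactsCatalog {
  // JSON catalog: "table" names the HBase table, "rowkey" names the row key
  // field, and "columns" maps each Spark column to an HBase column family,
  // column name, and type.
  val catalog: String =
    """{
      |"table":{"namespace":"default", "name":"Contacts"},
      |"rowkey":"key",
      |"columns":{
      |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
      |"officeAddress":{"cf":"Office", "col":"Address", "type":"string"},
      |"personalName":{"cf":"Personal", "col":"Name", "type":"string"}
      |}
      |}""".stripMargin

  // Print the catalog so the resulting JSON can be inspected.
  def main(args: Array[String]): Unit = println(catalog)
}
```

The sketch uses only the Scala standard library, so it runs without a cluster. In a Spark shell with the connector on the classpath, a string like this would typically be handed to the connector as a read or write option.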
