Commit 3225438

Merge pull request #106325 from dagiro/freshness14
freshness14
2 parents 9cdc0e2 + 966575f commit 3225438

1 file changed

+14
-17
lines changed

articles/hdinsight/hbase/apache-hbase-overview.md

Lines changed: 14 additions & 17 deletions
@@ -2,12 +2,13 @@
 title: What is Apache HBase in Azure HDInsight?
 description: An introduction to Apache HBase in HDInsight, a NoSQL database build on Hadoop. Learn about use cases and compare HBase to other Hadoop clusters.
 author: hrasheed-msft
+ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive,hdiseo17may2017
 ms.topic: overview
-ms.date: 06/12/2019
-ms.author: hrasheed
+ms.custom: hdinsightactive,hdiseo17may2017
+ms.date: 03/03/2020
+
 #Customer intent: As a developer new to Apache HBase and Apache HBase in Azure HDInsight, I want to have a basic understanding of Microsoft's implementation of Apache HBase in Azure HDInsight so I can decide if I want to use it rather than build my own cluster.
 ---

@@ -19,31 +20,27 @@ From user perspective, HBase is similar to a database. Data is stored in the row

 ## How is Apache HBase implemented in Azure HDInsight?

-HDInsight HBase is offered as a managed cluster that is integrated into the Azure environment. The clusters are configured to store data directly in [Azure Storage](./../hdinsight-hadoop-use-blob-storage.md) which provides low latency and increased elasticity in performance and cost choices. This enables customers to build interactive websites that work with large datasets, to build services that store sensor and telemetry data from millions of end points, and to analyze this data with Hadoop jobs. HBase and Hadoop are good starting points for big data project in Azure; in particular, they can enable real-time applications to work with large datasets.
+HDInsight HBase is offered as a managed cluster that is integrated into the Azure environment. The clusters are configured to store data directly in [Azure Storage](./../hdinsight-hadoop-use-blob-storage.md), which provides low latency and increased elasticity in performance and cost choices. This enables customers to build interactive websites that work with large datasets, to build services that store sensor and telemetry data from millions of end points, and to analyze this data with Hadoop jobs. HBase and Hadoop are good starting points for big data project in Azure; in particular, they can enable real-time applications to work with large datasets.

 The HDInsight implementation leverages the scale-out architecture of HBase to provide automatic sharding of tables, strong consistency for reads and writes, and automatic failover. Performance is enhanced by in-memory caching for reads and high-throughput streaming for writes. HBase cluster can be created inside virtual network. For details, see [Create HDInsight clusters on Azure Virtual Network](./apache-hbase-provision-vnet.md).

 ## How is data managed in HDInsight HBase?
+
 Data can be managed in HBase by using the `create`, `get`, `put`, and `scan` commands from the HBase shell. Data is written to the database by using `put` and read by using `get`. The `scan` command is used to obtain data from multiple rows in a table. Data can also be managed using the HBase C# API, which provides a client library on top of the HBase REST API. An HBase database can also be queried by using [Apache Hive](https://hive.apache.org/). For an introduction to these programming models, see [Get started using Apache HBase with Apache Hadoop in HDInsight](./apache-hbase-tutorial-get-started-linux.md). Coprocessors are also available, which allow data processing in the nodes that host the database.

 > [!NOTE]
 > Thrift is not supported by HBase in HDInsight.

-## Scenarios: Use cases for Apache HBase
+## Use cases for Apache HBase
+
 The canonical use case for which BigTable (and by extension, HBase) was created from web search. Search engines build indexes that map terms to the web pages that contain them. But there are many other use cases that HBase is suitable for—several of which are itemized in this section.

-* Key-value store
-
-HBase can be used as a key-value store, and it is suitable for managing message systems. Facebook uses HBase for their messaging system, and it is ideal for storing and managing Internet communications. WebTable uses HBase to search for and manage tables that are extracted from webpages.
-* Sensor data
-
-HBase is useful for capturing data that is collected incrementally from various sources. This includes social analytics, time series, keeping interactive dashboards up-to-date with trends and counters, and managing audit log systems. Examples include Bloomberg trader terminal and the Open Time Series Database (OpenTSDB), which stores and provides access to metrics collected about the health of server systems.
-* Real-time query
-
-[Apache Phoenix](https://phoenix.apache.org/) is a SQL query engine for Apache HBase. It is accessed as a JDBC driver, and it enables querying and managing HBase tables by using SQL.
-* HBase as a platform
-
-Applications can run on top of HBase by using it as a datastore. Examples include Phoenix, [OpenTSDB](http://opentsdb.net/), Kiji, and Titan. Applications can also integrate with HBase. Examples include [Apache Hive](https://hive.apache.org/), [Apache Pig](https://pig.apache.org/), [Solr](https://lucene.apache.org/solr/), [Apache Storm](https://storm.apache.org/), [Apache Flume](https://flume.apache.org/), [Apache Impala](https://impala.apache.org/), [Apache Spark](https://spark.apache.org/) , [Ganglia](http://ganglia.info/), and [Apache Drill](https://drill.apache.org/).
+|Scenario |Description |
+|---|---|
+|Key-value store|HBase can be used as a key-value store, and it's suitable for managing message systems. Facebook uses HBase for their messaging system, and it's ideal for storing and managing Internet communications. WebTable uses HBase to search for and manage tables that are extracted from webpages.|
+|Sensor data|HBase is useful for capturing data that is collected incrementally from various sources. This includes social analytics, time series, keeping interactive dashboards up to date with trends and counters, and managing audit log systems. Examples include Bloomberg trader terminal and the Open Time Series Database (OpenTSDB), which stores and provides access to metrics collected about the health of server systems.|
+|Real-time query|[Apache Phoenix](https://phoenix.apache.org/) is a SQL query engine for Apache HBase. It's accessed as a JDBC driver, and it enables querying and managing HBase tables by using SQL.|
+|HBase as a platform|Applications can run on top of HBase by using it as a datastore. Examples include Phoenix, [OpenTSDB](http://opentsdb.net/), Kiji, and Titan. Applications can also integrate with HBase. Examples include [Apache Hive](https://hive.apache.org/), [Apache Pig](https://pig.apache.org/), [Solr](https://lucene.apache.org/solr/), [Apache Storm](https://storm.apache.org/), [Apache Flume](https://flume.apache.org/), [Apache Impala](https://impala.apache.org/), [Apache Spark](https://spark.apache.org/) , [Ganglia](http://ganglia.info/), and [Apache Drill](https://drill.apache.org/).|

 ## Next steps

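Illustrative note on the changed article: the `put`, `get`, and `scan` commands described under "How is data managed in HDInsight HBase?" can be pictured with a small in-memory model. The sketch below is not an HBase client and does not talk to a cluster; `ToyTable`, the `Contacts` table, the `Personal` and `Office` column families, and the sample values are all hypothetical names chosen for illustration. It only assumes what HBase documents about its data model: cells are addressed by row key and `family:qualifier`, rows are kept sorted by key, and a scan's start row is inclusive while its stop row is exclusive.

```python
from bisect import bisect_left, insort


class ToyTable:
    """In-memory stand-in for one HBase table (hypothetical, not an HBase API)."""

    def __init__(self, name, column_families):
        # Mirrors `create '<table>', '<family>', ...`: families are fixed up front.
        self.name = name
        self.column_families = set(column_families)
        self._row_keys = []  # kept sorted, as HBase orders rows by key
        self._rows = {}      # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        """Write one cell, creating the row if it does not exist yet."""
        family = column.split(":", 1)[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        if row_key not in self._rows:
            insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Read every cell of a single row (empty dict if the row is absent)."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_row="", stop_row=None):
        """Yield rows in key order; start_row is inclusive, stop_row exclusive."""
        first = bisect_left(self._row_keys, start_row)
        for key in self._row_keys[first:]:
            if stop_row is not None and key >= stop_row:
                break
            yield key, dict(self._rows[key])


# Hypothetical sample data, written out of order on purpose.
contacts = ToyTable("Contacts", ["Personal", "Office"])
contacts.put("1002", "Personal:Name", "Sample Person B")
contacts.put("1000", "Personal:Name", "Sample Person A")
contacts.put("1000", "Office:Phone", "555-0100")
contacts.put("1001", "Personal:Name", "Sample Person C")

single_row = contacts.get("1000")
all_keys = [key for key, _ in contacts.scan()]
ranged_keys = [key for key, _ in contacts.scan("1001", "1002")]
```

Here `get` returns the cells of exactly one row, while `scan` walks several rows in sorted key order, which is the distinction the article draws between the two commands. Because rows come back sorted by key, row-key design determines which range scans are cheap.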