Commit ee8dbf9

Sreekanth Iyer (Ushta Te Consultancy Services) committed
Improved correctness Score
1 parent 3f3caad commit ee8dbf9

File tree

3 files changed: +12 -12 lines changed


articles/hdinsight/hbase/apache-hbase-migrate-hdinsight-5-1-new-storage-account.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -240,12 +240,12 @@ You can download AzCopy from [Get started with AzCopy](../../storage/common/stor
240240
## Troubleshooting
241241

242242
### Use case 1:
243-
If Hbase masters and region servers up and regions stuck in transition or only one region i.e. `hbase:meta` region is assigned. Waiting for other regions to assign
243+
If the HBase masters and region servers are up, but regions are stuck in transition or only one region (that is, `hbase:meta`) is assigned and HBase is waiting for the other regions to be assigned.
244244

245245
**Solution:**
246246

247247
1. ssh into any ZooKeeper node of the original cluster and run `kinit -k -t /etc/security/keytabs/hbase.service.keytab hbase/<zk FQDN>` if this is an ESP cluster
248-
1. Run `echo "scan `hbase:meta`| hbase shell > meta.out` to read the `hbase:meta` into a file
248+
1. Run `echo "scan 'hbase:meta'" | hbase shell > meta.out` to read the `hbase:meta` table into a file
249249
1. Run `grep "info:sn" meta.out | awk '{print $4}' | sort | uniq` to get all the RS instance names where the regions were present in the old cluster. The output should look like `value=<wn FQDN>,16020,........`
250250
1. Create a dummy WAL dir with that `wn` value (a consolidated sketch of all four steps follows this list)
251251

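Taken together, steps 1-4 can be run as one script. A minimal sketch, with two loudly labeled assumptions: the `hbase/<zk FQDN>` principal is derived from the local host name, and `/hbase-wals` stands in for your cluster's actual WAL root, so verify both before running:

```bash
#!/usr/bin/env bash
# Consolidated sketch of troubleshooting steps 1-4 above. Assumptions to
# verify first: this is an ESP cluster, and /hbase-wals is the cluster's
# WAL root (substitute your actual WAL path).

# Step 1: authenticate as the hbase service principal (ESP clusters only).
kinit -k -t /etc/security/keytabs/hbase.service.keytab "hbase/$(hostname -f)"

# Step 2: read the hbase:meta table into a file.
echo "scan 'hbase:meta'" | hbase shell > meta.out

# Step 3: list the region server instance names from the old cluster;
# each line looks like value=<wn FQDN>,16020,<start code>.
grep "info:sn" meta.out | awk '{print $4}' | sort | uniq

# Step 4: create a dummy WAL dir per old instance name (the path layout is
# an assumption -- adjust to your cluster's WAL root):
# hdfs dfs -mkdir -p "/hbase-wals/<wn FQDN>,16020,<start code>"
```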
articles/hdinsight/hbase/apache-hbase-phoenix-psql.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -106,7 +106,7 @@ Before you start loading data, verify that Phoenix is enabled and that query tim
106106

107107
## Use MapReduce to bulk load tables
108108

109-
For higher-throughput loading distributed over the cluster, use the MapReduce load tool. This loader first converts all data into HFiles, and then provides the created HFiles to HBase.
109+
For higher-throughput loading distributed over the cluster, use the MapReduce load tool. This loader first converts all data into `HFiles`, and then provides the created `HFiles` to HBase.
110110

111111
1. This section continues with the ssh session and the objects created earlier. Create the **Customers** table and **customers.csv** file as needed using the steps above. If necessary, re-establish your ssh connection.
112112

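To make the loader concrete, here is a minimal invocation sketch using Phoenix's MapReduce `CsvBulkLoadTool`. The jar path and input location are assumptions (they vary by cluster version); the **Customers** table and **customers.csv** come from the earlier steps:

```bash
# Hypothetical sketch: bulk load customers.csv into the Customers table.
# The phoenix-client jar path is an assumption; locate it on your cluster.
hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table Customers \
    --input /tmp/customers.csv
```

The tool writes `HFiles` to a temporary output directory via MapReduce and then hands the completed `HFiles` to HBase, which is what makes it faster than single-client loading for large inputs.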
articles/hdinsight/hdinsight-business-continuity-architecture.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -82,7 +82,7 @@ Applications read and write to Spark and Hive Clusters in the primary region whi
8282

8383
Applications read and write to Spark and Hive clusters in the primary region, while standby scaled-down Hive and Spark clusters run in read-only mode in the secondary region. During normal operations, you could choose to offload region-specific Hive and Spark read operations to the secondary.
8484

85-
:::image type="content" source="./media/hdinsight-business-continuity-architecture/active-primary-standby-secondary-spark.png" alt-text="active primary standby secondary Apache Spark .":::
85+
:::image type="content" source="./media/hdinsight-business-continuity-architecture/active-primary-standby-secondary-spark.png" alt-text="active primary standby secondary Apache Spark.":::
8686

8787
## Apache HBase
8888

@@ -120,15 +120,15 @@ HBase replication can be set up in three modes: Leader-Follower, Leader-Leader a
120120

121121
In this cross-region setup, replication is unidirectional from the primary region to the secondary region. Either all tables or specific tables on the primary can be identified for unidirectional replication. During normal operations, the secondary cluster can be used to serve read requests in its own region.
122122

123-
The secondary cluster operates as a normal HBase cluster that can host its own tables and can serve reads and writes from regional applications. However, writes on the replicated tables or tables native to secondary are not replicated back to the primary.
123+
The secondary cluster operates as a normal HBase cluster that can host its own tables and can serve reads and writes from regional applications. However, write on the replicated tables or tables native to secondary are not replicated back to the primary.
124124

125125
:::image type="content" source="./media/hdinsight-business-continuity-architecture/hbase-leader-follower.png" alt-text="HBase leader follower model.":::
126126

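As a sketch of how the unidirectional flow above is typically wired up, these are the standard HBase shell commands run on the primary; the peer ID, table and column family names, and the secondary's ZooKeeper quorum are all placeholders:

```bash
# Hypothetical sketch: point the primary at the secondary and replicate
# one table. All identifiers below are placeholder values.
hbase shell <<'EOF'
add_peer '1', CLUSTER_KEY => "zk0-secondary,zk1-secondary,zk2-secondary:2181:/hbase-unsecure"
alter 'SalesTable', {NAME => 'cf', REPLICATION_SCOPE => 1}
enable_table_replication 'SalesTable'
EOF
```

Because nothing equivalent is configured on the secondary, its writes stay local, matching the one-way behavior described above.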
127-
#### HBase Replication: Leader – Leader model
127+
#### HBase Replication: Leader-Leader model
128128

129-
This cross-region set up is very similar to the unidirectional set up except that replication happens bidirectionally between the primary region and the secondary region. Applications can use both clusters in read–write modes and updates are exchanges asynchronously between them.
129+
This cross-region setup is very similar to the unidirectional setup except that replication happens bidirectionally between the primary region and the secondary region. Applications can use both clusters in read–write modes, and updates are exchanged asynchronously between them.
130130

131-
:::image type="content" source="./media/hdinsight-business-continuity-architecture/hbase-leader-leader.png" alt-text="HBase leader leader model.":::
131+
:::image type="content" source="./media/hdinsight-business-continuity-architecture/hbase-leader-leader.png" alt-text="HBase leader-leader model.":::
132132

133133
#### HBase Replication: Multi-Region or Cyclic
134134

@@ -167,7 +167,7 @@ Disadvantages:
167167

168168
#### Kafka Replication: Active – Active
169169

170-
Active-Active set up involves two regionally separated, VNet peered HDInsight Kafka clusters with bidirectional asynchronous replication with MirrorMaker. In this design, messages consumed by the consumers in the primary are also made available to consumers in secondary and vice versa. Below are some advantages and disadvantages of Active-Active setup.
170+
Active-Active setup involves two regionally separated, VNet-peered HDInsight Kafka clusters with bidirectional asynchronous replication using MirrorMaker. In this design, messages consumed by consumers in the primary are also made available to consumers in the secondary, and vice versa. Below are some advantages and disadvantages of an Active-Active setup.
171171

172172
Advantages:
173173

@@ -179,17 +179,17 @@ Disadvantages:
179179
* The problem of circular replication needs to be addressed; see the MirrorMaker 2 sketch after the diagram below.
180180
* Bidirectional replication leads to higher regional data egress costs.
181181

182-
:::image type="content" source="./media/hdinsight-business-continuity-architecture/kafka-active-active.png" alt-text="Apache Kafka active active model.":::
182+
:::image type="content" source="./media/hdinsight-business-continuity-architecture/kafka-active-active.png" alt-text="Apache Kafka active-active model.":::
183183

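As a sketch of how the bidirectional flow, and the circular-replication concern noted above, can be handled with the newer MirrorMaker 2: both directions are declared in one properties file, and MirrorMaker 2's default replication policy prefixes mirrored topics with the source cluster alias, which prevents replication loops. Cluster aliases and broker addresses below are placeholders:

```bash
# Hypothetical mm2.properties for Active-Active replication between two
# VNet-peered HDInsight Kafka clusters; all addresses are placeholders.
cat > mm2.properties <<'EOF'
clusters = primary, secondary
primary.bootstrap.servers = wn0-primary:9092,wn1-primary:9092
secondary.bootstrap.servers = wn0-secondary:9092,wn1-secondary:9092

# Enable replication in both directions for all topics.
primary->secondary.enabled = true
primary->secondary.topics = .*
secondary->primary.enabled = true
secondary->primary.topics = .*
EOF

# Launch MirrorMaker 2 (ships with Kafka 2.4 and later; the path is an
# assumption -- adjust to your Kafka installation).
/usr/hdp/current/kafka-broker/bin/connect-mirror-maker.sh mm2.properties
```

With this policy, a topic `orders` mirrored from the primary shows up on the secondary as `primary.orders`, so it is never mirrored back.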
184184
## HDInsight Enterprise Security Package
185185

186-
This set up is used to enable multi-user functionality in both primary and secondary, as well as [Microsoft Entra Domain Services replica sets](../active-directory-domain-services/tutorial-create-replica-set.md) to ensure that users can authenticate to both clusters. During normal operations, Ranger policies need to be set up in the Secondary to ensure that users are restricted to Read operations. The below architecture explains how an ESP enabled Hive Active Primary – Standby Secondary set up might look.
186+
This setup is used to enable multi-user functionality in both primary and secondary, as well as [Microsoft Entra Domain Services replica sets](../active-directory-domain-services/tutorial-create-replica-set.md) to ensure that users can authenticate to both clusters. During normal operations, Ranger policies need to be set up in the secondary to ensure that users are restricted to read operations. The following architecture shows how an ESP-enabled Hive Active Primary – Standby Secondary setup might look.
187187

188188
Ranger Metastore replication:
189189

190190
Ranger Metastore is used to persistently store and serve Ranger policies for controlling data authorization. We recommend that you maintain independent Ranger policies in primary and secondary and maintain the secondary as a read replica.
191191

192-
If the requirement is to keep Ranger policies in sync between primary and secondary, use [Ranger Import/Export](https://cwiki.apache.org/confluence/display/RANGER/User+Guide+For+Import-Export) to periodically back-up and import Ranger policies from primary to secondary.
192+
If the requirement is to keep Ranger policies in sync between primary and secondary, use [Ranger Import/Export](https://cwiki.apache.org/confluence/display/RANGER/User+Guide+For+Import-Export) to periodically back up and import Ranger policies from primary to secondary.
193193

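One way to automate that periodic sync is Ranger's policy import/export REST API, which backs the Import/Export feature linked above. A minimal sketch, assuming Ranger admin listens on its default port 6080; host names and credentials are placeholders:

```bash
# Hypothetical sketch: periodically copy Ranger policies from primary to
# secondary. Hosts and credentials are placeholders; run on a schedule.

# Export all policies from the primary Ranger admin as JSON.
curl -u admin:'<password>' \
  "https://primary-ranger.example.com:6080/service/plugins/policies/exportJson" \
  -o ranger-policies.json

# Import them into the secondary Ranger admin, overriding existing policies.
curl -u admin:'<password>' -X POST \
  -F 'file=@ranger-policies.json' \
  "https://secondary-ranger.example.com:6080/service/plugins/policies/importPoliciesFromFile?isOverride=true"
```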
194194
Replicating Ranger policies between primary and secondary can cause the secondary to become write-enabled, which can lead to inadvertent writes on the secondary and, in turn, to data inconsistencies.
195195
