Commit 03522ab

Merge pull request #97412 from dagiro/freshness82
freshness82
2 parents 3fb46be + 3a0d409 commit 03522ab

File tree

1 file changed: +34 -35 lines changed

articles/hdinsight/kafka/apache-kafka-mirroring.md

Lines changed: 34 additions & 35 deletions
@@ -5,9 +5,9 @@ author: hrasheed-msft
ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
-ms.custom: hdinsightactive
ms.topic: conceptual
-ms.date: 05/24/2019
+ms.custom: hdinsightactive
+ms.date: 11/29/2019
---

# Use MirrorMaker to replicate Apache Kafka topics with Kafka on HDInsight
@@ -75,17 +75,17 @@ This architecture features two clusters in different resource groups and virtual

1. Create virtual network peerings. This step will create two peerings: one from **kafka-primary-vnet** to **kafka-secondary-vnet** and one back from **kafka-secondary-vnet** to **kafka-primary-vnet**.
1. Select the **kafka-primary-vnet** virtual network.
-1. Click **Peerings** under **Settings**.
-1. Click **Add**.
+1. Select **Peerings** under **Settings**.
+1. Select **Add**.
1. On the **Add peering** screen, enter the details as shown in the screenshot below.

![HDInsight Kafka add vnet peering](./media/apache-kafka-mirroring/hdi-add-vnet-peering.png)

1. Configure IP advertising:
1. Go to the Ambari dashboard for the primary cluster: `https://PRIMARYCLUSTERNAME.azurehdinsight.net`.
-1. Click **Services** > **Kafka**. Click the **Configs** tab.
-1. Add the following config lines to the bottom **kafka-env template** section. Click **Save**.
-
+1. Select **Services** > **Kafka**. Select the **Configs** tab.
+1. Add the following config lines to the bottom **kafka-env template** section. Select **Save**.
+
```
# Configure Kafka to advertise IP addresses instead of FQDN
IP_ADDRESS=$(hostname -i)
@@ -95,19 +95,19 @@ This architecture features two clusters in different resource groups and virtual
```

1. Enter a note on the **Save Configuration** screen and click **Save**.
-1. If you are prompted with configuration warning, click **Proceed Anyway**.
-1. Click **Ok** on the **Save Configuration Changes**.
-1. Click **Restart** > **Restart All Affected** in the **Restart Required** notification. Click **Confirm Restart All**.
+1. If you're prompted with a configuration warning, click **Proceed Anyway**.
+1. Select **Ok** on the **Save Configuration Changes**.
+1. Select **Restart** > **Restart All Affected** in the **Restart Required** notification. Select **Confirm Restart All**.

![Apache Ambari restart all affected](./media/apache-kafka-mirroring/ambari-restart-notification.png)

1. Configure Kafka to listen on all network interfaces.
1. Stay on the **Configs** tab under **Services** > **Kafka**. In the **Kafka Broker** section set the **listeners** property to `PLAINTEXT://0.0.0.0:9092`.
-1. Click **Save**.
-1. Click **Restart**, and **Confirm Restart All**.
+1. Select **Save**.
+1. Select **Restart**, and **Confirm Restart All**.

1. Record Broker IP addresses and Zookeeper addresses for primary cluster.
-1. Click **Hosts** on the Ambari dashboard.
+1. Select **Hosts** on the Ambari dashboard.
1. Make a note of the IP Addresses for the Brokers and Zookeepers. The broker nodes have **wn** as the first two letters of the host name, and the zookeeper nodes have **zk** as the first two letters of the host name.

![Apache Ambari view node ip addresses](./media/apache-kafka-mirroring/view-node-ip-addresses2.png)
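After the brokers restart, a quick way to confirm that the peering and the `PLAINTEXT://0.0.0.0:9092` listener are working is a port check from an SSH session on a node in the peered network. This is only a sketch; the IP address below is a placeholder for one of the broker IPs recorded in Ambari.

```bash
# Sketch: substitute one of the broker IP addresses noted above for 10.23.0.11.
# A successful connection shows the broker is reachable across the vnet peering
# on the listener port configured in the steps above.
nc -vz 10.23.0.11 9092
```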
@@ -122,32 +122,32 @@ This architecture features two clusters in different resource groups and virtual

```

-Replace **sshuser** with the SSH user name used when creating the cluster. Replace **BASENAME** with the base name used when creating the cluster.
+Replace **sshuser** with the SSH user name used when creating the cluster. Replace **PRIMARYCLUSTER** with the base name used when creating the cluster.

For information, see [Use SSH with HDInsight](../hdinsight-hadoop-linux-use-ssh-unix.md).

-2. Use the following command to create a variable with the Apache Zookeeper hosts for the primary cluster. The strings like `ZOOKEEPER_IP_ADDRESS1` must be replaced with the actual IP addresses recorded earlier, such as `10.23.0.11` and `10.23.0.7`. If you are using FQDN resolution with a custom DNS server, follow [these steps](apache-kafka-get-started.md#getkafkainfo) to get broker and zookeeper names.:
+1. Use the following command to create a variable with the Apache Zookeeper hosts for the primary cluster. The strings like `ZOOKEEPER_IP_ADDRESS1` must be replaced with the actual IP addresses recorded earlier, such as `10.23.0.11` and `10.23.0.7`. If you're using FQDN resolution with a custom DNS server, follow [these steps](apache-kafka-get-started.md#getkafkainfo) to get broker and zookeeper names:

```bash
# get the zookeeper hosts for the primary cluster
export PRIMARY_ZKHOSTS='ZOOKEEPER_IP_ADDRESS1:2181, ZOOKEEPER_IP_ADDRESS2:2181, ZOOKEEPER_IP_ADDRESS3:2181'
```

-3. To create a topic named `testtopic`, use the following command:
+1. To create a topic named `testtopic`, use the following command:

```bash
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --replication-factor 2 --partitions 8 --topic testtopic --zookeeper $PRIMARY_ZKHOSTS
```

-3. Use the following command to verify that the topic was created:
+1. Use the following command to verify that the topic was created:

```bash
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper $PRIMARY_ZKHOSTS
```

The response contains `testtopic`.

-4. Use the following to view the Zookeeper host information for this (the **primary**) cluster:
+1. Use the following to view the Zookeeper host information for this (the **primary**) cluster:

```bash
echo $PRIMARY_ZKHOSTS
@@ -157,7 +157,7 @@ This architecture features two clusters in different resource groups and virtual

`10.23.0.11:2181,10.23.0.7:2181,10.23.0.9:2181`

-Save this information. It is used in the next section.
+Save this information. It's used in the next section.

## Configure mirroring

@@ -171,7 +171,7 @@ This architecture features two clusters in different resource groups and virtual

For information, see [Use SSH with HDInsight](../hdinsight-hadoop-linux-use-ssh-unix.md).

-2. A `consumer.properties` file is used to configure communication with the **primary** cluster. To create the file, use the following command:
+1. A `consumer.properties` file is used to configure communication with the **primary** cluster. To create the file, use the following command:

```bash
nano consumer.properties
@@ -190,7 +190,7 @@ This architecture features two clusters in different resource groups and virtual

To save the file, use **Ctrl + X**, **Y**, and then **Enter**.

-3. Before configuring the producer that communicates with the secondary cluster, setup a variable for the broker IP addresses of the **secondary** cluster. Use the following commands to create this variable:
+1. Before configuring the producer that communicates with the secondary cluster, set up a variable for the broker IP addresses of the **secondary** cluster. Use the following commands to create this variable:

```bash
export SECONDARY_BROKERHOSTS='BROKER_IP_ADDRESS1:9092,BROKER_IP_ADDRESS2:9092,BROKER_IP_ADDRESS3:9092'
@@ -200,7 +200,7 @@ This architecture features two clusters in different resource groups and virtual

`10.23.0.14:9092,10.23.0.4:9092,10.23.0.12:9092`

-4. A `producer.properties` file is used to communicate the **secondary** cluster. To create the file, use the following command:
+1. A `producer.properties` file is used to communicate with the **secondary** cluster. To create the file, use the following command:

```bash
nano producer.properties
@@ -217,14 +217,14 @@ This architecture features two clusters in different resource groups and virtual

For more information on producer configuration, see [Producer Configs](https://kafka.apache.org/documentation#producerconfigs) at kafka.apache.org.

-5. Use the following commands to create an environment variable with the IP addresses of the Zookeeper hosts for the secondary cluster:
+1. Use the following commands to create an environment variable with the IP addresses of the Zookeeper hosts for the secondary cluster:

```bash
# get the zookeeper hosts for the secondary cluster
export SECONDARY_ZKHOSTS='ZOOKEEPER_IP_ADDRESS1:2181,ZOOKEEPER_IP_ADDRESS2:2181,ZOOKEEPER_IP_ADDRESS3:2181'
```

-7. The default configuration for Kafka on HDInsight does not allow the automatic creation of topics. You must use one of the following options before starting the Mirroring process:
+1. The default configuration for Kafka on HDInsight doesn't allow the automatic creation of topics. You must use one of the following options before starting the Mirroring process:

* **Create the topics on the secondary cluster**: This option also allows you to set the number of partitions and the replication factor.
@@ -242,9 +242,9 @@ This architecture features two clusters in different resource groups and virtual

1. Go to the Ambari dashboard for the secondary cluster: `https://SECONDARYCLUSTERNAME.azurehdinsight.net`.
1. Click **Services** > **Kafka**. Click the **Configs** tab.
-5. In the __Filter__ field, enter a value of `auto.create`. This filters the list of properties and displays the `auto.create.topics.enable` setting.
-6. Change the value of `auto.create.topics.enable` to true, and then select __Save__. Add a note, and then select __Save__ again.
-7. Select the __Kafka__ service, select __Restart__, and then select __Restart all affected__. When prompted, select __Confirm restart all__.
+1. In the __Filter__ field, enter a value of `auto.create`. This filters the list of properties and displays the `auto.create.topics.enable` setting.
+1. Change the value of `auto.create.topics.enable` to true, and then select __Save__. Add a note, and then select __Save__ again.
+1. Select the __Kafka__ service, select __Restart__, and then select __Restart all affected__. When prompted, select __Confirm restart all__.

![kafka enable auto create topics](./media/apache-kafka-mirroring/kafka-enable-auto-create-topics.png)

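Whichever option you choose, you can confirm that `testtopic` exists on the secondary cluster with the same listing command used earlier on the primary cluster. A sketch, assuming `$SECONDARY_ZKHOSTS` is still set in the SSH session:

```bash
# Sketch: list topics on the secondary cluster. testtopic appears once it has been
# created manually or auto-created by the mirrored traffic.
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper $SECONDARY_ZKHOSTS
```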
@@ -258,13 +258,12 @@ This architecture features two clusters in different resource groups and virtual

The parameters used in this example are:

-* **--consumer.config**: Specifies the file that contains consumer properties. These properties are used to create a consumer that reads from the *primary* Kafka cluster.
-
-* **--producer.config**: Specifies the file that contains producer properties. These properties are used to create a producer that writes to the *secondary* Kafka cluster.
-
-* **--whitelist**: A list of topics that MirrorMaker replicates from the primary cluster to the secondary.
-
-* **--num.streams**: The number of consumer threads to create.
+|Parameter |Description |
+|---|---|
+|--consumer.config|Specifies the file that contains consumer properties. These properties are used to create a consumer that reads from the *primary* Kafka cluster.|
+|--producer.config|Specifies the file that contains producer properties. These properties are used to create a producer that writes to the *secondary* Kafka cluster.|
+|--whitelist|A list of topics that MirrorMaker replicates from the primary cluster to the secondary.|
+|--num.streams|The number of consumer threads to create.|

The consumer on the secondary node is now waiting to receive messages.

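For reference, a MirrorMaker invocation using these parameters looks roughly like the following. This is a sketch rather than the article's exact command: the script path is assumed to match the other Kafka tools used above, and the `--whitelist` and `--num.streams` values are illustrative.

```bash
# Sketch: start MirrorMaker with the consumer and producer property files created earlier.
# --whitelist names the topics to replicate; --num.streams sets the consumer thread count.
/usr/hdp/current/kafka-broker/bin/kafka-mirror-maker.sh \
    --consumer.config consumer.properties \
    --producer.config producer.properties \
    --whitelist testtopic --num.streams 4
```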
@@ -291,7 +290,7 @@ This architecture features two clusters in different resource groups and virtual

The steps in this document created clusters in different Azure resource groups. To delete all of the resources created, you can delete the two resource groups created: **kafka-primary-rg** and **kafka-secondary_rg**. Deleting the resource groups removes all of the resources created by following this document, including clusters, virtual networks, and storage accounts.

-## Next Steps
+## Next steps

In this document, you learned how to use [MirrorMaker](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330) to create a replica of an [Apache Kafka](https://kafka.apache.org/) cluster. Use the following links to discover other ways to work with Kafka:
