articles/hdinsight/kafka/apache-kafka-mirroring.md
author: hrasheed-msft
ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
ms.topic: conceptual
ms.custom: hdinsightactive
ms.date: 11/29/2019
---

# Use MirrorMaker to replicate Apache Kafka topics with Kafka on HDInsight

1. Create virtual network peerings. This step will create two peerings: one from **kafka-primary-vnet** to **kafka-secondary-vnet** and one back from **kafka-secondary-vnet** to **kafka-primary-vnet**.
    1. Select the **kafka-primary-vnet** virtual network.
    1. Select **Peerings** under **Settings**.
    1. Select **Add**.
    1. On the **Add peering** screen, enter the details as shown in the screenshot below.
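
If you prefer to script the peering rather than use the portal, a rough Azure CLI sketch is shown below. The peering name, the resource group and virtual network names, and the subscription placeholder are assumptions based on this article's setup, and exact parameter names can vary by CLI version:

```bash
# Hypothetical sketch only: create one direction of the peering from the Azure CLI.
# Because the VNets are in different resource groups, --remote-vnet must be the
# remote virtual network's full resource ID, not just its name.
az network vnet peering create \
    --name kafka-primary-to-secondary \
    --resource-group kafka-primary-rg \
    --vnet-name kafka-primary-vnet \
    --remote-vnet "/subscriptions/<SUBSCRIPTION_ID>/resourceGroups/kafka-secondary-rg/providers/Microsoft.Network/virtualNetworks/kafka-secondary-vnet" \
    --allow-vnet-access

# Repeat in the other direction (from kafka-secondary-vnet back to kafka-primary-vnet).
```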

1. Go to the Ambari dashboard for the primary cluster: `https://PRIMARYCLUSTERNAME.azurehdinsight.net`.
1. Select **Services** > **Kafka**. Select the **Configs** tab.
1. Add the following config lines to the bottom of the **kafka-env template** section. Select **Save**.

```
# Configure Kafka to advertise IP addresses instead of FQDN
IP_ADDRESS=$(hostname -i)
# ...
```

1. Enter a note on the **Save Configuration** screen and select **Save**.
1. If you're prompted with a configuration warning, select **Proceed Anyway**.
1. Select **Ok** on the **Save Configuration Changes** screen.
1. Select **Restart** > **Restart All Affected** in the **Restart Required** notification. Select **Confirm Restart All**.

1. Configure Kafka to listen on all network interfaces.
    1. Stay on the **Configs** tab under **Services** > **Kafka**. In the **Kafka Broker** section, set the **listeners** property to `PLAINTEXT://0.0.0.0:9092`.
    1. Select **Save**.
    1. Select **Restart**, and **Confirm Restart All**.
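
As an optional check that isn't part of the original article, after the restart you could SSH to a broker node and confirm that the broker is now bound to all interfaces on port 9092:

```bash
# Illustrative verification only: look for a listener on 0.0.0.0:9092.
ss -tln | grep 9092
```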

1. Record the broker IP addresses and Zookeeper addresses for the primary cluster.
    1. Select **Hosts** on the Ambari dashboard.
    1. Make a note of the IP addresses for the brokers and Zookeepers. The broker nodes have **wn** as the first two letters of the host name, and the zookeeper nodes have **zk** as the first two letters of the host name.

Replace **sshuser** with the SSH user name used when creating the cluster. Replace **PRIMARYCLUSTER** with the base name used when creating the cluster.

For more information, see [Use SSH with HDInsight](../hdinsight-hadoop-linux-use-ssh-unix.md).
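
The SSH command itself isn't shown in this excerpt. As a sketch, assuming the standard HDInsight SSH endpoint naming, the connection typically looks like this:

```bash
# Hypothetical example; PRIMARYCLUSTER is the base name used when creating the cluster.
ssh sshuser@PRIMARYCLUSTER-ssh.azurehdinsight.net
```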

1. Use the following command to create a variable with the Apache Zookeeper hosts for the primary cluster. Replace strings like `ZOOKEEPER_IP_ADDRESS1` with the actual IP addresses recorded earlier, such as `10.23.0.11` and `10.23.0.7`. If you're using FQDN resolution with a custom DNS server, follow [these steps](apache-kafka-get-started.md#getkafkainfo) to get broker and zookeeper names:
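
The exact command isn't shown here; as a minimal sketch consistent with the `$PRIMARY_ZKHOSTS` variable used below (the port `2181` matches the Zookeeper addresses shown later), it might look like:

```bash
# Sketch only: substitute the Zookeeper IP addresses recorded earlier.
export PRIMARY_ZKHOSTS='ZOOKEEPER_IP_ADDRESS1:2181,ZOOKEEPER_IP_ADDRESS2:2181,ZOOKEEPER_IP_ADDRESS3:2181'
```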

1. Use the following to view the Zookeeper host information for this (the **primary**) cluster:

```bash
echo $PRIMARY_ZKHOSTS
```

The output is similar to the following text:

`10.23.0.11:2181,10.23.0.7:2181,10.23.0.9:2181`

Save this information. It's used in the next section.

## Configure mirroring

For more information, see [Use SSH with HDInsight](../hdinsight-hadoop-linux-use-ssh-unix.md).

1. A `consumer.properties` file is used to configure communication with the **primary** cluster. To create the file, use the following command:

```bash
nano consumer.properties
```
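
The property values themselves aren't included in this excerpt. As a hedged sketch only (not necessarily the article's exact settings), a minimal MirrorMaker consumer configuration for the primary cluster could look like the following; `mirrorgroup` is a placeholder group name:

```
# Hypothetical contents; the article's actual settings may differ.
# Older MirrorMaker consumers point at Zookeeper; newer releases use
# bootstrap.servers with the primary cluster's broker addresses instead.
zookeeper.connect=10.23.0.11:2181,10.23.0.7:2181,10.23.0.9:2181
group.id=mirrorgroup
```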

To save the file, use **Ctrl + X**, **Y**, and then **Enter**.

1. Before configuring the producer that communicates with the secondary cluster, set up a variable for the broker IP addresses of the **secondary** cluster. Use the following commands to create this variable:
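
The commands are not included in this excerpt. A rough sketch, assuming the same pattern as the `PRIMARY_ZKHOSTS` variable and the broker port `9092` configured earlier (the variable name `SECONDARY_BROKERHOSTS` is an assumption), might be:

```bash
# Sketch only: substitute the broker IP addresses recorded from the secondary cluster.
export SECONDARY_BROKERHOSTS='BROKER_IP_ADDRESS1:9092,BROKER_IP_ADDRESS2:9092,BROKER_IP_ADDRESS3:9092'
```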

1. The default configuration for Kafka on HDInsight doesn't allow the automatic creation of topics. You must use one of the following options before starting the Mirroring process:

    * **Create the topics on the secondary cluster**: This option also allows you to set the number of partitions and the replication factor. A hypothetical command is sketched after this list.
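
For the first option, a topic-creation command run against the secondary cluster might look like the sketch below. The topic name, partition count, and replication factor are placeholders, and `$SECONDARY_ZKHOSTS` is assumed to hold the secondary cluster's Zookeeper hosts, analogous to `$PRIMARY_ZKHOSTS`:

```bash
# Hypothetical example; adjust the topic name, partitions, and replication factor as needed.
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create \
    --zookeeper $SECONDARY_ZKHOSTS \
    --topic testtopic \
    --partitions 8 \
    --replication-factor 3
```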

1. Go to the Ambari dashboard for the secondary cluster: `https://SECONDARYCLUSTERNAME.azurehdinsight.net`.
1. Select **Services** > **Kafka**. Select the **Configs** tab.
1. In the __Filter__ field, enter a value of `auto.create`. This filters the list of properties and displays the `auto.create.topics.enable` setting.
1. Change the value of `auto.create.topics.enable` to `true`, and then select __Save__. Add a note, and then select __Save__ again.
1. Select the __Kafka__ service, select __Restart__, and then select __Restart all affected__. When prompted, select __Confirm restart all__.


The parameters used in this example are:

|Parameter |Description |
|---|---|
|--consumer.config|Specifies the file that contains consumer properties. These properties are used to create a consumer that reads from the *primary* Kafka cluster.|
|--producer.config|Specifies the file that contains producer properties. These properties are used to create a producer that writes to the *secondary* Kafka cluster.|
|--whitelist|A list of topics that MirrorMaker replicates from the primary cluster to the secondary cluster.|
|--num.streams|The number of consumer threads to create.|
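
As a hedged illustration of how these parameters fit together (the script path, topic name, and thread count here are assumptions rather than the article's exact invocation), starting MirrorMaker on the secondary cluster might look like:

```bash
# Hypothetical invocation; consumer.properties and producer.properties are the files created earlier.
/usr/hdp/current/kafka-broker/bin/kafka-mirror-maker.sh \
    --consumer.config consumer.properties \
    --producer.config producer.properties \
    --whitelist testtopic \
    --num.streams 4
```
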
The consumer on the secondary node is now waiting to receive messages.

The steps in this document created clusters in different Azure resource groups. To delete all of the resources created, delete the two resource groups: **kafka-primary-rg** and **kafka-secondary-rg**. Deleting the resource groups removes all of the resources created by following this document, including clusters, virtual networks, and storage accounts.
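
If you'd rather script the cleanup with the Azure CLI than use the portal, deleting the groups could look like the following sketch (`--no-wait` is optional):

```bash
# Deletes both resource groups and everything in them. This is irreversible.
az group delete --name kafka-primary-rg --yes --no-wait
az group delete --name kafka-secondary-rg --yes --no-wait
```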

## Next steps

In this document, you learned how to use [MirrorMaker](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330) to create a replica of an [Apache Kafka](https://kafka.apache.org/) cluster. Use the following links to discover other ways to work with Kafka: