Commit dd3901e

Kafka Connect tutorial
1 parent 530295f commit dd3901e

File tree

1 file changed (+21, -32 lines)


articles/event-hubs/event-hubs-kafka-connect-tutorial.md

Lines changed: 21 additions & 32 deletions
@@ -1,28 +1,19 @@
 ---
-title: Integrate with Apache Kafka Connect- Azure Event Hubs | Microsoft Docs
-description: This article provides information on how to use Kafka Connect with Azure Event Hubs for Kafka.
+title: Integrate with Apache Kafka Connect
+description: This article provides a walkthrough that shows you how to use Kafka Connect with Azure Event Hubs for Kafka.
 ms.topic: how-to
-ms.date: 05/18/2023
+ms.date: 07/31/2024
+# customer intent: As a developer, I want to know how to use Apache Kafka Connect with Azure Event Hubs for Kafka.
 ---
 
 # Integrate Apache Kafka Connect support on Azure Event Hubs
-[Apache Kafka Connect](https://kafka.apache.org/documentation/#connect) is a framework to connect and import/export data from/to any external system such as MySQL, HDFS, and file system through a Kafka cluster. This tutorial walks you through using Kafka Connect framework with Event Hubs.
+[Apache Kafka Connect](https://kafka.apache.org/documentation/#connect) is a framework to connect and import/export data from/to any external system such as MySQL, HDFS, and file system through a Kafka cluster. This article walks you through using Kafka Connect framework with Event Hubs.
 
-
-This tutorial walks you through integrating Kafka Connect with an event hub and deploying basic FileStreamSource and FileStreamSink connectors. While these connectors aren't meant for production use, they demonstrate an end-to-end Kafka Connect scenario where Azure Event Hubs acts as a Kafka broker.
+This article walks you through integrating Kafka Connect with an event hub and deploying basic `FileStreamSource` and `FileStreamSink` connectors. While these connectors aren't meant for production use, they demonstrate an end-to-end Kafka Connect scenario where Azure Event Hubs acts as a Kafka broker.
 
 > [!NOTE]
 > This sample is available on [GitHub](https://github.com/Azure/azure-event-hubs-for-kafka/tree/master/tutorials/connect).
 
-In this tutorial, you take the following steps:
-
-> [!div class="checklist"]
-> * Create an Event Hubs namespace
-> * Clone the example project
-> * Configure Kafka Connect for Event Hubs
-> * Run Kafka Connect
-> * Create connectors
-
 ## Prerequisites
 To complete this walkthrough, make sure you have the following prerequisites:
 
@@ -38,13 +29,13 @@ An Event Hubs namespace is required to send and receive from any Event Hubs serv
 ## Clone the example project
 Clone the Azure Event Hubs repository and navigate to the tutorials/connect subfolder:
 
-```
+```bash
 git clone https://github.com/Azure/azure-event-hubs-for-kafka.git
 cd azure-event-hubs-for-kafka/tutorials/connect
 ```
 
 ## Configure Kafka Connect for Event Hubs
-Minimal reconfiguration is necessary when redirecting Kafka Connect throughput from Kafka to Event Hubs. The following `connect-distributed.properties` sample illustrates how to configure Connect to authenticate and communicate with the Kafka endpoint on Event Hubs:
+Minimal reconfiguration is necessary when redirecting Kafka Connect throughput from Kafka to Event Hubs. The following `connect-distributed.properties` sample illustrates how to configure Connect to authenticate and communicate with the Kafka endpoint on Event Hubs:
 
 ```properties
 # e.g. namespace.servicebus.windows.net:9093
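The `connect-distributed.properties` sample referenced in this hunk points the worker at the Event Hubs Kafka endpoint, which uses SASL/PLAIN with the literal username `$ConnectionString`. As a sketch only (the namespace name and connection string below are placeholders, and `render_worker_config` is a hypothetical helper, not part of this commit), the Event Hubs-specific settings can be generated like this:

```python
# Sketch of the Event Hubs-specific settings in connect-distributed.properties.
# "mynamespace" and the connection string are placeholders, not real values.
def render_worker_config(namespace: str, connection_string: str) -> str:
    bootstrap = f"{namespace}.servicebus.windows.net:9093"
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return "\n".join([
        f"bootstrap.servers={bootstrap}",  # e.g. namespace.servicebus.windows.net:9093
        "security.protocol=SASL_SSL",
        "sasl.mechanism=PLAIN",
        f"sasl.jaas.config={jaas}",
    ])

print(render_worker_config("mynamespace", "Endpoint=sb://..."))
```

The same `security.protocol`, `sasl.mechanism`, and `sasl.jaas.config` values also apply to the embedded producer and consumer settings in the sample.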
@@ -96,41 +87,41 @@ plugin.path={KAFKA.DIRECTORY}/libs # path to the libs directory within the Kafka
 
 In this step, a Kafka Connect worker is started locally in distributed mode, using Event Hubs to maintain cluster state.
 
-1. Save the above `connect-distributed.properties` file locally. Be sure to replace all values in braces.
+1. Save the `connect-distributed.properties` file locally. Be sure to replace all values in braces.
 2. Navigate to the location of the Kafka release on your machine.
-4. Run `./bin/connect-distributed.sh /PATH/TO/connect-distributed.properties`. The Connect worker REST API is ready for interaction when you see `'INFO Finished starting connectors and tasks'`.
+4. Run `./bin/connect-distributed.sh /PATH/TO/connect-distributed.properties`. The Connect worker REST API is ready for interaction when you see `'INFO Finished starting connectors and tasks'`.
 
 > [!NOTE]
 > Kafka Connect uses the Kafka AdminClient API to automatically create topics with recommended configurations, including compaction. A quick check of the namespace in the Azure portal reveals that the Connect worker's internal topics have been created automatically.
 >
 >Kafka Connect internal topics **must use compaction**. The Event Hubs team is not responsible for fixing improper configurations if internal Connect topics are incorrectly configured.
 
 ### Create connectors
-This section walks you through spinning up FileStreamSource and FileStreamSink connectors.
+This section walks you through spinning up `FileStreamSource` and `FileStreamSink` connectors.
 
 1. Create a directory for input and output data files.
 ```bash
 mkdir ~/connect-quickstart
 ```
 
-2. Create two files: one file with seed data from which the FileStreamSource connector reads, and another to which our FileStreamSink connector writes.
+2. Create two files: one file with seed data from which the `FileStreamSource` connector reads, and another to which our `FileStreamSink` connector writes.
 ```bash
 seq 1000 > ~/connect-quickstart/input.txt
 touch ~/connect-quickstart/output.txt
 ```
 
-3. Create a FileStreamSource connector. Be sure to replace the curly braces with your home directory path.
+3. Create a `FileStreamSource` connector. Be sure to replace the curly braces with your home directory path.
 ```bash
 curl -s -X POST -H "Content-Type: application/json" --data '{"name": "file-source","config": {"connector.class":"org.apache.kafka.connect.file.FileStreamSourceConnector","tasks.max":"1","topic":"connect-quickstart","file": "{YOUR/HOME/PATH}/connect-quickstart/input.txt"}}' http://localhost:8083/connectors
 ```
-You should see the event hub `connect-quickstart` on your Event Hubs instance after running the above command.
+You should see the event hub `connect-quickstart` on your Event Hubs instance after running the command.
 4. Check status of source connector.
 ```bash
 curl -s http://localhost:8083/connectors/file-source/status
 ```
-Optionally, you can use [Service Bus Explorer](https://github.com/paolosalvatori/ServiceBusExplorer/releases) to verify that events have arrived in the `connect-quickstart` topic.
+Optionally, you can use [Service Bus Explorer](https://github.com/paolosalvatori/ServiceBusExplorer/releases) to verify that events arrived in the `connect-quickstart` topic.
 
-5. Create a FileStreamSink Connector. Again, make sure you replace the curly braces with your home directory path.
+5. Create a FileStreamSink Connector. Again, make sure you replace the curly braces with your home directory path.
 ```bash
 curl -X POST -H "Content-Type: application/json" --data '{"name": "file-sink", "config": {"connector.class":"org.apache.kafka.connect.file.FileStreamSinkConnector", "tasks.max":"1", "topics":"connect-quickstart", "file": "{YOUR/HOME/PATH}/connect-quickstart/output.txt"}}' http://localhost:8083/connectors
 ```
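The two `curl` calls in this hunk post JSON connector definitions to the Connect REST API at `http://localhost:8083/connectors`. A small sketch of how those payloads are shaped (`make_file_connector` is a hypothetical helper for illustration; the file path is a placeholder standing in for `{YOUR/HOME/PATH}`):

```python
import json

# Sketch: build the JSON body posted to http://localhost:8083/connectors.
# FileStreamSource takes a singular "topic" key; FileStreamSink takes "topics".
def make_file_connector(name: str, connector_class: str, topic: str, path: str) -> str:
    topic_key = "topic" if connector_class.endswith("SourceConnector") else "topics"
    return json.dumps({
        "name": name,
        "config": {
            "connector.class": f"org.apache.kafka.connect.file.{connector_class}",
            "tasks.max": "1",
            topic_key: topic,
            "file": path,  # placeholder for {YOUR/HOME/PATH}/connect-quickstart/...
        },
    })

source_body = make_file_connector(
    "file-source", "FileStreamSourceConnector",
    "connect-quickstart", "/home/user/connect-quickstart/input.txt")
print(source_body)
```

The asymmetry between `topic` and `topics` mirrors the two `--data` payloads in the hunk: sources declare the single topic they write to, while sinks can subscribe to a comma-separated list.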
@@ -149,15 +140,13 @@ This section walks you through spinning up FileStreamSource and FileStreamSink c
 ```
 
 ### Cleanup
-Kafka Connect creates Event Hubs topics to store configurations, offsets, and status that persist even after the Connect cluster has been taken down. Unless this persistence is desired, it's recommended that these topics are deleted. You may also want to delete the `connect-quickstart` Event Hubs that were created during this walkthrough.
+Kafka Connect creates Event Hubs topics to store configurations, offsets, and status that persist even after the Connect cluster has been taken down. Unless this persistence is desired, we recommend that you delete these topics. You might also want to delete the `connect-quickstart` Event Hubs that were created during this walkthrough.
 
-## Next steps
+## Related content
 
 To learn more about Event Hubs for Kafka, see the following articles:
 
-- [Mirror a Kafka broker in an event hub](event-hubs-kafka-mirror-maker-tutorial.md)
-- [Connect Apache Spark to an event hub](event-hubs-kafka-spark-tutorial.md)
-- [Connect Apache Flink to an event hub](event-hubs-kafka-flink-tutorial.md)
-- [Explore samples on our GitHub](https://github.com/Azure/azure-event-hubs-for-kafka)
-- [Connect Akka Streams to an event hub](event-hubs-kafka-akka-streams-tutorial.md)
 - [Apache Kafka developer guide for Azure Event Hubs](apache-kafka-developer-guide.md)
+- [Explore samples on our GitHub](https://github.com/Azure/azure-event-hubs-for-kafka)
+
+
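When both connectors run against the same topic, the tutorial's end state is that the lines seeded into `input.txt` by `seq 1000` reappear in `output.txt`. A self-contained sketch of that end-state check (temporary files stand in for `~/connect-quickstart`; the sink's write is simulated here, since no worker is running):

```python
import tempfile
from pathlib import Path

# Sketch: the tutorial seeds input.txt with `seq 1000`; once the source and
# sink connectors have processed the topic, output.txt should hold the same
# lines. Temp files stand in for ~/connect-quickstart in this simulation.
workdir = Path(tempfile.mkdtemp())
seed = "\n".join(str(i) for i in range(1, 1001)) + "\n"  # what `seq 1000` produces
(workdir / "input.txt").write_text(seed)
(workdir / "output.txt").write_text(seed)  # simulated FileStreamSink result

match = (workdir / "input.txt").read_text() == (workdir / "output.txt").read_text()
print("input and output match:", match)
```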