Commit ea13549

Merge pull request #198176 from seesharprun/may13-kafka-edit
Update Apache Kafka Connect articles for Azure Cosmos DB
2 parents f8c0842 + f9780c5 commit ea13549

File tree: 9 files changed (+28, -28 lines)

articles/cosmos-db/sql/kafka-connector-sink.md

Lines changed: 20 additions & 20 deletions
@@ -5,26 +5,26 @@ author: kushagrathapar
 ms.service: cosmos-db
 ms.subservice: cosmosdb-sql
 ms.topic: conceptual
-ms.date: 06/28/2021
+ms.date: 05/13/2022
 ms.author: kuthapar
 ---

-# Kafka Connect for Azure Cosmos DB - Sink connector
+# Kafka Connect for Azure Cosmos DB - sink connector
 [!INCLUDE[appliesto-sql-api](../includes/appliesto-sql-api.md)]

 Kafka Connect for Azure Cosmos DB is a connector to read from and write data to Azure Cosmos DB. The Azure Cosmos DB sink connector allows you to export data from Apache Kafka topics to an Azure Cosmos DB database. The connector polls data from Kafka to write to containers in the database based on the topics subscription.

 ## Prerequisites

-* Start with the [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md) because it gives you a complete environment to work with. If you do not wish to use Confluent Platform, then you need to install and configure Zookeeper, Apache Kafka, Kafka Connect, yourself. You will also need to install and configure the Azure Cosmos DB connectors manually.
+* Start with the [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md) because it gives you a complete environment to work with. If you don't wish to use Confluent Platform, you need to install and configure Zookeeper, Apache Kafka, and Kafka Connect yourself. You'll also need to install and configure the Azure Cosmos DB connectors manually.
 * Create an Azure Cosmos DB account and container by following the [setup guide](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/CosmosDB_Setup.md).
 * Bash shell, which is tested on GitHub Codespaces, Mac, Ubuntu, and Windows with WSL2. This shell doesn’t work in Cloud Shell or WSL1.
 * Download [Java 11+](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html)
 * Download [Maven](https://maven.apache.org/download.cgi)

 ## Install sink connector

-If you are using the recommended [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md), the Azure Cosmos DB sink connector is included in the installation, and you can skip this step.
+If you're using the recommended [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md), the Azure Cosmos DB sink connector is included in the installation, and you can skip this step.

 Otherwise, you can download the JAR file from the latest [Release](https://github.com/microsoft/kafka-connect-cosmosdb/releases) or package this repo to create a new JAR file. To install the connector manually using the JAR file, refer to these [instructions](https://docs.confluent.io/current/connect/managing/install.html#install-connector-manually). You can also package a new JAR file from the source code.

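For readers following along outside the diff: packaging the JAR from source is the step both articles reference, and the next hunk's header shows `ls target/*dependencies.jar`. A minimal sketch, assuming a standard Maven build of the cloned repo; the `-DskipTests` flag is an optional convenience, not something the article prescribes:

```bash
# Sketch: clone the connector repo and build the uber JAR with Maven.
git clone https://github.com/microsoft/kafka-connect-cosmosdb.git
cd kafka-connect-cosmosdb
mvn clean package -DskipTests

# The packaged JAR lands under target/, as the hunk header below implies.
ls target/*dependencies.jar
```
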
@@ -42,13 +42,13 @@ ls target/*dependencies.jar

 ## Create a Kafka topic and write data

-If you are using the Confluent Platform, the easiest way to create a Kafka topic is by using the supplied Control Center UX. Otherwise, you can create a Kafka topic manually using the following syntax:
+If you're using the Confluent Platform, the easiest way to create a Kafka topic is by using the supplied Control Center UX. Otherwise, you can create a Kafka topic manually using the following syntax:

 ```bash
 ./kafka-topics.sh --create --zookeeper <ZOOKEEPER_URL:PORT> --replication-factor <NO_OF_REPLICATIONS> --partitions <NO_OF_PARTITIONS> --topic <TOPIC_NAME>
 ```

-For this scenario, we will create a Kafka topic named “hotels” and will write non-schema embedded JSON data to the topic. To create a topic inside Control Center, see the [Confluent guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-2-create-ak-topics).
+For this scenario, we'll create a Kafka topic named “hotels” and write non-schema embedded JSON data to the topic. To create a topic inside Control Center, see the [Confluent guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-2-create-ak-topics).

 Next, start the Kafka console producer to write a few records to the “hotels” topic.

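As an aside for readers: the article's actual sample records sit outside this hunk. Starting the console producer and writing a few records might look like the sketch below; the broker address and the hotel documents are illustrative assumptions:

```bash
# Assumes a local Kafka broker; the address and records are placeholders.
kafka-console-producer --broker-list localhost:9092 --topic hotels
# Paste one JSON document per line at the producer prompt, for example:
# {"id": "h1", "HotelName": "Sample Hotel One", "Description": "Hypothetical record"}
# {"id": "h2", "HotelName": "Sample Hotel Two", "Description": "Hypothetical record"}
```
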
@@ -76,9 +76,9 @@ The three records entered are published to the “hotels” Kafka topic in JSON

 ## Create the sink connector

-Create the Azure Cosmos DB sink connector in Kafka Connect. The following JSON body defines config for the sink connector. Make sure to replace the values for `connect.cosmos.connection.endpoint` and `connect.cosmos.master.key`, properties that you should have saved from the Azure Cosmos DB setup guide in the prerequisites.
+Create an Azure Cosmos DB sink connector in Kafka Connect. The following JSON body defines the config for the sink connector. Make sure to replace the values for `connect.cosmos.connection.endpoint` and `connect.cosmos.master.key`, properties that you should have saved from the Azure Cosmos DB setup guide in the prerequisites.

-Refer to the [sink properties](#sink-configuration-properties) section for more information on each of these configuration properties.
+For more information on each of these configuration properties, see [sink properties](#sink-configuration-properties).

 ```json
 {
@@ -105,11 +105,11 @@ Once you have all the values filled out, save the JSON file somewhere locally. Y

 ### Create connector using Control Center

-An easy option to create the connector is by going through the Control Center webpage. Follow this [installation guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-3-install-a-ak-connector-and-generate-sample-data) to create a connector from Control Center. Instead of using the `DatagenConnector` option, use the `CosmosDBSinkConnector` tile instead. When configuring the sink connector, fill out the values as you have filled in the JSON file.
+An easy option to create the connector is through the Control Center webpage. Follow this [installation guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-3-install-a-ak-connector-and-generate-sample-data) to create a connector from Control Center. Instead of the `DatagenConnector` option, use the `CosmosDBSinkConnector` tile. When configuring the sink connector, fill out the values as you filled them in the JSON file.

 Alternatively, in the connectors page, you can upload the JSON file created earlier by using the **Upload connector config file** option.

-:::image type="content" source="./media/kafka-connector-sink/upload-connector-config.png" alt-text="Upload connector config.":::
+:::image type="content" source="./media/kafka-connector-sink/upload-sink-connector-config.png" lightbox="./media/kafka-connector-sink/upload-sink-connector-config.png" alt-text="Screenshot of 'Upload connector config file' option in the Browse connectors dialog.":::

 ### Create connector using REST API

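The REST call itself falls outside this hunk. As a sketch only: creating the sink connector over the Kafka Connect REST API typically looks like the following, where the `localhost:8083` endpoint (the Connect default), the connector name, and every property except `connect.cosmos.connection.endpoint` and `connect.cosmos.master.key` (named in the article text above) are assumptions rather than values from the article:

```bash
# Hypothetical request; replace the endpoint, key, and names with your own values.
curl -H "Content-Type: application/json" -X POST http://localhost:8083/connectors \
  -d '{
    "name": "cosmosdb-sink-connector",
    "config": {
      "connector.class": "com.azure.cosmos.kafka.connect.sink.CosmosDBSinkConnector",
      "topics": "hotels",
      "connect.cosmos.connection.endpoint": "https://<cosmos-account>.documents.azure.com:443/",
      "connect.cosmos.master.key": "<cosmos-account-primary-key>"
    }
  }'
```
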
@@ -127,9 +127,9 @@ Sign into the [Azure portal](https://portal.azure.com/learn.docs.microsoft.com)

 ## Cleanup

-To delete the connector from the Control Center, navigate to the sink connector you created and click the **Delete** icon.
+To delete the connector from the Control Center, navigate to the sink connector you created and select the **Delete** icon.

-:::image type="content" source="./media/kafka-connector-sink/delete-connector.png" alt-text="Delete connector.":::
+:::image type="content" source="./media/kafka-connector-sink/delete-sink-connector.png" lightbox="./media/kafka-connector-sink/delete-sink-connector.png" alt-text="Screenshot of delete option in the sink connector dialog.":::

 Alternatively, use the Connect REST API to delete:

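The delete command is also cut off by the hunk boundary. A sketch of the equivalent Connect REST API call, assuming the same hypothetical endpoint and connector name as above:

```bash
# Hypothetical: unregister the sink connector by name.
curl -X DELETE http://localhost:8083/connectors/cosmosdb-sink-connector
```
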
@@ -142,7 +142,7 @@ To delete the created Azure Cosmos DB service and its resource group using Azure

 ## <a id="sink-configuration-properties"></a>Sink configuration properties

-The following settings are used to configure the Cosmos DB Kafka sink connector. These configuration values determine which Kafka topics data is consumed, which Azure Cosmos DB container’s data is written into, and formats to serialize the data. For an example configuration file with the default values, refer to [this config](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/src/docker/resources/sink.example.json).
+The following settings are used to configure an Azure Cosmos DB Kafka sink connector. These configuration values determine which Kafka topics' data is consumed, which Azure Cosmos DB containers the data is written into, and the formats used to serialize the data. For an example configuration file with the default values, refer to [this config](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/src/docker/resources/sink.example.json).

 | Name | Type | Description | Required/Optional |
 | :--- | :--- | :--- | :--- |
@@ -191,18 +191,18 @@ The sink Connector also supports the following AVRO logical types:

 ## Single Message Transforms (SMT)

-Along with the sink connector settings, you can specify the use of Single Message Transformations (SMTs) to modify messages flowing through the Kafka Connect platform. For more information, refer to the [Confluent SMT Documentation](https://docs.confluent.io/platform/current/connect/transforms/overview.html).
+Along with the sink connector settings, you can specify the use of Single Message Transformations (SMTs) to modify messages flowing through the Kafka Connect platform. For more information, see the [Confluent SMT Documentation](https://docs.confluent.io/platform/current/connect/transforms/overview.html).

 ### Using the InsertUUID SMT

-You can use the InsertUUID SMT to automatically add item IDs. With the custom `InsertUUID` SMT, you can insert the `id` field with a random UUID value for each message, before it is written to Azure Cosmos DB.
+You can use the InsertUUID SMT to automatically add item IDs. With the custom `InsertUUID` SMT, you can insert the `id` field with a random UUID value for each message, before it's written to Azure Cosmos DB.

 > [!WARNING]
 > Use this SMT only if the messages don’t contain the `id` field. Otherwise, the `id` values will be overwritten and you may end up with duplicate items in your database. Using UUIDs as the message ID can be quick and easy, but they're [not an ideal partition key](https://stackoverflow.com/questions/49031461/would-using-a-substring-of-a-guid-in-cosmosdb-as-partitionkey-be-a-bad-idea) to use in Azure Cosmos DB.

 ### Install the SMT

-Before you can use the `InsertUUID` SMT, you will need to install this transform in your Confluent Platform setup. If you are using the Confluent Platform setup from this repo, the transform is already included in the installation, and you can skip this step.
+Before you can use the `InsertUUID` SMT, you'll need to install this transform in your Confluent Platform setup. If you're using the Confluent Platform setup from this repo, the transform is already included in the installation, and you can skip this step.

 Alternatively, you can package the [InsertUUID source](https://github.com/confluentinc/kafka-connect-insert-uuid) to create a new JAR file. To install the connector manually using the JAR file, refer to these [instructions](https://docs.confluent.io/current/connect/managing/install.html#install-connector-manually).

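To make the InsertUUID usage concrete: a hedged sketch of the transform stanza you'd add to the sink connector's `config` object. The class name follows the linked [InsertUUID source](https://github.com/confluentinc/kafka-connect-insert-uuid) repo rather than this article, and the `insertID` alias is an arbitrary choice:

```json
{
  "transforms": "insertID",
  "transforms.insertID.type": "com.github.cjmatta.kafka.connect.smt.InsertUuid$Value",
  "transforms.insertID.uuid.field.name": "id"
}
```
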
@@ -253,7 +253,7 @@ Here are solutions to some common problems that you may encounter when working w

 ### Read non-JSON data with JsonConverter

-If you have non-JSON data on your source topic in Kafka and attempt to read it using the `JsonConverter`, you will see the following exception:
+If you have non-JSON data on your source topic in Kafka and attempt to read it using the `JsonConverter`, you'll see the following exception:

 ```console
 org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error:
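As the next hunk's header notes, this error usually means the topic's data was serialized as Avro or Protobuf. A common fix, general Kafka Connect practice rather than text from this article, is to override the connector's value converter to match the data; for example, for Avro with Confluent Schema Registry (the registry URL is a placeholder):

```json
{
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://localhost:8081"
}
```
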
@@ -273,7 +273,7 @@ This error is likely caused by data in the source topic being serialized in eith

 ### Read non-Avro data with AvroConverter

-This scenario is applicable when you try to use the Avro converter to read data from a topic that is not in Avro format. Which, includes data written by an Avro serializer other than the Confluent Schema Registry’s Avro serializer, which has its own wire format.
+This scenario applies when you try to use the Avro converter to read data from a topic that isn't in Avro format, including data written by an Avro serializer other than the Confluent Schema Registry’s Avro serializer, which has its own wire format.

 ```console
 org.apache.kafka.connect.errors.DataException: my-topic-name
@@ -314,7 +314,7 @@ Kafka Connect supports a special structure of JSON messages containing both payl
 }
 ```

-If you try to read JSON data that does not contain the data in this structure, you will get the following error:
+If you try to read JSON data that doesn't contain the data in this structure, you'll get the following error:

 ```none
 org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
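For reference, the schema-and-payload envelope that `schemas.enable=true` expects looks like this minimal sketch; the single string field is illustrative, not the article's example:

```json
{
  "schema": {
    "type": "struct",
    "fields": [
      { "type": "string", "optional": false, "field": "id" }
    ]
  },
  "payload": { "id": "h1" }
}
```
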
@@ -329,7 +329,7 @@ To be clear, the only JSON structure that is valid for `schemas.enable=true` has

 ## Limitations

-* Autocreation of databases and containers in Azure Cosmos DB are not supported. The database and containers must already exist, and they must be configured correctly.
+* Autocreation of databases and containers in Azure Cosmos DB isn't supported. The database and containers must already exist, and they must be configured correctly.

 ## Next steps

articles/cosmos-db/sql/kafka-connector-source.md

Lines changed: 8 additions & 8 deletions
@@ -5,26 +5,26 @@ author: kushagrathapar
 ms.service: cosmos-db
 ms.subservice: cosmosdb-sql
 ms.topic: conceptual
-ms.date: 06/28/2021
+ms.date: 05/13/2022
 ms.author: kuthapar
 ---

-# Kafka Connect for Azure Cosmos DB - Source connector
+# Kafka Connect for Azure Cosmos DB - source connector
 [!INCLUDE[appliesto-sql-api](../includes/appliesto-sql-api.md)]

 Kafka Connect for Azure Cosmos DB is a connector to read from and write data to Azure Cosmos DB. The Azure Cosmos DB source connector provides the capability to read data from the Azure Cosmos DB change feed and publish this data to a Kafka topic.

 ## Prerequisites

-* Start with the [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md) because it gives you a complete environment to work with. If you do not wish to use Confluent Platform, then you need to install and configure Zookeeper, Apache Kafka, Kafka Connect, yourself. You will also need to install and configure the Azure Cosmos DB connectors manually.
+* Start with the [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md) because it gives you a complete environment to work with. If you don't wish to use Confluent Platform, you need to install and configure Zookeeper, Apache Kafka, and Kafka Connect yourself. You'll also need to install and configure the Azure Cosmos DB connectors manually.
 * Create an Azure Cosmos DB account and container by following the [setup guide](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/CosmosDB_Setup.md).
 * Bash shell, which is tested on GitHub Codespaces, Mac, Ubuntu, and Windows with WSL2. This shell doesn’t work in Cloud Shell or WSL1.
 * Download [Java 11+](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html)
 * Download [Maven](https://maven.apache.org/download.cgi)

 ## Install the source connector

-If you are using the recommended [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md), the Azure Cosmos DB source connector is included in the installation, and you can skip this step.
+If you're using the recommended [Confluent platform setup](https://github.com/microsoft/kafka-connect-cosmosdb/blob/dev/doc/Confluent_Platform_Setup.md), the Azure Cosmos DB source connector is included in the installation, and you can skip this step.

 Otherwise, you can use the JAR file from the latest [Release](https://github.com/microsoft/kafka-connect-cosmosdb/releases) and install the connector manually. To learn more, see these [instructions](https://docs.confluent.io/current/connect/managing/install.html#install-connector-manually). You can also package a new JAR file from the source code:

@@ -42,7 +42,7 @@ ls target/*dependencies.jar

 ## Create a Kafka topic

-Create a Kafka topic using Confluent Control Center. For this scenario, we will create a Kafka topic named "apparels" and write non-schema embedded JSON data to the topic. To create a topic inside the Control Center, see [create Kafka topic doc](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-2-create-ak-topics).
+Create a Kafka topic using Confluent Control Center. For this scenario, we'll create a Kafka topic named "apparels" and write non-schema embedded JSON data to the topic. To create a topic inside the Control Center, see [create Kafka topic doc](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-2-create-ak-topics).

 ## Create the source connector

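Once the source connector created in the hunks below is running, a quick way to check that change-feed documents reach the topic is a console consumer. A sketch under the same local-broker assumption as earlier; the flags are standard Kafka CLI options:

```bash
# Watch the "apparels" topic for documents the source connector publishes from the change feed.
kafka-console-consumer --bootstrap-server localhost:9092 --topic apparels --from-beginning
```
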
@@ -74,11 +74,11 @@ For more information on each of the above configuration properties, see the [sou

 #### Create connector using Control Center

-An easy option to create the connector is from the Confluent Control Center portal. Follow the [Confluent setup guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-3-install-a-ak-connector-and-generate-sample-data) to create a connector from Control Center. When setting up, instead of using the `DatagenConnector` option, use the `CosmosDBSourceConnector` tile instead. When configuring the source connector, fill out the values as you have filled in the JSON file.
+An easy option to create the connector is from the Confluent Control Center portal. Follow the [Confluent setup guide](https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#step-3-install-a-ak-connector-and-generate-sample-data) to create a connector from Control Center. When setting up, instead of the `DatagenConnector` option, use the `CosmosDBSourceConnector` tile. When configuring the source connector, fill out the values as you filled them in the JSON file.

 Alternatively, in the connectors page, you can upload the JSON file built from the previous section by using the **Upload connector config file** option.

-:::image type="content" source="./media/kafka-connector-source/upload-connector-config.png" alt-text="Upload connector config.":::
+:::image type="content" source="./media/kafka-connector-source/upload-source-connector-config.png" lightbox="./media/kafka-connector-source/upload-source-connector-config.png" alt-text="Screenshot of 'Upload connector config file' option in the Browse connectors dialog.":::

 #### Create connector using REST API

@@ -131,7 +131,7 @@ curl -H "Content-Type: application/json" -X POST -d @<path-to-JSON-config-file>

 To delete the connector from the Confluent Control Center, navigate to the source connector you created and select the **Delete** icon.

-:::image type="content" source="./media/kafka-connector-source/delete-source-connector.png" alt-text="Delete connector from Confluent center":::
+:::image type="content" source="./media/kafka-connector-source/delete-source-connector.png" lightbox="./media/kafka-connector-source/delete-source-connector.png" alt-text="Screenshot of delete option in the source connector dialog.":::

 Alternatively, use the connector’s REST API:

Binary image files changed (diffs not shown): 84.9 KB, 104 KB, 43.7 KB, 104 KB.
