diff --git a/docs/install/install.asciidoc b/docs/install/install.asciidoc
index d4cccba531b..fcf6bf83cba 100644
--- a/docs/install/install.asciidoc
+++ b/docs/install/install.asciidoc
@@ -1,30 +1,27 @@
-There are several ways to install and use Debezium connectors, so we've documented a few of the most common ways to do this.
-The latest development version of Debezium is {install-version}.
+:toclevels: 5
+:awestruct-layout: doc
+:toc:
+:toc-placement: macro
+:sectanchors:
+:linkattrs:
+:icons: font
+:source-highlighter: highlight.js
 
-== Installing a Debezium Connector
+toc::[]
 
-If you've already installed https://zookeeper.apache.org[Zookeeper], http://kafka.apache.org/[Kafka], and http://kafka.apache.org/documentation.html#connect[Kafka Connect], then using one of Debezium's connectors is easy.
-Simply download one or more connector plugin archives (see below), extract their files into your Kafka Connect environment, and add the parent directory of the extracted plugin(s) to https://docs.confluent.io/current/connect/userguide.html#installing-plugins[Kafka Connect's plugin path].
-If not the case yet, specify the plugin path in your worker configuration (e.g. _connect-distributed.properties_) using the `plugin.path` property.
-As an example, let's assume you have downloaded the Debezium MySQL connector archive and extracted its contents to _/kafka/connect/debezium-connector-mysql_.
-Then you'd specify the following in the worker config:
+== 1. Overview
 
-[source]
----
-plugin.path=/kafka/connect
----
+There are two main ways to install the Debezium connectors: by installing them into an existing Kafka Connect environment, or by using the Debezium Docker images.
 
-Restart your Kafka Connect process to pick up the new JARs.
+.*_Kafka Connect environment_* +If you've already installed https://zookeeper.apache.org[Zookeeper], http://kafka.apache.org/[Kafka], and http://kafka.apache.org/documentation.html#connect[Kafka Connect], +then using one of Debezium's connectors is easy. Simply download one or more connector plugin archives (see below), extract their files into your Kafka Connect environment, +and add the parent directory of the extracted plugin(s) to https://docs.confluent.io/current/connect/userguide.html#installing-plugins[Kafka Connect's plugin path]. -The connector plugins are available from Maven: - -* https://repo1.maven.org/maven2/io/debezium/debezium-connector-mysql/{install-version}/debezium-connector-mysql-{install-version}-plugin.tar.gz[MySQL Connector plugin archive] -* https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/{install-version}/debezium-connector-postgres-{install-version}-plugin.tar.gz[Postgres Connector plugin archive] -* https://repo1.maven.org/maven2/io/debezium/debezium-connector-mongodb/{install-version}/debezium-connector-mongodb-{install-version}-plugin.tar.gz[MongoDB Connector plugin archive] -* https://repo1.maven.org/maven2/io/debezium/debezium-connector-sqlserver/{install-version}/debezium-connector-sqlserver-{install-version}-plugin.tar.gz[SQL Server Connector plugin archive] -* https://repo1.maven.org/maven2/io/debezium/debezium-connector-oracle/{install-version}/debezium-connector-oracle-{install-version}-plugin.tar.gz[Oracle Connector plugin archive] (tech preview) - -If immutable containers are your thing, then check out https://hub.docker.com/r/debezium/[Debezium's Docker images] for Zookeeper, Kafka, and Kafka Connect with the MySQL and MongoDB connectors already pre-installed and ready to go. Our link:http://debezium.io/docs/tutorial[tutorial] even walks you through using these images, and this is a great way to learn what Debezium is all about. You can even link:/docs/openshift[run Debezium on OpenShift]. 
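+If the plugin path is not set yet, specify it in your worker configuration (e.g. _connect-distributed.properties_) using the `plugin.path` property.
+As a concrete example, assuming the Debezium MySQL connector archive was extracted to _/kafka/connect/debezium-connector-mysql_, the worker configuration would contain:

```properties
plugin.path=/kafka/connect
```

+Restart your Kafka Connect process to pick up the new JARs.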
+.*_Debezium Docker image_*
+If immutable containers are your thing, then check out https://hub.docker.com/r/debezium/[Debezium's Docker images] for Zookeeper, Kafka, and Kafka Connect
+with the MySQL and MongoDB connectors already pre-installed and ready to go. Our link:http://debezium.io/docs/tutorial[tutorial] even walks you through using these images,
+and this is a great way to learn what Debezium is all about. You can even link:/docs/openshift[run Debezium on OpenShift].
 
 By default, the directory _/kafka/connect_ is used as plugin directory by the Debezium Docker image for Kafka Connect.
 So any additional connectors you may wish to use should be added to that directory.
@@ -32,16 +29,26 @@ Alternatively, you can add further directories to the plugin path by specifying
 (e.g. `-e KAFKA_CONNECT_PLUGINS_DIR=/kafka/connect/,/path/to/further/plugins`).
 When using the Docker image for Kafka Connect provided by Confluent, you can specify the `CONNECT_PLUGIN_PATH` environment variable to achieve the same.
 
-Not that Java 8 or later is required to run the Debezium connectors.
+NOTE: Java 8 or later is required to run the Debezium connectors.
-=== Consuming Snapshot Releases
+[[maven-connectors]]
+The latest development version of Debezium is {install-version}, and the connector plugins are available from Maven:
+
+* https://repo1.maven.org/maven2/io/debezium/debezium-connector-mysql/{install-version}/debezium-connector-mysql-{install-version}-plugin.tar.gz[MySQL Connector plugin archive]
+* https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/{install-version}/debezium-connector-postgres-{install-version}-plugin.tar.gz[Postgres Connector plugin archive]
+* https://repo1.maven.org/maven2/io/debezium/debezium-connector-mongodb/{install-version}/debezium-connector-mongodb-{install-version}-plugin.tar.gz[MongoDB Connector plugin archive]
+* https://repo1.maven.org/maven2/io/debezium/debezium-connector-sqlserver/{install-version}/debezium-connector-sqlserver-{install-version}-plugin.tar.gz[SQL Server Connector plugin archive]
+* https://repo1.maven.org/maven2/io/debezium/debezium-connector-oracle/{install-version}/debezium-connector-oracle-{install-version}-plugin.tar.gz[Oracle Connector plugin archive] (tech preview)
+
+.*_Consuming Snapshot Releases_*
+[NOTE]
+=====
 Debezium executes nightly builds and deployments into the Sonatype snapshot repository.
 If you want to try the latest bits or verify a bug fix you are interested in, then use plugins from https://oss.sonatype.org/content/repositories/snapshots/io/debezium/[oss.sonatype.org].
 The installation procedure is the same as for regular releases.
+=====
 
-== Using a Debezium Connector
-
+.*_Using a Debezium Connector_*
 To use a connector to produce change events for a particular source server/cluster, simply create a configuration file for the
 link:/docs/connectors/mysql/#deploying-a-connector[MySQL Connector],
 link:/docs/connectors/postgresql/#deploying-a-connector[Postgres Connector],
@@ -54,7 +61,521 @@ for each inserted, updated, and deleted row or document.
 See the Debezium link:/docs/connectors[Connectors] documentation for more information.
 
-== Configuring Debezium Topics
+
+[[kafka-confluent-inst]]
+== 2. Installing in a Kafka Connect environment
+
+The Debezium connector can be installed and deployed either on https://kafka.apache.org/[plain Kafka Connect]
+or on the https://www.confluent.io/product/confluent-platform/[Confluent Platform].
+
+[discrete]
+[[kafka-inst]]
+==== _Kafka installation_
+
+*Download* the https://kafka.apache.org/downloads[Kafka archive] (kafka_2.12-2.1.1 is used)
+and extract it into a directory (e.g. `/opt`):
+
+[source,bash]
+----
+$ tar xzf kafka_2.12-2.1.1.tgz -C /opt
+----
+
+NOTE: In the rest of the document, the Kafka installation directory is referred to as `$KAFKA_HOME`.
+
+[discrete]
+[[confluent-inst]]
+==== _Confluent Platform installation_
+
+*Download* the Confluent open source https://docs.confluent.io/current/installation/installing_cp/zip-tar.html[archive file]
+and extract it into a directory (e.g. `/opt`):
+
+[source,bash]
+----
+$ curl -O http://packages.confluent.io/archive/5.1/confluent-community-5.1.2-2.11.tar.gz
+$ tar xzf confluent-community-5.1.2-2.11.tar.gz -C /opt
+----
+
+NOTE: In the rest of the document, the Confluent installation directory is referred to as `$CONFLUENT_HOME`.
+
+[[installing-debezium-connector]]
+=== 2.1 Installing Connectors
+
+Before starting the installation, the Kafka Connect worker's https://docs.confluent.io/current/connect/userguide.html#connect-configuring-workers[plugin path]
+should be set. The plugin path is a comma-separated list of paths to directories that contain Kafka Connect plugins. It is defined in the Kafka Connect configuration
+file via the `plugin.path` parameter.
+
+The configuration file comes in two flavors depending on the Kafka Connect execution mode: https://kafka.apache.org/documentation/#connect_running[standalone] and
+https://kafka.apache.org/documentation/#connect_running[distributed]. In *standalone* mode,
+the `connect-standalone.properties` file is used, whereas in *distributed* mode the `connect-distributed.properties` file is used.
+The files are located under the directories `$KAFKA_HOME/config` and `$CONFLUENT_HOME/etc/kafka` for the plain Kafka Connect
+and the Confluent Platform installations respectively.
+
+In order to show both standalone and distributed execution modes, the link:#inst-debezium-plain-kafka-connect[plain Kafka Connect] will be started in standalone mode
+whereas the link:#inst-debezium-confluent-platform[Confluent Platform] will be started in distributed mode (which is the default).
+
+[NOTE]
+=====
+More information about Kafka Connect can be found at:
+
+* http://kafka.apache.org/documentation/#connect[Kafka Connect]
+* https://docs.confluent.io/current/connect/managing/index.html[Confluent Connect managing]
+* https://docs.confluent.io/current/connect/userguide.html[Confluent Connect user guide]
+=====
+
+[[inst-debezium-plain-kafka-connect]]
+==== 2.1.1 Plain Kafka Connect
+
+* *Open* the `$KAFKA_HOME/config/connect-standalone.properties` file and set the `plugin.path` parameter appropriately.
+
+* *Download* link:#maven-connectors[the latest released version]
+of the Debezium connector and extract it under the Kafka Connect worker's plugin path as defined previously.
+
+[[inst-debezium-confluent-platform]]
+==== 2.1.2 Using Confluent Platform
+
+* *Open* the `$CONFLUENT_HOME/etc/kafka/connect-distributed.properties` file and set the `plugin.path` parameter appropriately
+(the default value of the parameter points to `$CONFLUENT_HOME/share/java`).
+
+* *Download* link:#maven-connectors[the latest released version]
+of the Debezium connector and extract it under the Kafka Connect worker's plugin path as defined previously.
+
+[[deploying-debezium-connector]]
+=== 2.2 Configuring and Deploying Connectors
+
+https://docs.confluent.io/current/connect/managing/configuring.html[Connector configurations] are key-value mappings used to set up connectors.
+For standalone mode, these are defined in a properties file and passed to the Connect process on the command line.
+In distributed mode, they are included in the JSON payload sent over the REST API.
+
+Generally, the Debezium configuration file includes parameters related to
+
+* the database connectivity (`database.hostname`, `database.port`, `database.user`, `database.password`),
+* the Kafka message key and value format (`key.converter`, `value.converter`),
+* what data should be included in the Kafka message (`key.converter.schemas.enable`, `value.converter.schemas.enable`),
+* the message structure (link:/docs/configuration/event-flattening/[Event Flattening]).
+
+IMPORTANT: The current example is based on the PostgreSQL database setup and configuration described
+in the link:/docs/install/postgres-plugins/[_Postgresql, logical decoding output plugin installation_] document.
+
+Here are some parameters with their values used in the current example; you can modify the values according to your needs:
+
+* `*name*` : `dbz-test-connector`, _the logical name of the connector_
+* `*connector.class*` : `io.debezium.connector.postgresql.PostgresConnector`, _the Debezium PostgreSQL connector class_
+* `*plugin.name*` : `wal2json`, _the logical decoding output plugin to use_
+* `*key.converter*` : `org.apache.kafka.connect.json.JsonConverter`, _the converter used to serialize the Kafka message key as JSON_
+* `*value.converter*` : `org.apache.kafka.connect.json.JsonConverter`, _the converter used to serialize the Kafka message value as JSON_
+* `*database.dbname*` : `test`, _the name of the PostgreSQL database from which to stream the changes_
+* `*database.server.name*` : `DBTestServer`, _the logical name that identifies and provides a namespace for the particular PostgreSQL database server/cluster being monitored_
+
+NOTE: Details for the configuration properties of the supported Debezium Connectors can be found in the https://debezium.io/docs/connectors/[Debezium documentation].
+
+[TIP]
+=====
+Before starting, it is recommended to have https://stedolan.github.io/jq/[jq] installed.
+It is a lightweight and flexible command-line JSON processor,
+like `sed` for JSON data. You can use it to slice, filter, map, and transform structured data,
+as shown in the https://stedolan.github.io/jq/tutorial/[tutorial].
+=====
+
+[[deploying-debezium-plain-kafka]]
+==== 2.2.1 Plain Kafka Connect
+
+In the current example, the following configuration for the Debezium Connector is used. Modify the parameter values, if needed,
+and save the configuration into a file (e.g. `$KAFKA_HOME/config/dbz-test-connector.properties`).
+
+.Plain Kafka Connect, Debezium Connector's configuration
+[source,properties]
+----
+name=dbz-test-connector
+connector.class=io.debezium.connector.postgresql.PostgresConnector
+tasks.max=1
+plugin.name=wal2json
+database.hostname=localhost
+database.port=5432
+database.user=postgres
+database.password=password
+database.dbname=test
+database.server.name=DBTestServer
+key.converter=org.apache.kafka.connect.json.JsonConverter
+value.converter=org.apache.kafka.connect.json.JsonConverter
+key.converter.schemas.enable=false
+value.converter.schemas.enable=false
+----
+
+* *Start* the Zookeeper server
+
+[source,bash]
+----
+$ cd $KAFKA_HOME
+$ bin/zookeeper-server-start.sh config/zookeeper.properties
+----
+
+* *Start* the Kafka server
+
+[source,bash]
+----
+$ bin/kafka-server-start.sh config/server.properties
+----
+
+* *Deploy* the Debezium connector
+
+[source,bash]
+----
+$ bin/connect-standalone.sh config/connect-standalone.properties config/dbz-test-connector.properties
+----
+
+* *Check* the Debezium connector status
+
+[source,bash]
+----
+$ curl -s localhost:8083/connectors/dbz-test-connector/status | jq
+----
+
+.*_Debezium Connector's status output_*
+[source,json]
+----
+{
+  "name": "dbz-test-connector",
+  "connector": {
+    "state": "RUNNING",
+    "worker_id": "127.0.0.1:8083"
+  },
+  "tasks": [
+    {
+      "state": "RUNNING",
+      "id": 0,
+      "worker_id": "127.0.0.1:8083"
+    }
+  ],
+  "type": "source"
+}
+----
+
+NOTE: More information about the Kafka Connect REST API can be found in the official http://kafka.apache.org/documentation/#connect_rest[documentation].
+
+[[deploying-debezium-confluent-platform]]
+==== 2.2.2 Using Confluent Platform
+
+When using the Confluent Platform, the configuration parameters of the Debezium Connector are the same as for the deployment
+on plain Kafka Connect described link:#deploying-debezium-plain-kafka[previously]. The only difference is that the parameters are represented in JSON format.
+Modify the parameter values, if needed, and save the configuration into a file (e.g. `$CONFLUENT_HOME/etc/kafka/dbz-test-connector.json`).
+
+.*_Confluent Platform, Debezium Connector's configuration_*
+[source,json]
+----
+{
+  "name": "dbz-test-connector",
+  "config": {
+    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
+    "tasks.max": "1",
+    "plugin.name": "wal2json",
+    "database.hostname": "localhost",
+    "database.port": "5432",
+    "database.user": "postgres",
+    "database.password": "password",
+    "database.dbname": "test",
+    "database.server.name": "DBTestServer",
+    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
+    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
+    "key.converter.schemas.enable": "false",
+    "value.converter.schemas.enable": "false"
+  }
+}
+----
+
+Use the https://docs.confluent.io/current/cli/index.html[Confluent CLI]
+to https://docs.confluent.io/current/cli/command-reference/confluent-start.html#cli-confluent-start[start] the Confluent Platform services
+and https://docs.confluent.io/current/cli/command-reference/confluent-load.html[load] the Debezium Connector.
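+In distributed mode, a connector can also be registered without the Confluent CLI by POSTing the JSON configuration
+directly to the Kafka Connect REST API (listening on port 8083 by default). A minimal Python sketch of building such a
+request is shown below; the helper name is ours and actually sending the request requires a running Connect worker.

```python
import json
import urllib.request

def build_register_request(worker_url, connector_config):
    """Build (but do not send) the POST request that registers a connector
    with a Kafka Connect worker's REST API."""
    return urllib.request.Request(
        url=f"{worker_url}/connectors",
        data=json.dumps(connector_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_register_request(
    "http://localhost:8083",
    {
        "name": "dbz-test-connector",
        "config": {"connector.class": "io.debezium.connector.postgresql.PostgresConnector"},
    },
)
print(req.full_url)      # http://localhost:8083/connectors
print(req.get_method())  # POST
```

+On success, the worker responds with HTTP 201 and echoes the created connector configuration, comparable to the `confluent load` output.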
+
+* *Start* the Confluent Platform services
+
+[source,bash]
+----
+$ cd $CONFLUENT_HOME
+$ bin/confluent start
+----
+
+* *Deploy* the Debezium Connector
+
+[source,bash]
+----
+$ bin/confluent load dbz-test-connector -d $CONFLUENT_HOME/etc/kafka/dbz-test-connector.json
+----
+
+.*_Debezium Connector's deployment output_*
+[source,json]
+----
+{
+  "name": "dbz-test-connector",
+  "config": {
+    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
+    "tasks.max": "1",
+    "plugin.name": "wal2json",
+    "database.hostname": "localhost",
+    "database.port": "5432",
+    "database.user": "postgres",
+    "database.password": "password",
+    "database.dbname": "test",
+    "database.server.name": "DBTestServer",
+    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
+    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
+    "key.converter.schemas.enable": "false",
+    "value.converter.schemas.enable": "false",
+    "name": "dbz-test-connector"
+  },
+  "tasks": [],
+  "type": null
+}
+----
+
+* *Check* the Debezium connector status
+
+[source,bash]
+----
+$ bin/confluent status dbz-test-connector
+----
+
+.*_Debezium Connector's status output_*
+[source,json]
+----
+{
+  "name": "dbz-test-connector",
+  "connector": {
+    "state": "RUNNING",
+    "worker_id": "127.0.0.1:8083"
+  },
+  "tasks": [
+    {
+      "state": "RUNNING",
+      "id": 0,
+      "worker_id": "127.0.0.1:8083"
+    }
+  ],
+  "type": "source"
+}
+----
+
+NOTE: More information about the Confluent CLI can be found in the official https://docs.confluent.io/current/cli/command-reference/[documentation].
+
+[[debezium-connector-test]]
+=== 2.3 Testing Connectors
+
+The Debezium PostgreSQL connector writes events for all insert, update, and delete operations on a single table to a single Kafka topic.
+The https://debezium.io/docs/connectors/postgresql/#topic-names[name of the Kafka topics] by default takes the form
+*serverName.schemaName.tableName*, where *serverName* is the logical name of the connector as specified with the *`database.server.name`*
+configuration property, *schemaName* is the name of the database schema where the operation occurred, and *tableName* is the name of
+the database table on which the operation occurred. In our case, the name of the created Kafka topic is
+`DBTestServer.public.test_table`.
+
+Most PostgreSQL servers are configured to not retain the complete history of the database in the WAL segments, so the PostgreSQL
+connector would be unable to see the entire history of the database by simply reading the WAL. So, by default, the connector will
+upon first startup perform an initial consistent https://debezium.io/docs/connectors/postgresql/#snapshots[snapshot]
+of the database.
+
+[TIP]
+=====
+For the needs of the tests, it is recommended to use https://github.com/edenhill/kafkacat[kafkacat],
+a command line utility that helps to https://docs.confluent.io/current/app-development/kafkacat-usage.html[test and debug Apache Kafka deployments].
+It can be used to produce, consume, and list topic and partition information for Kafka.
+You can download the https://github.com/edenhill/kafkacat/releases[latest version] and install it
+by following the instructions described in the https://github.com/edenhill/kafkacat/blob/master/README.md[documentation].
+
+In the rest of the document, the kafkacat installation directory is referred to as `$KAFKACAT_HOME`.
+=====
+
+In order to check whether the Debezium connector works as expected, the following tests can be performed:
+
+* *Check the topic(s) creation* for the database table(s)
+
+Verify the creation of Kafka topics for the tables that the connector is applied to (`test_table` in our example)
+
+[source,bash]
+----
+$ $KAFKACAT_HOME/kafkacat -b localhost:9092 -L | grep DBTestServer
+----
+
+Alternatively the https://kafka.apache.org/documentation/#quickstart_createtopic[kafka-topics] Kafka command line tool
+can be used for the link:#deploying-debezium-plain-kafka[plain Kafka Connect] and link:#deploying-debezium-confluent-platform[Confluent Platform]
+deployments as follows:
+
+.*_Plain Kafka Connect, `kafka-topics` command_*
+[source,bash]
+----
+$ $KAFKA_HOME/bin/kafka-topics.sh --list --zookeeper localhost:2181 | grep DBTestServer
+----
+
+.*_Confluent Platform, `kafka-topics` command_*
+[source,bash]
+----
+$ $CONFLUENT_HOME/bin/kafka-topics --list --zookeeper localhost:2181 | grep DBTestServer
+----
+
+The output of the above command should include a topic named `DBTestServer.public.test_table`.
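+The topic-naming rule described above (*serverName.schemaName.tableName*) can be sketched as a one-liner; the function
+name is ours, for illustration only:

```python
def topic_name(server_name: str, schema_name: str, table_name: str) -> str:
    """Default Debezium PostgreSQL topic name: serverName.schemaName.tableName."""
    return f"{server_name}.{schema_name}.{table_name}"

print(topic_name("DBTestServer", "public", "test_table"))
# DBTestServer.public.test_table
```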
+
+* *Check* the *initial* topic(s) *content*
+
+Check the Kafka topic messages (the `DBTestServer.public.test_table` topic in our case) of the respective table;
+they should contain an initial snapshot of the database (the output is formatted to be more readable)
+
+[source,bash]
+----
+$ $KAFKACAT_HOME/kafkacat -b localhost:9092 -t DBTestServer.public.test_table -C -o beginning -f 'Key: %k\nValue: %s\n'
+----
+[source,json]
+----
+Key: {"id":"id1 "}
+Value: {
+  "before":null,
+  "after":{"id":"id1 ","code":"code2 "},
+  "source":{
+    "version":"{debezium-version}",
+    "name":"DBTestServer",
+    "db":"test",
+    "ts_usec":1537191190816000,
+    "txId":934261,
+    "lsn":3323094832,
+    "schema":"public",
+    "table":"test_table",
+    "snapshot":true,
+    "last_snapshot_record":false},
+  "op":"r",
+  "ts_ms":1537191190817
+}
+% Reached end of topic DBTestServer.public.test_table [0] at offset 1
+----
+
+[[debezium-connector-test-kafka-console-consumer]]
+Alternatively the https://kafka.apache.org/quickstart#quickstart_consume[kafka-console-consumer] Kafka command line tool can be used,
+for the link:#deploying-debezium-plain-kafka[plain Kafka Connect] and link:#deploying-debezium-confluent-platform[Confluent Platform]
+deployments as follows:
+
+.*_Plain Kafka Connect, `kafka-console-consumer` command_*
+[source,bash]
+----
+$ $KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic DBTestServer.public.test_table --from-beginning --property print.key=true
+----
+
+.*_Confluent Platform, `kafka-console-consumer` command_*
+[source,bash]
+----
+$ $CONFLUENT_HOME/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic DBTestServer.public.test_table --from-beginning --property print.key=true
+----
+Indeed, the connector took an initial database snapshot (the `test_table` contains only one record)
+
+* *Monitor the Kafka messages* produced on table(s) changes
+
+Monitor the messages added to the Kafka topic when the respective table changes (e.g. on inserting, updating, and deleting a record)
+
+[source,bash]
+----
+$ $KAFKACAT_HOME/kafkacat -b localhost:9092 -t DBTestServer.public.test_table -C -o beginning -f 'Key: %k\nValue: %s\n'
+----
+
+Alternatively the `kafka-console-consumer` Kafka command line tool can be used as described link:#debezium-connector-test-kafka-console-consumer[previously].
+
+Here are the DML operations on the `test_table` and the respective Kafka messages added to the `DBTestServer.public.test_table` topic
+(the messages are formatted to be more readable)
+
+.*_Insert a record_*
+[source,sql]
+----
+test=# INSERT INTO test_table (id, code) VALUES('id2', 'code2');
+----
+
+.*_Insert a record - Kafka message_*
+[source,json,indent=0]
+----
+Key: {"id":"id2 "}
+Value: {
+  "before":null,
+  "after":{"id":"id2 ","code":"code2 "},
+  "source":{
+    "version":"{debezium-version}",
+    "name":"DBTestServer",
+    "db":"test",
+    "ts_usec":1537262994443180000,
+    "txId":934262,
+    "lsn":3323107556,
+    "schema":"public",
+    "table":"test_table",
+    "snapshot":true,
+    "last_snapshot_record":true},
+  "op":"c",
+  "ts_ms":1537262994604
+}
+----
+
+.*_Update a record_*
+[source,sql]
+----
+test=# update test_table set code='code3' where id='id2';
+----
+
+.*_Update a record - Kafka message_*
+[source,json]
+----
+Key: {"id":"id2 "}
+Value: {
+  "before":{"id":"id2 ","code":null},
+  "after":{"id":"id2 ","code":"code3 "},
+  "source":{
+    "version":"{debezium-version}",
+    "name":"DBTestServer",
+    "db":"test",
+    "ts_usec":1537263061440799000,
+    "txId":934263,
+    "lsn":3323108190,
+    "schema":"public",
+    "table":"test_table",
+    "snapshot":true,
+    "last_snapshot_record":true},
+  "op":"u",
+  "ts_ms":1537263061474
+}
+----
+
+.*_Delete a record_*
+[source,sql]
+----
+test=# delete from test_table where id='id2';
+----
+
+.*_Delete a record - Kafka message_*
+[source,json]
+----
+Key: {"id":"id2 "}
+Value: {
+  "before":{"id":"id2 ","code":null},
+  "after":null,
+  "source":{
+    "version":"{debezium-version}",
+    "name":"DBTestServer",
+    "db":"test",
+    "ts_usec":1537263155358693000,
+    "txId":934264,
+    "lsn":3323108208,
+    "schema":"public",
+    "table":"test_table",
+    "snapshot":true,
+    "last_snapshot_record":true},
+  "op":"d",
+  "ts_ms":1537263155374}
+----
+
+.*_An extra message is added when a record is deleted, the tombstone message_*
+[source,json]
+----
+Key: {"id":"id2 "}
+Value:
+----
+
+Debezium's PostgreSQL connector always follows the delete event with a special tombstone event that
+has the same key but a *null value* in order to remove all messages with the same key during
+https://kafka.apache.org/documentation/#compaction[Kafka log compaction].
+This behavior can be controlled via the connector parameter
+https://debezium.io/docs/connectors/postgresql/#connector-properties[tombstones.on.delete].
+
+== 3. Configuring Topics
+
 Debezium uses (either via Kafka Connect or directly) multiple topics for storing data.
 The topics have to be either created by an administrator or by Kafka itself by enabling auto-creation for topics.
 There are certain limitations and recommendations which apply to topics:
@@ -72,10 +593,15 @@ specifically, these values should be larger than the maximum downtime you antici
 e.g. when updating them
 ** Replicated in production
 ** Single partition
-*** You can relax the single partition rule but your application must handle out-of-order events for different rows in database (events for a single row are still totally ordered). If multiple partitions are used, Kafka will determine the partition by hashing the key by default. Other partition strategies require using SMTs to set the partition number for each record.
+*** You can relax the single partition rule but your application must handle out-of-order events for different rows in the database
+(events for a single row are still totally ordered). If multiple partitions are used, Kafka will determine the partition by hashing the key by default.
+Other partition strategies require using SMTs to set the partition number for each record.
 
-== Using the Debezium Libraries
+== 4. Using the Debezium Libraries
 
 Although Debezium is intended to be used as a turnkey service, all of the JARs and other artifacts are available in http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22io.debezium%22[Maven Central].
-We do provide a small library so applications can link:/docs/embedded/[embed any Kafka Connect connector] and consume data change events read directly from the source system. This provides a light weight system (since Zookeeper, Kafka, and Kafka Connect services are not needed), but as a consequence it is not as fault tolerant or reliable since the application must manage and maintain all state normally kept inside Kafka's distributed and replicated logs. It's perfect for use in tests, and with careful consideration it may be useful in some applications.
+We do provide a small library so applications can link:/docs/embedded/[embed any Kafka Connect connector] and consume data change events read directly from the source system.
+This provides a lightweight system (since Zookeeper, Kafka, and Kafka Connect services are not needed), but as a consequence it is not as fault tolerant or reliable
+since the application must manage and maintain all state normally kept inside Kafka's distributed and replicated logs.
+It's perfect for use in tests, and with careful consideration it may be useful in some applications.
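+As a footnote to the single-partition recommendation in the topic configuration section: Kafka's default partitioner
+chooses a partition by hashing the record key (Kafka actually uses murmur2; the sketch below substitutes `md5` purely
+to illustrate the property), which is why all change events for a given row, sharing the same key, land in the same
+partition and remain ordered.

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Simplified model of Kafka's default key-based partitioning.
    Kafka actually uses murmur2; md5 is a stand-in that preserves the
    property being illustrated: a key always maps to the same partition."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

key = b'{"id":"id2 "}'  # a Debezium event key, as seen in the examples above
assert partition_for(key, 3) == partition_for(key, 3)  # deterministic per key
```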