|
12 | 12 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
13 | 13 | // See the License for the specific language governing permissions and |
14 | 14 | // limitations under the License. |
15 | | -= Cross-cluster Replication Extension |
16 | | - |
17 | | -== Overview |
18 | | -The link:https://github.com/apache/ignite-extensions/tree/master/modules/cdc-ext[Cross-cluster Replication Extension] module provides the following ways to set up cross-cluster replication based on CDC. |
19 | | - |
20 | | -. link:https://github.com/apache/ignite-extensions/blob/master/modules/cdc-ext/src/main/java/org/apache/ignite/cdc/thin/IgniteToIgniteClientCdcStreamer.java[Ignite2IgniteClientCdcStreamer] - streams changes to the destination cluster using the link:thin-clients/java-thin-client[Java Thin Client]. |
21 | | -. link:https://github.com/apache/ignite-extensions/blob/master/modules/cdc-ext/src/main/java/org/apache/ignite/cdc/IgniteToIgniteCdcStreamer.java[Ignite2IgniteCdcStreamer] - streams changes to the destination cluster using a client node. |
22 | | -. link:https://github.com/apache/ignite-extensions/blob/master/modules/cdc-ext/src/main/java/org/apache/ignite/cdc/kafka/IgniteToKafkaCdcStreamer.java[Ignite2KafkaCdcStreamer] combined with link:https://github.com/apache/ignite-extensions/blob/master/modules/cdc-ext/src/main/java/org/apache/ignite/cdc/kafka/KafkaToIgniteCdcStreamer.java[KafkaToIgniteCdcStreamer] streams changes to the destination cluster using link:https://kafka.apache.org[Apache Kafka] as a transport. |
23 | | - |
24 | | -NOTE: A conflict resolver should be defined for each cache replicated between the clusters. |
25 | | - |
26 | | -NOTE: All implementations of cross-cluster replication support replication of link:https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/binary/BinaryType.html[BinaryTypes] and link:https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cdc/TypeMapping.html[TypeMappings]. |
27 | | - |
28 | | -NOTE: To use SQL queries on the destination cluster over CDC-replicated data, set the same `VALUE_TYPE` in |
29 | | -link:sql-reference/ddl#create-table[CREATE TABLE] on both the source and destination clusters for each table. |
30 | | - |
31 | | -== Ignite to Java Thin Client CDC streamer |
32 | | -This streamer starts a link:thin-clients/java-thin-client[Java Thin Client] that connects to the destination cluster. |
33 | | -Once the connection is established, all changes captured by CDC are replicated to the destination cluster. |
34 | | - |
35 | | -NOTE: An instance of `ignite-cdc.sh` with a configured streamer should be started on each server node of the source cluster to capture all changes. |
36 | | - |
37 | | -image:../../assets/images/integrations/CDC-ignite2igniteClient.svg[] |
38 | | - |
39 | | -=== Configuration |
40 | | - |
41 | | -[cols="20%,45%,35%",opts="header"] |
42 | | -|=== |
43 | | -|Name |Description | Default value |
44 | | -| `caches` | Set of cache names to replicate. | null |
45 | | -| `destinationClientConfiguration` | Configuration of the thin client that connects to the destination cluster to replicate changes. | null |
46 | | -| `onlyPrimary` | Flag to handle changes only on primary node. | `false` |
47 | | -| `maxBatchSize` | Maximum number of events to be sent to destination cluster in a single batch. | 1024 |
48 | | -|=== |
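For reference, the options above are set on the streamer bean inside the `CdcConfiguration` passed to `ignite-cdc.sh`. A minimal sketch follows; the cache name and destination address are placeholders, so verify property names against the extension's javadoc:

```xml
<!-- Sketch of a CDC consumer configuration for ignite-cdc.sh; values are illustrative. -->
<bean class="org.apache.ignite.cdc.CdcConfiguration">
    <property name="consumer">
        <bean class="org.apache.ignite.cdc.thin.IgniteToIgniteClientCdcStreamer">
            <!-- Caches to replicate; "my-cache" is a placeholder name. -->
            <property name="caches">
                <util:list>
                    <value>my-cache</value>
                </util:list>
            </property>
            <!-- Thin client configuration pointing at the destination cluster. -->
            <property name="destinationClientConfiguration">
                <bean class="org.apache.ignite.configuration.ClientConfiguration">
                    <property name="addresses" value="destination-host:10800"/>
                </bean>
            </property>
            <property name="onlyPrimary" value="false"/>
            <property name="maxBatchSize" value="1024"/>
        </bean>
    </property>
</bean>
```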
49 | | - |
50 | | -=== Metrics |
51 | | - |
52 | | -[cols="25%,75%",opts="header"] |
53 | | -|=== |
54 | | -|Name |Description |
55 | | -| `EventsCount` | Count of events applied to the destination cluster. |
56 | | -| `LastEventTime` | Timestamp of the last event applied to the destination cluster. |
57 | | -| `TypesCount` | Count of binary type events applied to the destination cluster. |
58 | | -| `MappingsCount` | Count of mapping events applied to the destination cluster. |
59 | | -|=== |
60 | | - |
61 | | -== Ignite to Ignite CDC streamer |
62 | | -This streamer starts a client node that connects to the destination cluster. |
63 | | -Once the connection is established, all changes captured by CDC are replicated to the destination cluster. |
64 | | - |
65 | | -NOTE: An instance of `ignite-cdc.sh` with a configured streamer should be started on each server node of the source cluster to capture all changes. |
66 | | - |
67 | | -image:../../assets/images/integrations/CDC-ignite2ignite.svg[] |
68 | | - |
69 | | -=== Configuration |
70 | | - |
71 | | -[cols="20%,45%,35%",opts="header"] |
72 | | -|=== |
73 | | -|Name |Description | Default value |
74 | | -| `caches` | Set of cache names to replicate. | null |
75 | | -| `destinationIgniteConfiguration` | Ignite configuration of the client node that connects to the destination cluster to replicate changes. | null |
76 | | -| `onlyPrimary` | Flag to handle changes only on primary node. | `false` |
77 | | -| `maxBatchSize` | Maximum number of events to be sent to destination cluster in a single batch. | 1024 |
78 | | -|=== |
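As a sketch, this streamer is wired the same way via `CdcConfiguration`, except that the destination is described by a full client-node `IgniteConfiguration` rather than a thin-client configuration (the cache name and discovery details below are placeholders):

```xml
<!-- Sketch only: the discovery SPI for the destination cluster must be filled in. -->
<bean class="org.apache.ignite.cdc.CdcConfiguration">
    <property name="consumer">
        <bean class="org.apache.ignite.cdc.IgniteToIgniteCdcStreamer">
            <!-- Caches to replicate; "my-cache" is a placeholder name. -->
            <property name="caches">
                <util:list>
                    <value>my-cache</value>
                </util:list>
            </property>
            <!-- Configuration of the client node that joins the destination cluster. -->
            <property name="destinationIgniteConfiguration">
                <bean class="org.apache.ignite.configuration.IgniteConfiguration">
                    <property name="clientMode" value="true"/>
                    <!-- Discovery SPI pointing at destination cluster nodes goes here. -->
                </bean>
            </property>
            <property name="onlyPrimary" value="false"/>
            <property name="maxBatchSize" value="1024"/>
        </bean>
    </property>
</bean>
```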
79 | | - |
80 | | -=== Metrics |
81 | | - |
82 | | -[cols="25%,75%",opts="header"] |
83 | | -|=== |
84 | | -|Name |Description |
85 | | -| `EventsCount` | Count of events applied to the destination cluster. |
86 | | -| `LastEventTime` | Timestamp of the last event applied to the destination cluster. |
87 | | -| `TypesCount` | Count of binary type events applied to the destination cluster. |
88 | | -| `MappingsCount` | Count of mapping events applied to the destination cluster. |
89 | | -|=== |
| 15 | += Cross-cluster Replication with Kafka |
90 | 16 |
|
91 | 17 | == CDC replication using Kafka |
92 | 18 |
|
@@ -175,22 +101,6 @@ It should be just enough to process source cluster load. |
175 | 101 | Each application instance processes a configured subset of the topic partitions to spread the load. |
176 | 102 | A `KafkaConsumer` is created for each partition to ensure fair reads. |
177 | 103 |
|
178 | | -==== Installation |
179 | | - |
180 | | -. Build `cdc-ext` module with maven: |
181 | | -+ |
182 | | -```console |
183 | | - $~/src/ignite-extensions/> mvn clean package -DskipTests |
184 | | - $~/src/ignite-extensions/> ls modules/cdc-ext/target | grep zip |
185 | | -ignite-cdc-ext.zip |
186 | | -``` |
187 | | - |
188 | | -. Unpack `ignite-cdc-ext.zip` archive to `$IGNITE_HOME` folder. |
189 | | - |
190 | | -Now you have an additional binary, `$IGNITE_HOME/bin/kafka-to-ignite.sh`, and the `$IGNITE_HOME/libs/optional/ignite-cdc-ext` module. |
191 | | - |
192 | | -NOTE: Enable the optional `ignite-cdc-ext` module to be able to run `kafka-to-ignite.sh`. |
193 | | - |
194 | 104 | ==== Configuration |
195 | 105 |
|
196 | 106 | Application configuration should be done using POJO classes or a Spring XML file, like a regular Ignite node configuration. |
@@ -264,76 +174,3 @@ NOTE: link:https://kafka.apache.org/documentation/#consumerconfigs_request.timeo |
264 | 174 | == Fault tolerance |
265 | 175 | It is expected that CDC streamers will be configured with `onlyPrimary=false` in most real-world deployments to ensure fault tolerance. |
266 | 176 | That means the streamer sends the same change multiple times: once per primary and backup copy, that is, `CacheConfiguration#backups` + 1 times. |
267 | | - |
268 | | -== Conflict resolution |
269 | | -A conflict resolver should be defined for each cache replicated between the clusters. |
270 | | -Cross-cluster replication extension has the link:https://github.com/apache/ignite-extensions/blob/master/modules/cdc-ext/src/main/java/org/apache/ignite/cdc/conflictresolve/CacheVersionConflictResolverImpl.java[default] conflict resolver implementation. |
271 | | - |
272 | | -NOTE: The default implementation only selects the correct entry and never merges. |
273 | | - |
274 | | -The default resolver implementation is used when a custom conflict resolver is not set. |
275 | | - |
276 | | -=== Configuration |
277 | | - |
278 | | -[cols="20%,45%,35%",opts="header"] |
279 | | -|=== |
280 | | -|Name |Description | Default value |
281 | | -| `clusterId` | Local cluster id. Can be any value from 1 to 31. | null |
282 | | -| `caches` | Set of cache names to handle with this plugin instance. | null |
283 | | -| `conflictResolveField` | Value field used to resolve conflicts. Optional. Field values must implement `java.lang.Comparable`. | null |
284 | | -| `conflictResolver` | Custom conflict resolver. Optional. Must implement `CacheVersionConflictResolver`. | null |
285 | | -|=== |
286 | | - |
287 | | -=== Conflict resolution algorithm |
288 | | -Replicated changes contain some additional data; specifically, the entry's version from the source cluster is supplied with the changed data. |
289 | | -The default conflict resolution algorithm is based on the entry version and the `conflictResolveField`. |
290 | | - |
291 | | -==== Conflict resolution based on the entry's version |
292 | | -This approach provides the eventual consistency guarantee when each entry is updatable only from a single cluster. |
293 | | - |
294 | | -IMPORTANT: This approach does not replicate any updates or removals from the destination cluster to the source cluster. |
295 | | - |
296 | | -.Algorithm: |
297 | | -.. Changes from the "local" cluster always win. Any replicated data can be overridden locally. |
298 | | -.. If both the old and new entries are from the same cluster, the entry version comparison determines the order. |
299 | | -.. Otherwise, conflict resolution fails: the update is ignored and the failure is logged. |
300 | | - |
301 | | -==== Conflict resolution based on the entry's value field |
302 | | -This approach provides the eventual consistency guarantee even when entry is updatable from any cluster. |
303 | | - |
304 | | -NOTE: The conflict resolution field, specified by `conflictResolveField`, should contain a user-provided monotonically increasing value such as a query id or timestamp. |
305 | | - |
306 | | -IMPORTANT: This approach does not replicate removals from the destination cluster to the source cluster, because removals can't be versioned by the field. |
307 | | - |
308 | | -.Algorithm: |
309 | | -.. Changes from the "local" cluster always win. Any replicated data can be overridden locally. |
310 | | -.. If both the old and new entries are from the same cluster, the entry version comparison determines the order. |
311 | | -.. If `conflictResolveField` is set, the field value comparison determines the order. |
312 | | -.. Otherwise, conflict resolution fails: the update is ignored and the failure is logged. |
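For example, assuming a cache value class with a monotonically increasing `updateTime` field (an illustrative name), field-based resolution could be enabled in the plugin like this:

```xml
<!-- Sketch: field-based conflict resolution; cache and field names are placeholders. -->
<bean class="org.apache.ignite.cdc.conflictresolve.CacheVersionConflictResolverPluginProvider">
    <property name="clusterId" value="1"/>
    <property name="caches">
        <util:list>
            <value>my-cache</value>
        </util:list>
    </property>
    <!-- Value field used to order conflicting updates; its type must be Comparable. -->
    <property name="conflictResolveField" value="updateTime"/>
</bean>
```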
313 | | - |
314 | | -==== Custom conflict resolution rules |
315 | | -You can define your own rules for resolving conflicts based on the nature of your data and operations. |
316 | | -This can be particularly useful in more complex situations where the standard conflict resolution strategies do not apply. |
317 | | - |
318 | | -Choosing the right conflict resolution strategy depends on your specific use case and requires a good understanding of your data and its usage. |
319 | | -You should consider the nature of your transactions, the rate of change of your data, and the implications of potential data loss or overwrites when selecting a conflict resolution strategy. |
320 | | - |
321 | | -A custom conflict resolver can be set via `conflictResolver` and allows comparing or merging the conflicting data in any way required. |
322 | | - |
323 | | -=== Configuration example |
324 | | -Configuration is done via an Ignite node plugin. Note that `util:list` requires the Spring `util` XML namespace to be declared in the configuration file: |
325 | | - |
326 | | -```xml |
327 | | -<property name="pluginProviders"> |
328 | | - <bean class="org.apache.ignite.cdc.conflictresolve.CacheVersionConflictResolverPluginProvider"> |
329 | | - <property name="clusterId" value="1" /> |
330 | | - <property name="caches"> |
331 | | - <util:list> |
332 | | - <bean class="java.lang.String"> |
333 | | - <constructor-arg type="String" value="queryId" /> |
334 | | - </bean> |
335 | | - </util:list> |
336 | | - </property> |
337 | | - </bean> |
338 | | -</property> |
339 | | -``` |