_posts/2026-01-08-measuring-debezium-server-performance-mysql-streaming.adoc
Lines changed: 21 additions & 17 deletions
Original file line number
Diff line number
Diff line change
@@ -16,6 +16,10 @@ This post does not aim to compare Debezium Server with Kafka Connect nor to iden
16
16
Instead, it focuses on understanding how Debezium Server behaves under realistic workloads using default configurations,
17
17
and whether it can sustain real-time change data capture in lightweight architectures.
18
18
19
+
It is also important to underline that, in this evaluation, the database is not stressed in a fully real-world scenario.
20
+
Conditions such as long-running transactions or other complex workload patterns are not represented,
21
+
as incorporating them would significantly increase the complexity of the test case.
22
+
19
23
[id=introduction]
20
24
== Introduction
21
25
@@ -24,15 +28,15 @@ Performance measurement transforms software from a black box into an understanda
24
28
25
29
'What is not measured cannot be controlled, predicted or reliably improved'
26
30
27
-
Going beyond sayings and proverbs, we can make the following assertions about performance.
31
+
Going beyond sayings and proverbs, performance measurement provides you with evidence to inform and drive architectural decisions.
32
+
Specifically, performance testing can help you by providing information that enables you to:
28
33
29
-
- Validates that the system meets the requirements
30
-
- Identifies bottlenecks and limiting factors
31
-
- Enables predictable scalability
32
-
- Supports data-driven architectural decisions
33
-
- Reduces operational and infrastructure costs
34
-
- Improves reliability and resilience
35
-
- Builds trust with users and stakeholders
34
+
- Validate that the system meets requirements
35
+
- Identify bottlenecks and limiting factors
36
+
- Help ensure predictable scalability
37
+
- Reduce operational and infrastructure costs
38
+
- Deliver improvements in reliability and resilience
39
+
- Build trust with users and stakeholders
36
40
37
41
[id=debezium-server]
38
42
== Debezium Server
@@ -41,7 +45,7 @@ Apart from performance tests, another great unknown in the CDC community is *Deb
41
45
But let me introduce you to Debezium Server.
42
46
Debezium Server is a standalone runtime for Change Data Capture that captures database changes using Debezium and delivers them directly to configurable sinks without requiring Kafka or Kafka Connect.
43
47
It addresses a common gap in CDC architectures: situations where Kafka Connect, Kubernetes, or large distributed platforms introduce unnecessary operational complexity.
44
-
It enables teams to adopt CDC with minimal infrastructure while still benefiting from Debezium’s mature and battle-tested connectors.
48
+
It enables teams to adopt CDC with minimal infrastructure while still benefiting from Debezium’s mature and proven connectors.
45
49
46
50
[.centered-image.responsive-image]
47
51
====
@@ -64,9 +68,9 @@ It runs as a single process and although Kafka is not required, it preserves the
64
68
Now that we are familiar with all the components, let us dive into the technical details.
65
69
66
70
Starting with the hardware, we have deployed two EC2 machines in AWS of type "c5a.xlarge". Each instance provides 4 vCPUs and 8 GiB of memory, ensuring sufficient resources for both workload generation and CDC processing.
67
-
We have called them Observed and Observer (in order to separate the monitoring load from the machine where Debezium and MySQL are running). In the Observer instance, we have deployed Prometheus, Grafana and YCSB
71
+
We have called these two instances _Observed_ and _Observer_ (in order to separate the monitoring load from the machine where Debezium and MySQL are running). In the Observer instance, we have deployed Prometheus, Grafana and YCSB
68
72
(the software piece generating the load). In the Observed instance, we have a single Kafka container (although Kafka is not strictly necessary and other software can be used in its place),
69
-
Debezium Server and MySQL.
73
+
Debezium Server, and MySQL.
70
74
71
75
[.centered-image.responsive-image]
72
76
====
@@ -76,7 +80,7 @@ Debezium Server and MySQL.
76
80
*Benchmarking architecture*
77
81
====
78
82
79
-
As described above, we are using some software pieces, most of them are already known, except for YCSB. Yahoo! Cloud Serving Benchmark (YCSB), is a tool created by the Yahoo
83
+
As per the preceding description, we are using several software pieces, most of which are already known, except for Yahoo! Cloud Serving Benchmark (YCSB). YCSB is a tool created by the Yahoo
80
84
team and released into open source as of 04/23/2010, developed to evaluate the performance of different cloud serving stores. This is the link to their GitHub project: https://ycsb.site[https://ycsb.site].
81
85
82
86
If you want to reproduce this scenario, it is quite simple. Go to debezium-performance[ADD_REPO_URL_HERE] repo, clone it and cd into the infrastructure-automation folder and then run terraform apply.
@@ -120,11 +124,11 @@ And the other ones are the memory consumption in MB throughout the execution. Th
120
124
*Throughput evolution during one execution*
121
125
====
122
126
123
-
In another example, we have the Debezium Server throughput during one of the executions. This throughput chart is split into two parts, which comes from YCSB behavior.
127
+
In this example, we see the Debezium Server throughput from another execution of the test. This throughput chart is split into two parts, which comes from YCSB behavior.
124
128
YCSB executions are divided into two phases: an initial load phase that populates the tables, followed by a workload phase that executes the configured mix of operations. This behavior explains the two distinct phases visible in the throughput charts.
125
129
If the first half keeps the throughput at the set level, we consider the execution successful (OK/1), otherwise if something in the process makes it fail we consider it a failed execution (NOK/0).
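The OK/NOK rule can be sketched in a few lines of Python. This is only a minimal illustration, not the code used in the benchmark: the function name `classify_execution`, the 5% tolerance, and the choice to check every sample of the first half against a lower bound are all assumptions made for the example.

```python
def classify_execution(samples, target_ops, tolerance=0.05):
    """Return 1 (OK) or 0 (NOK) for one execution.

    samples:    throughput readings in ops/s, in time order (hypothetical input)
    target_ops: the throughput level set for the execution
    tolerance:  assumed allowed shortfall; the post does not define one
    """
    if not samples:
        return 0  # nothing was measured, so treat the execution as failed
    # Only the first half of the readings is checked against the set level.
    first_half = samples[: max(1, len(samples) // 2)]
    floor = target_ops * (1 - tolerance)
    return 1 if all(s >= floor for s in first_half) else 0
```

For example, `classify_execution([600, 598, 601, 300, 310, 305], 600)` yields 1 because every reading in the first half stays above the assumed floor of 570, while `classify_execution([600, 420, 601, 300], 600)` yields 0.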
126
130
127
-
In the picture below, you can see a detailed view in the second phase of the execution where we are running 300 updates per second and 300 inserts per second.
131
+
The image that follows shows a detailed view of the second phase of the execution, which runs 300 updates per second and 300 inserts per second.
128
132
129
133
[.centered-image.responsive-image]
130
134
====
@@ -137,8 +141,8 @@ In the picture below, you can see a detailed view in the second phase of the exe
137
141
[id=results-analysis]
138
142
== Results and Analysis
139
143
140
-
Let me now show you the results. By examining these results, it is possible to gain a clear view of how the system behaves under varying levels of load and
141
-
to identify trends that may not be evident from isolated measurements. But before getting into detailed analysis, you can find the raw results here[ADD_REPO_URL_HERE_RESULTS_FILE].
144
+
Let's now examine the results to gain a clear view of how the system behaves under varying levels of load and
145
+
to identify trends that might not be evident from isolated measurements. Before getting into the detailed analysis, note that you can view the raw results here[ADD_REPO_URL_HERE_RESULTS_FILE].
142
146
143
147
[.centered-image.responsive-image]
144
148
====
@@ -148,7 +152,7 @@ to identify trends that may not be evident from isolated measurements. But befor
148
152
*Debezium Server streaming results screenshot for MySQL*
149
153
====
150
154
151
-
With the image above in mind, let's understand what each of the columns means.
155
+
With the preceding image in mind, let's consider the meaning of each column.
152
156
153
157
- TABLE_RECORDS: Desired table records we are going to work with in the execution.
154
158
- TABLE_TARGET_OPS: Desired operations per second we are going to perform during the execution.