Commit ff8142a

committed
Update PR comments
Signed-off-by: AlvarVG <alvigo92@gmail.com>
1 parent 312073c commit ff8142a

1 file changed

+21
-17
lines changed

_posts/2026-01-08-measuring-debezium-server-performance-mysql-streaming.adoc

Lines changed: 21 additions & 17 deletions
@@ -16,6 +16,10 @@ This post does not aim to compare Debezium Server with Kafka Connect nor to iden
1616
Instead, it focuses on understanding how Debezium Server behaves under realistic workloads using default configurations,
1717
and whether it can sustain real-time change data capture in lightweight architectures.
1818

19+
It is also important to underline that, in this evaluation, the database is not stressed in a fully real-world scenario.
20+
Conditions such as long-running transactions or other complex workload patterns are not represented,
21+
as incorporating them would significantly increase the complexity of the test case.
22+
1923
[id=introduction]
2024
== Introduction
2125

@@ -24,15 +28,15 @@ Performance measurement transforms software from a black box into an understanda
2428

2529
'What is not measured cannot be controlled, predicted or reliably improved'
2630

27-
Going beyond sayings and proverbs, we can make the following assertions about performance.
31+
Going beyond sayings and proverbs, performance measurement provides you with evidence to inform and drive architectural decisions.
32+
Specifically, performance testing can help you by providing information that enables you to:
2833

29-
- Validates that the system meets the requirements
30-
- Identifies bottlenecks and limiting factors
31-
- Enables predictable scalability
32-
- Supports data-driven architectural decisions
33-
- Reduces operational and infrastructure costs
34-
- Improves reliability and resilience
35-
- Builds trust with users and stakeholders
34+
- Validate that the system meets requirements
35+
- Identify bottlenecks and limiting factors
36+
- Help ensure predictable scalability
37+
- Reduce operational and infrastructure costs
38+
- Deliver improvements in reliability and resilience
39+
- Build trust with users and stakeholders
3640

3741
[id=debezium-server]
3842
== Debezium Server
@@ -41,7 +45,7 @@ Apart from performance tests, another great unknown in the CDC community is *Deb
4145
But let me introduce you to Debezium Server.
4246
Debezium Server is a standalone runtime for Change Data Capture that captures database changes using Debezium and delivers them directly to configurable sinks without requiring Kafka or Kafka Connect.
4347
It addresses a common gap in CDC architectures: situations where Kafka Connect, Kubernetes, or large distributed platforms introduce unnecessary operational complexity.
44-
It enables teams to adopt CDC with minimal infrastructure while still benefiting from Debezium’s mature and battle-tested connectors.
48+
It enables teams to adopt CDC with minimal infrastructure while still benefiting from Debezium’s mature and proven connectors.
4549

4650
[.centered-image.responsive-image]
4751
====
@@ -64,9 +68,9 @@ It runs as a single process and although Kafka is not required, it preserves the
6468
Now that we are familiar with all the components, let us dive into the technical details.
6569

6670
Starting with the hardware, we have deployed two EC2 machines in AWS of type "c5a.xlarge". Each instance provides 4 vCPUs and 8 GiB of memory, ensuring sufficient resources for both workload generation and CDC processing.
67-
We have called them Observed and Observer (in order to separate the monitoring load from the machine where Debezium and MySQL are running). In the Observer instance, we have deployed Prometheus, Grafana and YCSB
71+
We have called these two instances _Observed_ and _Observer_ (in order to separate the monitoring load from the machine where Debezium and MySQL are running). In the Observer instance, we have deployed Prometheus, Grafana and YCSB
6872
(the software generating the load). In the Observed instance, we have a single Kafka container (although it is not strictly necessary and other software can be used instead),
69-
Debezium Server and MySQL.
73+
Debezium Server, and MySQL.
7074

7175
[.centered-image.responsive-image]
7276
====
@@ -76,7 +80,7 @@ Debezium Server and MySQL.
7680
*Benchmarking architecture*
7781
====
7882

79-
As described above, we are using some software pieces, most of them are already known, except for YCSB. Yahoo! Cloud Serving Benchmark (YCSB), is a tool created by the Yahoo
83+
As per the preceding description, we are using several software pieces, most of which are already known, except for Yahoo! Cloud Serving Benchmark (YCSB). The YCSB tool, which was created by the Yahoo
8084
team and released as open source on 04/23/2010, was developed to evaluate the performance of different cloud serving stores. This is the link to their GitHub project: https://ycsb.site[https://ycsb.site].
8185

8286
If you want to reproduce this scenario, it is quite simple. Go to the debezium-performance[ADD_REPO_URL_HERE] repo, clone it, cd into the infrastructure-automation folder, and then run terraform apply.
@@ -120,11 +124,11 @@ And the other ones are the memory consumption in MB throughout the execution. Th
120124
*Throughput evolution during one execution*
121125
====
122126

123-
In another example, we have the Debezium Server throughput during one of the executions. This throughput chart is split into two parts, which comes from YCSB behavior.
127+
In this example, we see the Debezium Server throughput from another execution of the test. This throughput chart is split into two parts, which comes from YCSB behavior.
124128
YCSB executions are divided into two phases: an initial load phase that populates the tables, followed by a workload phase that executes the configured mix of operations. This behavior explains the two distinct phases visible in the throughput charts.
125129
If the first half keeps the throughput at the set level, we consider the execution successful (OK/1); otherwise, if something in the process makes it fail, we consider it a failed execution (NOK/0).
126130

127-
In the picture below, you can see a detailed view in the second phase of the execution where we are running 300 updates per second and 300 inserts per second.
131+
The image that follows shows a detailed view of the second phase of the execution, which runs 300 updates per second and 300 inserts per second.
128132

129133
[.centered-image.responsive-image]
130134
====
@@ -137,8 +141,8 @@ In the picture below, you can see a detailed view in the second phase of the exe
137141
[id=results-analysis]
138142
== Results and Analysis
139143

140-
Let me now show you the results. By examining these results, it is possible to gain a clear view of how the system behaves under varying levels of load and
141-
to identify trends that may not be evident from isolated measurements. But before getting into detailed analysis, you can find the raw results here[ADD_REPO_URL_HERE_RESULTS_FILE].
144+
Let's now examine the results to gain a clear view of how the system behaves under varying levels of load, and
145+
to identify trends that might not be evident from isolated measurements. If you want to review the detailed analysis, you can view the raw results [ADD_REPO_URL_HERE_RESULTS_FILE].
142146

143147
[.centered-image.responsive-image]
144148
====
@@ -148,7 +152,7 @@ to identify trends that may not be evident from isolated measurements. But befor
148152
*Debezium Server streaming results screenshot for MySQL*
149153
====
150154

151-
With the image above in mind, let's understand what each of the columns means.
155+
With the preceding image in mind, let's consider the meaning of each column.
152156

153157
- TABLE_RECORDS: Desired table records we are going to work with in the execution.
154158
- TABLE_TARGET_OPS: Desired operations per second we are going to perform during the execution.
