Commit f147b5b

Update more PR comments
Signed-off-by: AlvarVG <alvigo92@gmail.com>
1 parent ff8142a commit f147b5b

3 files changed: +51 -79 lines changed

_posts/2026-01-08-measuring-debezium-server-performance-mysql-streaming.adoc

Lines changed: 51 additions & 79 deletions
@@ -26,7 +26,7 @@ as incorporating them would significantly increase the complexity of the test ca
2626
Measuring software performance is important. It directly affects reliability, scalability, cost efficiency and user trust. It is a prerequisite for making informed engineering decisions.
2727
Performance measurement transforms software from a black box into an understandable, predictable and optimizable system. It is well known that:
2828

29-
'What is not measured cannot be controlled, predicted or reliably improved'
29+
_"What is not measured cannot be controlled, predicted or reliably improved"_
3030

3131
Going beyond sayings and proverbs, performance measurement provides you with evidence to inform and drive architectural decisions.
3232
Specifically, performance testing can help you by providing information that enables you to:
@@ -41,24 +41,23 @@ Specifically, performance testing can help you by providing information that ena
4141
[id=debezium-server]
4242
== Debezium Server
4343

44-
Apart from performance tests, another great unknown in the CDC community is *Debezium Server*. If you are reading this, you probably know something about Debezium and CDC.
44+
Apart from performance tests, another great unknown in the CDC community might be *Debezium Server*. If you are reading this, you probably know something about Debezium and CDC.
4545
But let me introduce you to Debezium Server.
4646
Debezium Server is a standalone runtime for Change Data Capture that captures database changes using Debezium and delivers them directly to configurable sinks without requiring Kafka or Kafka Connect.
4747
It addresses a common gap in CDC architectures: situations where Kafka Connect, Kubernetes, or large distributed platforms introduce unnecessary operational complexity.
4848
It enables teams to adopt CDC with minimal infrastructure while still benefiting from Debezium’s mature and proven connectors.
4949

50-
[.centered-image.responsive-image]
51-
====
5250
++++
53-
<img src="/assets/images/debezium-server-architecture.png" style="max-width:70%;" class="responsive-image" alt="Debezium Server Architecture">
51+
<div class="imageblock centered-image">
52+
<img src="/assets/images/debezium-server-architecture.png" class="responsive-image" alt="Debezium Server Architecture">
53+
</div>
5454
++++
55-
*Debezium Server architecture*
56-
====
5755

5856
Debezium Server is built on top of the Debezium Engine, packaging it into a production-ready application that can run as a single JVM process.
5957
It runs as a single process and although Kafka is not required, it preserves the core CDC guarantees:
60-
- Exactly-once offset management (per connector)
61-
- Ordered event delivery per table
58+
59+
- At-least-once offset management (per connector)
60+
- Ordered event delivery per table (depends on the connector)
6261
- Snapshotting and streaming of the changes in the database
6362
- Rich metadata about transactions and schemas
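With at-least-once semantics, an event can be redelivered after a restart, so a consumer usually needs to tolerate duplicates. The following Python sketch shows one common way to do that, assuming each event carries a position that increases monotonically per table. The `(table, position, payload)` tuple shape and the `apply_events` name are hypothetical and are not Debezium's actual event format or API.

```python
def apply_events(events, last_applied=None):
    """Apply change events idempotently, skipping redelivered ones.

    Each event is a hypothetical (table, position, payload) tuple; the
    position is assumed to increase monotonically within a table.
    """
    last_applied = dict(last_applied or {})
    applied = []
    for table, position, payload in events:
        # A position at or below the last applied one is a redelivery.
        if position <= last_applied.get(table, -1):
            continue
        applied.append((table, position, payload))
        last_applied[table] = position
    return applied, last_applied


# Hypothetical stream in which one event is delivered twice.
events = [
    ("customers", 1, "insert"),
    ("customers", 2, "update"),
    ("customers", 2, "update"),  # redelivered duplicate
    ("orders", 1, "insert"),
]
applied, positions = apply_events(events)
print(len(applied), positions)
```

Tracking the last position per table keeps the check cheap and relies only on per-table ordering, not on any global ordering across tables.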
6463

@@ -72,28 +71,24 @@ We have called these two instances _Observed_ and _Observer_ (in order to separa
7271
(the software piece generating the load). In the Observed instance, we have a single Kafka container (although it is not really necessary and another software can be used),
7372
Debezium Server, and MySQL.
7473

75-
[.centered-image.responsive-image]
76-
====
7774
++++
78-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/benchmark_architecture_blueprint.png" style="max-width:70%;" class="responsive-image" alt="Benchmarking architecture">
75+
<div class="imageblock centered-image">
76+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/benchmark_architecture_blueprint.png" class="responsive-image" alt="Benchmarking architecture">
77+
</div>
7978
++++
80-
*Benchmarking architecture*
81-
====
8279

8380
As per the preceding description, we are using several software pieces, most of which are already known, except for Yahoo! Cloud Serving Benchmark (YCSB). The YCSB tool, which was created by the Yahoo
8481
team and released as open source on 04/23/2010, was developed to evaluate the performance of different cloud serving stores. This is the link to their GitHub project: https://ycsb.site[https://ycsb.site].
8582

86-
If you want to reproduce this scenario, it is quite simple. Go to debezium-performance[ADD_REPO_URL_HERE] repo, clone it and cd into the infrastructure-automation folder and then run terraform apply.
83+
If you want to reproduce this scenario, it is quite simple. Go to debezium-performance[ADD_REPO_URL_HERE] repo, clone it and cd into the infrastructure-automation folder and then run `terraform apply`.
8784
This will automatically create the EC2 instances, provision them with ansible (copy files and install docker) and start the scenario.
8885
Logging in to Grafana will allow you to see how everything evolves and progresses (inside Grafana you will need to navigate to the MySQL Streaming dashboard).
8986

90-
[.centered-image.responsive-image]
91-
====
9287
++++
93-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/benchmark_completed_scenario.png" style="max-width:70%;" class="responsive-image" alt="Streaming dashboard from a completed scenario">
88+
<div class="imageblock centered-image">
89+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/benchmark_completed_scenario.png" class="responsive-image" alt="Streaming dashboard from a completed scenario">
90+
</div>
9491
++++
95-
*Streaming dashboard from a completed scenario*
96-
====
9792

9893
With the technology stack clearly defined and reproducible, we can now focus on how performance will be measured and which metrics will be considered:
9994

@@ -105,52 +100,44 @@ With the technology stack clearly defined and reproducible, we can now focus on
105100
For all the given metrics, we are calculating the average over the duration of the test. This way, we provide a stable value, and any possible spikes that may exist do not affect
106101
the final result and are included in it. These are a few more detailed panels of the dashboard.
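As a rough illustration of the averaging described above, this Python sketch computes the mean of the samples collected during one run. The sample values and the `average_over_run` name are hypothetical and are not taken from the benchmark; the point is only that a short spike contributes to the average without dominating it.

```python
from statistics import fmean


def average_over_run(samples):
    """Return the arithmetic mean of all samples collected during one test run."""
    if not samples:
        raise ValueError("at least one sample is required")
    return fmean(samples)


# Hypothetical CPU usage samples (percent), including one short spike.
cpu_samples = [20.0, 20.0, 20.0, 80.0, 20.0, 20.0]
run_average = average_over_run(cpu_samples)
peak = max(cpu_samples)
print(f"average={run_average:.1f}% peak={peak:.1f}%")
```

In this made-up series the spike raises the run average noticeably less than it raises the peak, which is why the average is the steadier number to compare across executions.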
107102

108-
[.centered-image.responsive-image]
109-
====
110103
++++
111-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/cpu_memory_consumption.png" style="max-width:70%;" class="responsive-image" alt="CPU and Memory consumption during one execution">
104+
<div class="imageblock centered-image">
105+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/cpu_memory_consumption.png" class="responsive-image" alt="CPU and Memory consumption during one execution">
106+
</div>
112107
++++
113-
*CPU and Memory consumption during one execution*
114-
====
115108

116109
In the picture above, you can see a few of the panels related to resource consumption. The two on top show CPU usage, calculated as a percentage for a 4-vCPU instance.
117110
And the other ones are the memory consumption in MB throughout the execution. The two on the left represent the evolution over time, and the ones on the right are the all-time average.
118111

119-
[.centered-image.responsive-image]
120-
====
121112
++++
122-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/throughput_time_evolution.png" style="max-width:70%;" class="responsive-image" alt="Throughput evolution during one execution">
113+
<div class="imageblock centered-image">
114+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/throughput_time_evolution.png" class="responsive-image" alt="Throughput evolution during one execution">
115+
</div>
123116
++++
124-
*Throughput evolution during one execution*
125-
====
126117

127118
In this example, we see the Debezium Server throughput from another execution of the test. This throughput chart is split into two parts, which comes from YCSB behavior.
128119
YCSB executions are divided into two phases: an initial load phase that populates the tables, followed by a workload phase that executes the configured mix of operations. This behavior explains the two distinct phases visible in the throughput charts.
129120
If the first half keeps the throughput at the set level, we consider the execution successful (OK/1), otherwise if something in the process makes it fail we consider it a failed execution (NOK/0).
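The OK/NOK rule above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `classify_execution` name, the 5% tolerance, and the sample values are hypothetical, since the post does not say how strictly "at the set level" is evaluated.

```python
def classify_execution(first_phase_throughput, target_ops_per_sec, tolerance=0.05):
    """Return 1 (OK) if every first-phase sample stays near the target, else 0 (NOK).

    The tolerance is an assumed parameter: a sample counts as "at the set
    level" when it is no more than `tolerance` below the target rate.
    """
    if not first_phase_throughput:
        return 0  # no samples at all is treated as a failed execution
    lower_bound = target_ops_per_sec * (1 - tolerance)
    if all(sample >= lower_bound for sample in first_phase_throughput):
        return 1
    return 0


# Hypothetical throughput samples (operations per second) for two runs.
steady_run = classify_execution([600, 598, 601], target_ops_per_sec=600)
dropping_run = classify_execution([600, 400, 610], target_ops_per_sec=600)
print(steady_run, dropping_run)
```

Requiring every sample to pass is the strictest reading of the rule; a looser variant could compare the phase average against the target instead.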
130121

131122
The image that follows shows a detailed view of the second phase of the execution, which runs 300 updates per second and 300 inserts per second.
132123

133-
[.centered-image.responsive-image]
134-
====
135124
++++
136-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/throughput_detailed_view.png" style="max-width:70%;" class="responsive-image" alt="Detailed view of the throughput panel">
125+
<div class="imageblock centered-image">
126+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/throughput_detailed_view.png" class="responsive-image" alt="Detailed view of the throughput panel">
127+
</div>
137128
++++
138-
*Detailed view of the throughput panel*
139-
====
140129

141130
[id=results-analysis]
142131
== Results and Analysis
143132

144133
Let's now examine the results to gain a clear view of how the system behaves under varying levels of load, and
145134
to identify trends that might not be evident from isolated measurements. If you want to review the detailed analysis, you can view the raw results [ADD_REPO_URL_HERE_RESULTS_FILE].
146135

147-
[.centered-image.responsive-image]
148-
====
149136
++++
150-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/debezium_server_streaming_mysql_results.png" style="max-width:70%;" class="responsive-image" alt="Debezium Server streaming results screenshot for MySQL">
137+
<div class="imageblock centered-image">
138+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/debezium_server_streaming_mysql_results.png" class="responsive-image" alt="Debezium Server streaming results screenshot for MySQL">
139+
</div>
151140
++++
152-
*Debezium Server streaming results screenshot for MySQL*
153-
====
154141

155142
With the preceding image in mind, let's consider the meaning of each column.
156143

@@ -181,13 +168,11 @@ This chart does not follow a single uniform growth pattern solely driven by the
181168
this behavior is not consistent across all tables. Several tables show relatively high execution times even at lower operation counts, whereas others maintain low durations despite handling more operations.
182169
This indicates that execution time is strongly influenced by table-specific factors rather than by workload size alone.
183170

184-
[.centered-image.responsive-image]
185-
====
186171
++++
187-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/duration_by_ops_table.png" style="max-width:70%;" class="responsive-image" alt="Duration by operation per second and table amount">
172+
<div class="imageblock centered-image">
173+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/duration_by_ops_table.png" class="responsive-image" alt="Duration by operation per second and table amount">
174+
</div>
188175
++++
189-
*Duration by operation per second and table amount*
190-
====
191176

192177
This chart shows the average CPU utilization. It does not scale linearly with the number of operations and varies across tables.
193178
In the initial table groups, CPU increases as the workload grows, reaching peak values at a mid-range operation count, after which it stabilizes or even decreases despite higher numbers of operations.
@@ -196,80 +181,67 @@ For subsequent tables, CPU usage remains within a relatively narrow range, with
196181
CPU utilization is influenced more by table-specific processing characteristics and execution efficiency than by total operation volume alone.
197182
Other factors - I/O wait, batching or internal optimizations - may play a more significant role in performance at higher workloads.
198183

199-
[.centered-image.responsive-image]
200-
====
201184
++++
202-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/cpu_by_ops_table.png" style="max-width:70%;" class="responsive-image" alt="CPU usage by operation per second and table amount">
185+
<div class="imageblock centered-image">
186+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/cpu_by_ops_table.png" class="responsive-image" alt="CPU usage by operation per second and table amount">
187+
</div>
203188
++++
204-
*CPU usage by operation per second and table amount*
205-
====
206189

207190
The memory consumption remains stable across different operation counts and tables, with no clear relationship to the number of operations executed. Most measurements cluster within a similar range, suggesting a consistent baseline memory footprint largely independent of workload size.
208-
The chart suggests that memory usage is dominated by structural or configuration-related factors rather than by the volume of operations, and that memory is unlikely to be the primary scaling constraint.
191+
The chart suggests that memory usage is dominated by logical database schema or configuration-related factors rather than by the volume of operations, and that memory is unlikely to be the primary scaling constraint.
209192

210-
[.centered-image.responsive-image]
211-
====
212193
++++
213-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/memory_by_ops_table.png" style="max-width:70%;" class="responsive-image" alt="Memory usage by operation per second and table amount">
194+
<div class="imageblock centered-image">
195+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/memory_by_ops_table.png" class="responsive-image" alt="Memory usage by operation per second and table amount">
196+
</div>
214197
++++
215-
*Memory usage by operation per second and table amount*
216-
====
217198

218199
The network usage varies significantly across tables and operations counts, with output traffic consistently higher than input traffic in all cases. In the initial table groups, both input and output volumes increase as the number of operations grows, indicating a clear workload-driven effect on network utilization.
219200
However, this pattern becomes less consistent in later tables, where network usage remains high or fluctuates despite changes in total operations. This suggests that data transfer is influenced more by table-specific characteristics than by the number of operations.
220201

221202
Overall, this indicates that network utilization is a dominant and variable factor in system behavior, particularly on the output side, and should be considered a potential constraint depending on table structure and data change patterns.
222203

223-
[.centered-image.responsive-image]
224-
====
225204
++++
226-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/network_by_ops_table.png" style="max-width:70%;" class="responsive-image" alt="Network usage by operation per second and table amount">
205+
<div class="imageblock centered-image">
206+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/network_by_ops_table.png" class="responsive-image" alt="Network usage by operation per second and table amount">
207+
</div>
227208
++++
228-
*Network usage by operation per second and table amount*
229-
====
230209

231210
Disk write activity remains relatively stable across most tables and operation counts, indicating a consistent write pattern independent of workload size.
232211
In contrast, disk read activity is minimal or negligible in the early table groups and becomes more prominent only in later tables, where noticeable spikes appear at specific operation counts.
233212
One table in particular exhibits a pronounced write peak compared to all others, suggesting an isolated behavior such as increased flushing or checkpointing.
234213
This behavior is typically observed during log file rotation or log exchanges, where disk activity is driven primarily by the number of log files generated and the overall execution duration, rather than by the volume of operations.
235214

236-
[.centered-image.responsive-image]
237-
====
238215
++++
239-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/disk_by_ops_table.png" style="max-width:70%;" class="responsive-image" alt="Disk usage by operation per second and table amount">
216+
<div class="imageblock centered-image">
217+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/disk_by_ops_table.png" class="responsive-image" alt="Disk usage by operation per second and table amount">
218+
</div>
240219
++++
241-
*Disk usage by operation per second and table amount*
242-
====
243-
244220

245221
The maximum throughput achieved varies noticeably across tables, indicating different upper performance limits depending on table characteristics.
246222
Tables 1 and 2 reach the highest values, demonstrating a greater capacity to sustain higher operation rates, while subsequent tables show a gradual reduction in maximum achievable throughput.
247223
This decrease is not strictly linear, as some tables outperform others despite their position, but it clearly reflects table-specific constraints such as data structure, indexing or processing complexity.
248224

249225
The chart highlights that maximum operations per second are primarily driven by table design and access patterns rather than by uniform system limits, reinforcing the need for table-level performance evaluation.
250226

251-
[.centered-image.responsive-image]
252-
====
253227
++++
254-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/max_ops_by_table.png" style="max-width:70%;" class="responsive-image" alt="Maximum operation per second by table amount">
228+
<div class="imageblock centered-image">
229+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/max_ops_by_table.png" class="responsive-image" alt="Maximum operation per second by table amount">
230+
</div>
255231
++++
256-
*Maximum operation per second by table amount*
257-
====
258232

259233
But, *what is Debezium Server's observed upper bound in number of operations per second?*
260234

261235
In this comparison we can see the relationship between MySQL and Debezium Server throughput. The initial peak observed at the start of the execution is the desired value we are targeting for this execution.
262236
MySQL fails to keep pace at such high values (1500 operations per second in 1 table). On the Debezium side, the throughput closely mirrors MySQL activity.
263-
This correlation confirms that the CDC pipeline is processing database changes efficiently and in near real time, with no apparent backlog or loss.
264-
Overall throughput behavior is driven by the database write pattern rather than by downstream processing constraints.
237+
In the lower panel, you can see that Debezium's internal queue maintains a high and stable capacity throughout the execution, with no indication of saturation or backlog accumulation.
238+
Overall, the chart confirms that Debezium Server processes MySQL changes efficiently and in near real time, and that the throughput behavior is driven by the database write pattern rather than by downstream processing constraints.
265239

266-
[.centered-image.responsive-image]
267-
====
268240
++++
269-
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/mysql_vs_debezium_server_throughput.png" style="max-width:70%;" class="responsive-image" alt="MySQL vs Debezium Server throughput comparison">
241+
<div class="imageblock centered-image">
242+
<img src="/assets/images/2026-01-08-measuring-debezium-server-performance-mysql-streaming/mysql_vs_debezium_server_throughput.png" class="responsive-image" alt="MySQL vs Debezium Server throughput comparison">
243+
</div>
270244
++++
271-
*MySQL vs Debezium Server throughput comparison*
272-
====
273245

274246
[id=conclusion-recommendations]
275247
== Conclusion and recommendations