Merge branch 'main' of https://github.com/advancedmysql/The-Art-of-Problem-Solving-in-Software-Engineering_How-to-Make-MySQL-Better

wangbin579 · wangbin579 · commit f110564ac895 · 2024-09-02T13:50:59.000+08:00
diff --git a/Chapter1.md b/Chapter1.md
@@ -62,7 +62,7 @@ Meta's case study demonstrates that the new solution based on the Raft protocol
 
 With the transaction isolation level set to Read Committed, simulations based on Group Replication were conducted under various network latency conditions.
 
-The deployment setup of Group Replication is illustrated as follows: On machine A, two MySQL instances are deployed�one serving as the primary and the other as the secondary. These two instances form the majority and communicate via localhost. Machine B hosts a third instance deployed as a member of the cluster, with a network latency of X milliseconds.
+The deployment setup of Group Replication is illustrated as follows: On machine A, two MySQL instances are deployed—one serving as the primary and the other as the secondary. These two instances form the majority and communicate via localhost. Machine B hosts a third instance deployed as a member of the cluster, with a network latency of X milliseconds.
 
 ![](media/792556dcff5f267dcc5aefeb5ef0d035.png)
 
diff --git a/Chapter10.md b/Chapter10.md
@@ -10,7 +10,7 @@ Whether using asynchronous replication, semisynchronous replication, or Group Re
 
 ### 10.1.1 Ensuring Replay Correctness
 
-To ensure correct replay, it is necessary to establish dependencies between transactions. If there are conflicts between two transactions, the replay order must be determined�specifically, which transaction should be replayed first and which should follow. These dependencies are based on the transaction order in the binlog or relay log files.
+To ensure correct replay, it is necessary to establish dependencies between transactions. If there are conflicts between two transactions, the replay order must be determined—specifically, which transaction should be replayed first and which should follow. These dependencies are based on the transaction order in the binlog or relay log files.
 
 Once the dependencies are established, ensuring the idempotence of replay is crucial. This property is essential, especially in scenarios like crash recovery, to guarantee that transactions can be replayed correctly and consistently without unintended side effects.
 
@@ -125,7 +125,7 @@ In terms of performance, the queue model for MySQL secondary replay can be simpl
 
 Figure 10-5. The queue model for MySQL secondary replay.
 
-In MySQL secondary replay, multi-queue stages�such as for relay log flushing, transaction event replay (including reading, parsing, and queueing events), and commit operations�restrict the theoretical maximum replay speed. These serialized processes create inherent limits on how quickly the replay can proceed.
+In MySQL secondary replay, multi-queue stages—such as for relay log flushing, transaction event replay (including reading, parsing, and queueing events), and commit operations—restrict the theoretical maximum replay speed. These serialized processes create inherent limits on how quickly the replay can proceed.
 
 ## 10.2 Root Cause Analysis of Slow MySQL Replay
 
@@ -248,7 +248,7 @@ bool Mts_submode_logical_clock::wait_for_last_committed_trx(
 }
 ```
 
-The code describes a mechanism where the SQL thread waits if the recorded low-water-mark (LWM)�which signifies that a transaction and all prior transactions have been committed�is less than the last committed value of the transaction being replayed. In MySQL, it is the SQL thread that waits, rather than the worker threads. This waiting mechanism significantly restricts the replay speed.
+The code describes a mechanism where the SQL thread waits if the recorded low-water-mark (LWM)—which signifies that a transaction and all prior transactions have been committed—is less than the last committed value of the transaction being replayed. In MySQL, it is the SQL thread that waits, rather than the worker threads. This waiting mechanism significantly restricts the replay speed.
 
 Finally, let's examine the problems related to MySQL secondary replay in a NUMA environment. The following figure shows the test results of MySQL secondary replay:
 
@@ -453,7 +453,7 @@ Reducing the size of the binlog theoretically helps improve MySQL replay speed.
 
 Figure 10-23. Achieve better replay speed with binlog_row_image=minimal.
 
-When using full mode for binlog, MySQL achieves a balanced replay speed of just over 790,000 tpmC. Switching to minimal mode, however, increases this speed to over 890,000 tpmC, representing a significant 13% improvement. This improvement highlights that setting *binlog_row_image=minimal*�which substantially reduces the binlog size�boosts the replay speed of MySQL secondaries. However, it's important to note that this setting may also pose a risk of incomplete data restoration in certain scenarios.
+When using full mode for binlog, MySQL achieves a balanced replay speed of just over 790,000 tpmC. Switching to minimal mode, however, increases this speed to over 890,000 tpmC, representing a significant 13% improvement. This improvement highlights that setting *binlog_row_image=minimal*—which substantially reduces the binlog size—boosts the replay speed of MySQL secondaries. However, it's important to note that this setting may also pose a risk of incomplete data restoration in certain scenarios.
 
 ### 10.3.8 Impact of Performance Schema on Replay Performance
 
diff --git a/Chapter3.md b/Chapter3.md
@@ -26,7 +26,7 @@ When a problem recurs, it often reveals underlying characteristics, facilitating
 
 ### 3.1.4 Strategies to Increase Reproducibility
 
-Many problems are environment-specific, often manifesting sporadically, especially under high concurrency. The challenge lies in addressing these infrequent occurrences, which may happen just once every few months. Increasing the frequency of problem reproduction�from every few months to every few hours or minutes�significantly simplifies their resolution.
+Many problems are environment-specific, often manifesting sporadically, especially under high concurrency. The challenge lies in addressing these infrequent occurrences, which may happen just once every few months. Increasing the frequency of problem reproduction—from every few months to every few hours or minutes—significantly simplifies their resolution.
 
 How can this be achieved? Capturing patterns in problem occurrence is crucial. For example, when addressing simultaneous failures in Group Replication that sporadically freeze views, analyzing statistical patterns reveals critical insights. These problems often cluster around specific thresholds. Adjusting lower-level communication timeout settings to align with network interruption durations enables more frequent problem reproduction. Once these critical factors are understood, the likelihood of reproducing problems increases significantly, laying a solid foundation for effective problem resolution.