- Release Date: 2025-11-25
- Type: Patch Release (Critical Bug Fixes)
- Previous Version: v1.3.1
Version 1.3.2 is a critical patch release that fixes severe binlog event parsing bugs that prevented MySQL replication from working correctly in v1.3.0 and v1.3.1. These bugs caused TABLE_MAP_EVENT and ROWS_EVENT parsing failures, leading to complete replication breakdown.
Severity: Critical - Replication Failure
Problem: BinlogEventParser was skipping the MySQL protocol OK byte (0x00) at the beginning of each binlog event buffer, but BinlogReader had already skipped it. This caused a 1-byte offset error in all event parsing.
Impact:
- All binlog events parsed with incorrect offset
- TABLE_MAP_EVENT parsing completely broken (database/table names read from wrong positions)
- ROWS_EVENT parsing failures (wrong event type detection, wrong field positions)
- Complete replication breakdown
- Silent failures (events skipped, no error messages in some cases)
Root Cause:
The MySQL C API mysql_binlog_fetch() returns a buffer with the following format:

    [OK byte (0x00)][binlog event data...]

Both BinlogReader and BinlogEventParser were skipping the OK byte:
- BinlogReader at line 598: `const unsigned char* event_buffer = rpl.buffer + 1;`
- BinlogEventParser at line 156: `buffer++; length--;`
As a result, every parser started reading one byte past the true start of the event.
Fix:
- Remove duplicate OK byte skip from BinlogEventParser::ParseBinlogEvent() (lines 156-157)
- Document that binlog_reader already skips the OK byte, so parsers receive clean event data
- Update all event parsers to use buffer directly without additional offset
- Add MySQL source code references (mysql-8.4.7/sql-common/client.cc, mysql-8.4.7/libs/mysql/binlog/event/binlog_event.h)
Files Changed:
- src/mysql/binlog_event_parser.cpp: Remove duplicate OK byte skip, add documentation (lines 153-163)
- src/mysql/binlog_reader.cpp: Document OK byte handling (lines 601-608)
- src/mysql/rows_parser.cpp: Update boundary calculations with proper documentation (lines 352-365)
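The single-skip rule described for this fix can be illustrated with a minimal sketch. The `EventView` type and the `StripOkByte` and `EventTypeOf` helpers are hypothetical names chosen for illustration; they are not taken from the MygramDB source. The sketch assumes the reader strips the OK byte exactly once and the parser reads the event type at offset 4 of the 19-byte header with no further adjustment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view over one binlog event (not a type from the real code base).
struct EventView {
  const unsigned char* data;  // first byte of the 19-byte event header
  std::size_t size;           // number of bytes available starting at data
};

// Reader side: strip the leading OK byte (0x00) exactly once.
// Returns an empty view if the packet is empty or does not start with 0x00.
inline EventView StripOkByte(const unsigned char* packet, std::size_t packet_len) {
  if (packet == nullptr || packet_len < 1 || packet[0] != 0x00) {
    return {nullptr, 0};
  }
  return {packet + 1, packet_len - 1};
}

// Parser side: the event type follows the 4-byte timestamp, so it sits at
// offset 4 of the header. No extra offset is applied here, because the
// reader has already removed the OK byte.
inline int EventTypeOf(const EventView& ev) {
  if (ev.data == nullptr || ev.size < 19) {
    return -1;  // not enough bytes for a complete header
  }
  return ev.data[4];
}
```

Keeping the skip in one place means a parser that receives an `EventView` never needs to know that the wire packet carried a status byte.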
Severity: Critical - Buffer Overrun and Parsing Failure
Problem: Event parsers (ParseWriteRowsEvent, ParseUpdateRowsEvent, ParseDeleteRowsEvent) were not excluding the 4-byte checksum at the end of each binlog event when calculating parsing boundaries.
Impact:
- Buffer overrun when parsing the last 4 bytes of row events
- Parsing failures when attempting to read data from checksum area
- Data corruption risk from reading invalid data
- UPDATE_ROWS_EVENT particularly affected (before/after image parsing)
Root Cause:
MySQL binlog events have the following structure:

    [event header (19 bytes)][event data][checksum (4 bytes)]

Even when checksums are disabled via `SET @source_binlog_checksum='NONE'`, MySQL still allocates 4 bytes at the end for checksum space (see BINLOG_CHECKSUM_LEN in mysql-8.4.7/libs/mysql/binlog/event/binlog_event.h).
The event parsers were using buffer + length or buffer + event_size as the end boundary, which included the checksum area.
Fix:
- Extract event_size from binlog header bytes [9-12] (little-endian)
- Calculate end boundary as `buffer + event_size - 4` (exclude BINLOG_CHECKSUM_LEN)
- Apply to all ROWS_EVENT parsers:
  - ParseWriteRowsEvent() (lines 352-365)
  - ParseUpdateRowsEvent() (lines 570-578)
  - ParseDeleteRowsEvent() (lines 972-980)
- Add MySQL source code references and detailed comments explaining checksum handling
Files Changed:
src/mysql/rows_parser.cpp: Fix boundary calculations in all ROWS_EVENT parsers
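The boundary calculation described in this fix can be sketched as follows. The function and constant names are hypothetical; only the header offsets (event size in bytes 9-12, little-endian), the 19-byte header length, and the 4-byte checksum length come from the notes above. The extra consistency checks are an assumption about what a defensive parser would want, not a description of the actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kEventHeaderLen = 19;  // common binlog event header
constexpr std::size_t kChecksumLen = 4;      // BINLOG_CHECKSUM_LEN
constexpr std::size_t kEventSizeOffset = 9;  // header bytes [9-12]

// Read a 32-bit little-endian integer from four bytes.
inline std::uint32_t ReadLe32(const unsigned char* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns a pointer one past the last byte of event data (checksum excluded),
// or nullptr if the header is incomplete or event_size is inconsistent with
// the number of bytes actually available.
inline const unsigned char* RowDataEnd(const unsigned char* buffer,
                                       std::size_t available) {
  if (buffer == nullptr || available < kEventHeaderLen) {
    return nullptr;
  }
  const std::size_t event_size = ReadLe32(buffer + kEventSizeOffset);
  if (event_size < kEventHeaderLen + kChecksumLen || event_size > available) {
    return nullptr;
  }
  return buffer + event_size - kChecksumLen;
}
```

A row parser would then loop while its cursor is strictly below the returned pointer, so the trailing 4 bytes are never interpreted as row data.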
Severity: Critical - MySQL 8.0 Replication Failure
Problem: MySQL 8.0 ROWS_EVENT_V2 includes extra_row_info with a packed integer length field. The code misinterpreted the length as data-only, but MySQL's format includes the packed integer itself in the total length.
Impact:
- Pointer position offset errors in MySQL 8.0 row event parsing
- Parsing failures for all INSERT/UPDATE/DELETE operations
- Complete replication breakdown on MySQL 8.0
- Silent failures (events appear but are not parsed)
Root Cause:
MySQL 8.0 ROWS_EVENT_V2 format (when flags & 0x0001):

    [extra_row_info_len (packed int)][extra_row_info_data]

The extra_row_info_len value is the TOTAL length including the packed integer itself. For example, if the packed integer is 1 byte with value 2, then:
- Total length = 2 bytes
- Packed integer = 1 byte
- Actual data = 1 byte
The code advanced with `ptr += extra_info_len` after it had already consumed the packed integer, which double-counted the packed integer's own length.
Fix:
- Read packed integer and calculate bytes consumed: `auto len_bytes = static_cast<int>(ptr - ptr_before);`
- Skip only the remaining data: `ptr += (extra_info_len - len_bytes);`
- Add boundary validation: `if (skip_bytes < 0 || ptr + skip_bytes > end) { error }`
- Apply to all ROWS_EVENT parsers:
  - ParseWriteRowsEvent() (lines 384-404)
  - ParseUpdateRowsEvent() (lines 629-649)
  - ParseDeleteRowsEvent() (lines 1005-1025)
- Add MySQL source code references (mysql-8.4.7/libs/mysql/binlog/event/rows_event.h)
Files Changed:
src/mysql/rows_parser.cpp: Fix extra_row_info length calculation in all ROWS_EVENT parsers
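The skip logic from this fix can be sketched under the length convention stated above (the length value counts the length field itself). `ReadPackedInt` and `SkipExtraRowInfo` are hypothetical helper names, and the packed-integer reader assumes MySQL's standard length-encoded integer layout (one byte for values below 251, otherwise a 0xFC, 0xFD, or 0xFE prefix followed by 2, 3, or 8 little-endian bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a length-encoded ("packed") integer and advance *ptr past it.
// Returns false if the read would pass `end` or the prefix is not a length.
inline bool ReadPackedInt(const unsigned char** ptr, const unsigned char* end,
                          std::uint64_t* out) {
  const unsigned char* p = *ptr;
  if (p >= end) return false;
  const unsigned char first = *p++;
  if (first < 0xFB) {  // single-byte value
    *out = first;
    *ptr = p;
    return true;
  }
  std::size_t extra = 0;
  if (first == 0xFC) extra = 2;
  else if (first == 0xFD) extra = 3;
  else if (first == 0xFE) extra = 8;
  else return false;  // 0xFB (NULL marker) and 0xFF are not lengths
  if (static_cast<std::size_t>(end - p) < extra) return false;
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < extra; ++i) {
    value |= static_cast<std::uint64_t>(p[i]) << (8 * i);
  }
  *out = value;
  *ptr = p + extra;
  return true;
}

// Skip extra_row_info, treating the decoded length as the TOTAL length
// (length field plus data). *ptr is only modified on success.
inline bool SkipExtraRowInfo(const unsigned char** ptr, const unsigned char* end) {
  const unsigned char* before = *ptr;
  const unsigned char* p = before;
  std::uint64_t total_len = 0;
  if (!ReadPackedInt(&p, end, &total_len)) return false;
  const std::uint64_t len_bytes = static_cast<std::uint64_t>(p - before);
  if (total_len < len_bytes) return false;  // remainder would be negative
  const std::uint64_t skip = total_len - len_bytes;
  if (skip > static_cast<std::uint64_t>(end - p)) return false;  // past end
  *ptr = p + skip;
  return true;
}
```

With the example from the notes (a 1-byte packed integer with value 2), the cursor advances by 2 bytes in total: 1 for the length field and 1 for the data.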
Severity: High - Operational Visibility
Problem: When the requested GTID position has been purged from MySQL binlogs (errno 1236), the system would retry indefinitely without clear indication that manual intervention (SYNC command) is required.
Impact:
- Infinite retry loops consuming resources
- No clear guidance for operators on required action
- Delayed incident response
- Confusion about replication state
Root Cause:
BinlogReader treated errno 1236 (ER_MASTER_FATAL_ERROR_READING_BINLOG) as a generic error and attempted to reconnect, even though this error means the GTID position is no longer available on the server.
Fix:
- Detect errno 1236 specifically in two locations:
  - mysql_binlog_open() failure (lines 410-423)
  - mysql_binlog_fetch() failure (lines 551-568)
- Stop replication immediately (set `should_stop_ = true`)
- Log structured error with clear action message: "Binlog position no longer available on server. GTID position has been purged. Manual intervention required: run SYNC command to establish new position."
- Prevent wasted reconnection attempts
Files Changed:
src/mysql/binlog_reader.cpp: Add errno 1236 detection and handling
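The decision between stopping and reconnecting can be sketched as a small pure function. The enum, struct, and function names here are hypothetical and exist only for illustration; the errno value 1236 and the stop message come from the notes above, while the reconnect message is a placeholder.

```cpp
#include <cassert>
#include <string>

// errno 1236, named ER_MASTER_FATAL_ERROR_READING_BINLOG in the notes above.
constexpr unsigned int kErrBinlogFatal = 1236;

enum class BinlogErrorAction { kReconnect, kStop };

struct BinlogErrorDecision {
  BinlogErrorAction action;
  std::string message;
};

// Map a MySQL client errno to the action the reader should take.
// A purged binlog position cannot be fixed by reconnecting, so it stops
// replication; every other error is treated as retryable in this sketch.
inline BinlogErrorDecision DecideOnBinlogError(unsigned int mysql_errno) {
  if (mysql_errno == kErrBinlogFatal) {
    return {BinlogErrorAction::kStop,
            "Binlog position no longer available on server. GTID position has "
            "been purged. Manual intervention required: run SYNC command to "
            "establish new position."};
  }
  return {BinlogErrorAction::kReconnect, "Binlog error; will reconnect."};
}
```

A caller would apply the same function after both the open and the fetch call, setting its stop flag when the returned action is `kStop`.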
Problem: TABLE_MAP_EVENT parsing failures were silent or produced unclear error messages.
Fix:
- Add detailed debug logs at each parsing step:
  - Buffer length and remaining bytes after each field extraction
  - Database name, table name, table_id values
  - Field lengths (db_len, table_len) before boundary checks
  - Error logs for all boundary validation failures
- Add structured error messages for each validation point
- Easier diagnosis of parsing failures
Files Changed:
src/mysql/binlog_event_parser.cpp: Add field-by-field debug logging (lines 506-587)
Problem: BinlogReader lifecycle events logged inconsistently, making it difficult to trace connection/replication state.
Fix:
- Convert info-level logs to structured logs using StructuredLog():
  - binlog_connection_init: Creating dedicated binlog connection
  - binlog_connection_validated: Connection validation successful
  - binlog_reader_started: Reader thread started with GTID
  - binlog_reader_stopped: Reader stopped with event count
  - binlog_gtid_set: GTID position updated
  - binlog_replication_start: Starting replication from GTID
  - binlog_reconnected: Reconnection successful
  - binlog_stream_opened: Binlog stream opened
  - binlog_connection_lost: Connection lost, will reconnect
  - binlog_error: Critical errors (binlog_purged, fetch failures)
- All structured logs include relevant context (GTID, error messages, errno)
- Better monitoring and alerting integration
Files Changed:
src/mysql/binlog_reader.cpp: Convert lifecycle logs to structured format
Problem: When replication appeared to start but no events were received, diagnosis was difficult.
Fix:
- Log first mysql_binlog_fetch() result with details:

      spdlog::debug("First mysql_binlog_fetch returned: result={}, size={}, buffer={}", result, rpl.size, (void*)rpl.buffer);

- Track and log when no data is returned repeatedly:

      static int no_data_count = 0;
      if (no_data_count % 100 == 1) {
        spdlog::debug("Binlog fetch returned no data (count={}). This may indicate: 1) No new events on MySQL, 2) GTID position issue, 3) Network keepalive", no_data_count);
      }

- Detect TABLE_MAP_EVENT parsing attempts and log results
- Easier diagnosis of replication issues
Files Changed:
src/mysql/binlog_reader.cpp: Add fetch diagnostics (lines 483-489, 586-594, 626-633)
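The repeated no-data logging follows a simple throttling pattern: count empty fetches and emit a message on the 1st, 101st, 201st occurrence, and so on. A minimal sketch, with a hypothetical `NoDataThrottle` class; the reset on receiving data is an assumption, since the notes do not say whether the real counter is ever reset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper that decides when an empty fetch is worth logging.
class NoDataThrottle {
 public:
  // Record one fetch that returned no data.
  // Returns true when a log line should be emitted (count % 100 == 1).
  bool OnNoData() {
    ++count_;
    return count_ % 100 == 1;
  }

  // Record a fetch that returned data. Resetting here is an assumption.
  void OnData() { count_ = 0; }

  std::uint64_t count() const { return count_; }

 private:
  std::uint64_t count_ = 0;
};
```

Keeping the counter in an object rather than a function-local static makes the behavior testable and gives each reader instance its own count.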
Problem: SYNC operations logged inconsistently without structured context.
Fix:
- Add structured logs for SYNC lifecycle:
  - SYNC start with configuration
  - SYNC completion with statistics
  - SYNC failure with error details
- Include context fields (table name, document count, duration)
- Better integration with monitoring systems
Files Changed:
src/server/sync_operation_manager.cpp: Add structured logs for SYNC operations
Problem: Production logs were too verbose with info-level logs for routine replication events.
Fix:
- Change routine replication logs to debug level:
  - Checksums disabled: info → debug
  - GTID set usage: info → debug
  - Empty GTID set: info → debug
  - Reconnect delays: info → debug
  - Connection validation: info → debug
  - Column name fetches: info → debug
  - Individual INSERT/UPDATE/DELETE events: info → debug
  - First few non-tracked table skips: remains at info level for awareness
- Keep important events at info level:
  - Connection establishment
  - Reader start/stop
  - GTID changes
  - Stream open
  - Reconnection events
  - Errors and warnings
Files Changed:
src/mysql/binlog_reader.cpp: Reduce log verbosity for production
Existing Tests Updated:
- All existing binlog parsing tests continue to pass
- Tests validated against MySQL 8.0 and MySQL 8.4 binlog formats
Test Files:
- tests/mysql/binlog_parsing_test.cpp: Validates OK byte handling fix
- tests/mysql/rows_parser_test.cpp: Validates checksum boundary fix
- Files Changed: 6
- Insertions: +263 lines
- Deletions: -46 lines
- Net Change: +217 lines
| Module | Files | Description |
|---|---|---|
| MySQL | 4 | Binlog parsing fixes, structured logging |
| Server | 1 | SyncOperationManager structured logging |
| Tests | 2 | Validation of parsing fixes |
No configuration changes required. This is a transparent bug fix release.
If you are running v1.3.0 or v1.3.1 with MySQL replication, you likely experienced:
- TABLE_MAP_EVENT parsing failures (events for tables not being processed)
- ROWS_EVENT parsing failures (INSERT/UPDATE/DELETE events ignored)
- Complete replication breakdown (no data synchronization from MySQL)
- Silent failures (events appear in binlog but are not applied)
Docker users:

    # Pull new image
    docker pull ghcr.io/libraz/mygram-db:v1.3.2
    # Update docker-compose.yml
    services:
      mygramdb:
        image: ghcr.io/libraz/mygram-db:v1.3.2
    # Restart container
    docker-compose up -d

RPM users:

    # Download and install
    sudo rpm -Uvh mygramdb-1.3.2-1.el9.x86_64.rpm
    # Restart service
    sudo systemctl restart mygramdb

Source build:

    git checkout v1.3.2
    cmake -B build -DCMAKE_BUILD_TYPE=Release
    cmake --build build --parallel
    sudo systemctl restart mygramdb

1. Verify replication is processing events:

       echo "REPLICATION STATUS" | nc localhost 3307
       # Should show: running
       # events_processed should be increasing

2. Check logs for successful TABLE_MAP parsing:

       tail -f /var/log/mygramdb/mygramdb.log | grep "TABLE_MAP"
       # Should show: "TABLE_MAP_EVENT parsed successfully: database.table"

3. Verify data is being synchronized:

       -- On MySQL, insert a test row
       INSERT INTO your_table (id, content) VALUES (99999, 'test replication');
       -- Wait a few seconds, then on MygramDB
       echo "GET your_table 99999" | nc localhost 3307
       # Should return the test row

4. Check for parsing errors:

       tail -f /var/log/mygramdb/mygramdb.log | grep "ERROR"
       # Should not show binlog parsing errors

5. Monitor binlog queue size:

       echo "SHOW STATUS LIKE 'binlog_queue%'" | nc localhost 3307
       # binlog_queue_size should remain low (not growing unbounded)

If issues arise, rollback to v1.3.1 is possible but NOT RECOMMENDED due to critical replication bugs:

    # Docker
    docker pull ghcr.io/libraz/mygram-db:v1.3.1
    # RPM
    sudo rpm -Uvh --oldpackage mygramdb-1.3.1-1.el9.x86_64.rpm

All deployments using MySQL replication (v1.3.0 or v1.3.1) must upgrade to v1.3.2 immediately.
Affected Scenarios:
- Replication Completely Broken - Critical
  - Any deployment with MySQL replication enabled
  - TABLE_MAP_EVENT parsing failures prevent table identification
  - ROWS_EVENT parsing failures prevent data synchronization
  - Risk: Complete replication breakdown, data not synchronized
- Silent Replication Failures - Critical
  - Events appear in binlog but are silently skipped
  - No error messages in some failure cases
  - Data inconsistency between MySQL and MygramDB
  - Risk: Silent data loss, difficult to diagnose
- MySQL 8.0 Compatibility - Critical
  - MySQL 8.0 ROWS_EVENT_V2 parsing broken
  - INSERT/UPDATE/DELETE operations not processed
  - Risk: Complete replication failure on MySQL 8.0
- Buffer Overrun Risk - High
  - Checksum boundary errors can cause buffer overrun
  - Data corruption risk from reading invalid memory
  - Risk: Undefined behavior, potential crashes
Non-Affected Scenarios:
- Deployments NOT using MySQL replication (direct data loading only) are NOT affected
- Query/search functionality continues to work normally
- Existing data in MygramDB is not affected
- Reduced Log Volume: Debug-level logging for routine events reduces log I/O overhead
- Better Diagnostics: Structured logging enables faster issue diagnosis
All fixes are transparent improvements with no performance degradation.
- Full Changelog: v1.3.1...v1.3.2
- Docker Image: ghcr.io/libraz/mygram-db
- Configuration Reference: docs/en/configuration.md
- Replication Management Guide: docs/en/replication_management.md
Questions or Issues?
- GitHub Issues: https://github.com/libraz/mygram-db/issues
- Documentation: docs/ directory
- Discussions: https://github.com/libraz/mygram-db/discussions
Recommended Version: v1.3.2 (for replication users), v1.3.1 (for non-replication users)
Release Tag: git tag -a v1.3.2 -m "MygramDB v1.3.2: Critical binlog parsing fixes for MySQL replication"