[BACKPORT 2024.2][#28166] CDC: Add a batch index to every single shard record

fourpointfour · fourpointfour · commit 61a604a46dd4 · 2025-08-21T08:26:12.000Z
Summary: **Backport description:** No merge conflicts were encountered.

After unique record IDs are generated, two records are compared on `commit_time`, `record_time`, `write_id`, etc. to determine which one comes first in sorted order. The check is performed in the method `CDCSDKUniqueRecordID::GreaterThanDistributedLSN`. When the GUC `yb_disable_transactional_writes` is set, multiple records can be inserted in a single `WRITE_OP` batch for a single shard transaction. Those records can end up with the same `commit_time`, `record_time`, `write_id` and `table_id`, in which case we fall back to comparing the primary keys.

The core issue is that the VWAL expects each individual tablet to send records in the order determined by `CDCSDKUniqueRecordID::GreaterThanDistributedLSN`. That expectation was being violated, which leads to data loss: multiple records inserted in the same batch of a single shard transaction can be lost.

The fix requires a mechanism that gives a reliable, fixed sort order when all the other parameters have the same value. This PR adds that mechanism by assigning a `write_id` to every single shard record, based on the record's index within the WAL record batch being processed. The tie is then broken on `write_id`, and the records can be sorted reliably without being filtered.
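The tie-breaking idea can be illustrated with a minimal, self-contained sketch. It is not the actual `CDCSDKUniqueRecordID` implementation: `RecordKey`, `ComesBefore`, `MakeBatch` and `IsSortedInBatchOrder` are hypothetical names, and the plain lexicographic comparison is an assumption that only approximates the ordering described above.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for the fields named in the summary.
struct RecordKey {
  uint64_t commit_time;
  uint64_t record_time;
  uint32_t write_id;
  std::string table_id;
  std::string primary_key;
};

// Assumed lexicographic ordering: primary_key is only consulted when
// every earlier field ties.
bool ComesBefore(const RecordKey& a, const RecordKey& b) {
  return std::tie(a.commit_time, a.record_time, a.write_id, a.table_id, a.primary_key) <
         std::tie(b.commit_time, b.record_time, b.write_id, b.table_id, b.primary_key);
}

// Builds keys for one batch whose primary keys are given in WAL order.
// With use_batch_index == false every record gets write_id 0 (old behavior);
// with true, write_id is the record's position in the batch (this fix).
std::vector<RecordKey> MakeBatch(const std::vector<std::string>& primary_keys,
                                 bool use_batch_index) {
  std::vector<RecordKey> batch;
  uint32_t idx = 0;
  for (const auto& pk : primary_keys) {
    batch.push_back({100, 100, use_batch_index ? idx : 0u, "t1", pk});
    ++idx;
  }
  return batch;
}

// True when the batch, in the order it was written, already satisfies the
// ordering the consumer expects.
bool IsSortedInBatchOrder(const std::vector<RecordKey>& batch) {
  return std::is_sorted(batch.begin(), batch.end(), ComesBefore);
}
```

In this sketch, a batch written as `k3`, `k1`, `k2` is out of order when every `write_id` is 0, because the comparison falls through to the primary key. Once `write_id` carries the batch index, the emitted order and the sorted order agree.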
Jira: DB-17813

Original commit: c397d2f / D45857

Test Plan: The tests that reproduce the error and validate the fix have been added to the logical replication connector's test suite in the following PR: yugabyte/debezium#182. Although the issue does not apply to the gRPC connector, a test is also being added there as a precaution: yugabyte/debezium-connector-yugabytedb#379

Reviewers: asrinivasan, sumukh.phalgaonkar, skumar

Reviewed By: sumukh.phalgaonkar

Subscribers: ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D46189
diff --git a/src/yb/cdc/cdcsdk_producer.cc b/src/yb/cdc/cdcsdk_producer.cc
@@ -1326,7 +1326,9 @@ Status PopulateCDCSDKWriteRecord(
   SchemaPackingStorage* schema_packing_storage = &schema_packing_storages->at(table_id);
   // TODO: This function and PopulateCDCSDKIntentRecord have a lot of code in common. They should
   // be refactored to use some common row-column iterator.
-  for (auto it = batch.write_pairs().cbegin(); it != batch.write_pairs().cend(); ++it) {
+  int record_batch_idx = 0;
+  for (auto it = batch.write_pairs().cbegin(); it != batch.write_pairs().cend();
+       ++it, ++record_batch_idx) {
     const yb::docdb::LWKeyValuePairPB& write_pair = *it;
     Slice key = write_pair.key();
     const auto key_size =
@@ -1442,7 +1444,7 @@ Status PopulateCDCSDKWriteRecord(
       row_message->set_table_id(table_id);
       row_message->set_primary_key(primary_key.ToBuffer());
       CDCSDKOpIdPB* cdc_sdk_op_id_pb = proto_record->mutable_cdc_sdk_op_id();
-      SetCDCSDKOpId(msg->id().term(), msg->id().index(), 0, "", cdc_sdk_op_id_pb);
+      SetCDCSDKOpId(msg->id().term(), msg->id().index(), record_batch_idx, "", cdc_sdk_op_id_pb);
       is_packed_row_record = false;
 
       // Check whether operation is WRITE or DELETE.