
Conversation

@Enjection
Owner

Problem:

  • Incremental backup supported indexes but restore did not
  • Index data was backed up but not restored

Implementation:

  1. Index Discovery and Restore:

    • Added DiscoverAndCreateIndexRestoreOperations() to scan __ydb_backup_meta/indexes
    • Implemented recursive traversal for nested table structures
    • Fixed path matching to avoid false positives (exact match or with '/' separator; see the sketch after this list)
    • Creates restore operations for each index impl table found
  2. Schema Version Synchronization:

    • Relies on CDC sync functions already present in main
    • Main's SyncImplTableVersion and SyncIndexEntityVersion handle version sync
    • No additional AlterTableIndex operations needed
  3. Testing:

    • Added 5 comprehensive test cases covering various scenarios
    • Tests verify both data integrity and index functionality
    • Direct index impl table verification for stronger assertions
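
As referenced in item 1, here is a minimal sketch of the '/'-boundary matching rule, assuming a standalone helper over util's TString (the actual check in the restore-scan code may be written inline and named differently):

#include <util/generic/string.h>

// Sketch: a path matches a table prefix only on exact equality or at a '/'
// boundary, so "/Root/table" does not falsely match "/Root/table2".
bool MatchesTablePrefix(const TString& path, const TString& prefix) {
    if (path == prefix) {
        return true;
    }
    return path.size() > prefix.size()
        && path.StartsWith(prefix)
        && path[prefix.size()] == '/';
}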

Files Modified:

  • schemeshard_impl.h: Function declarations for index discovery
  • schemeshard_incremental_restore_scan.cpp: Index discovery and restore logic
  • datashard_ut_incremental_backup.cpp: Test cases

This ensures indexes are properly restored and remain queryable.

Changelog entry

...

Changelog category

  • New feature
  • Experimental feature
  • Improvement
  • Performance improvement
  • User Interface
  • Bugfix
  • Backward incompatible change
  • Documentation (changelog entry is not required)
  • Not for changelog (changelog entry is not required)

Description for reviewers

...

@coderabbitai

coderabbitai bot commented Nov 17, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Fixed race condition in incremental backups with global indexes that previously caused SCHEME_CHANGED errors during restore operations.
    • Improved schema version synchronization for index implementation tables during backup and restore workflows.
    • Enhanced index handling to ensure consistent state across table, index, and implementation table metadata during incremental backup/restore cycles.
  • Tests

    • Added comprehensive test suite for incremental backup and restore scenarios with single and multiple indexes.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds CDC stream creation for main and index-impl tables during incremental backups, extends protobuf with index CDC metadata and OmitIndexes, implements version synchronization strategies (notably lock-free "helping"), introduces index-aware incremental restore and finalization phases, and adds extensive tests and design documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Design & Documentation: INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md, INCREMENTAL_BACKUP_INDEX_PROGRESS.md, VERSION_SYNC_PLAN.md, CDC_VERSION_SYNC_RESEARCH_SUMMARY.md, DATASHARD_VERSION_VALIDATION_ANALYSIS.md, cdc_version_sync_design.md, ss2.md, strategy_e_implementation_research.md | New and updated design/research/progress documents describing race root cause, multi-phase CDC backup strategy, version-sync approaches (A–E), Strategy E implementation plan, datashard validation, and testing/verification plans. |
| Protobuf Schema: ydb/core/protos/flat_scheme_op.proto | Added optional bool OmitIndexes = 47 to TTableDescription and map<string, TCreateCdcStream> IndexImplTableCdcStreams = 9 to TCopyTableConfig to carry per-index CDC config during copy operations. |
| Test Coverage: ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp | Added comprehensive incremental backup/restore tests covering tables with global indexes, multiple indexes, indexImplTable verification, OmitIndexes scenarios, and dynamic incremental backup path discovery. |
| Test Infrastructure: ydb/core/tx/datashard/ut_common/datashard_ut_common.h | Added #include <ydb/library/ut/ut.h> and extended the ExecSQL signature to accept an optional TTestContext testCtx for enhanced test diagnostics. |
| CDC Stream Backup Creation: ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp | Create CDC streams for main tables and index implementation tables during incremental backups, persist stream metadata into copy descriptors, refresh caches, and compute per-index PQ boundaries for part creation. |
| CDC Version Synchronization API: ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp, ydb/core/tx/schemeshard/schemeshard_cdc_stream_common.h | Added SyncIndexEntityVersion and SyncChildIndexes APIs and introduced a lock-free helping flow (HelpSyncSiblingVersions) to converge versions among parent table, index entity, and impl tables. |
| Consistent Copy Tables: ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp | Set OmitIndexes=true for delegated index processing, preserve and propagate per-index CDC stream metadata into index impl table copy descriptors, and add detailed logging for table/index copy decisions. |
| Copy Table Operation: ydb/core/tx/schemeshard/schemeshard__operation_copy_table.cpp | Honor OmitIndexes to skip impl-table copies in incremental CDC paths, persist AlterVersion using the source table object, and call SyncChildIndexes for CDC-backed copies; relax some source-path checks when allowed under the same operation. |
| Incremental Restore Scan & Index Discovery: ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp, ydb/core/tx/schemeshard/schemeshard_impl.h | Added index-aware restore discovery: DiscoverAndCreateIndexRestoreOperations, DiscoverIndexesRecursive, CreateSingleIndexRestoreOperation, and FindTargetTablePath to parallelize index restore ops and map backup-relative paths to targets. |
| Incremental Restore Finalization: ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp | Introduced a TConfigureParts sub-state and ConfigureParts phase to collect indexImplTables, bump and persist AlterVersions, and added SyncIndexSchemaVersions to align parent index versions during finalization. |
| Restore Path Validation: ydb/core/tx/schemeshard/schemeshard__operation_create_restore_incremental_backup.cpp | Conditionally skip the IsCommonSensePath() assertion for destination paths inside table index paths to allow restores into index-backed destinations while keeping NotUnderDeleting() checks. |

Sequence Diagram(s)

sequenceDiagram
    participant SchemeShard
    participant BackupOp as BackupCollection<br/>Operation
    participant CDC as CDC Stream<br/>Creation
    participant CopyOp as Copy Table<br/>Operation
    participant PQ as PQ Parts<br/>Creation

    BackupOp->>CDC: CreateCdcStream(main table)
    CDC-->>BackupOp: CDC stream for main table
    BackupOp->>CDC: CreateCdcStream(index impl tables)
    CDC-->>BackupOp: CDC streams for each index impl table
    BackupOp->>BackupOp: Persist CDC metadata to descriptor
    BackupOp->>CopyOp: CopyTableTask with OmitIndexes=true
    CopyOp->>CopyOp: Set CDC stream on main table copy
    CopyOp->>CopyOp: For each index impl table,<br/>set CDC stream from descriptor
    CopyOp-->>BackupOp: Copy operations configured
    BackupOp->>PQ: CreatePqParts(main CDC stream)
    BackupOp->>PQ: CreatePqParts(per-index CDC streams)
    PQ-->>BackupOp: PQ parts created
sequenceDiagram
    participant DataShard
    participant SchemeShard
    participant CdcOp as CDC Propose<br/>At Table
    participant IndexParent as Parent Index<br/>Entity
    participant ImplTable as Index Impl<br/>Table

    CdcOp->>DataShard: Propose CDC creation
    DataShard->>DataShard: Increment AlterVersion (self)
    DataShard->>SchemeShard: TEvProposeTransactionResult
    SchemeShard->>CdcOp: HandleReply
    CdcOp->>CdcOp: HelpSyncSiblingVersions(myVersion)
    CdcOp->>CdcOp: Collect related: index entity + impl table
    CdcOp->>CdcOp: Compute max version across all three
    CdcOp->>IndexParent: Update if behind
    CdcOp->>ImplTable: Update if behind
    CdcOp->>SchemeShard: Persist all changes in single txn
    CdcOp->>SchemeShard: Publish to SchemeBoard
sequenceDiagram
    participant RestoreOp as Incremental<br/>Restore Scan
    participant IndexDiscover as Index<br/>Discovery
    participant TableRestore as Table Restore<br/>Operation
    participant IndexRestore as Index Restore<br/>Operation
    participant Finalize as Incremental<br/>Restore Finalize

    RestoreOp->>RestoreOp: CollectTargetTablePaths
    RestoreOp->>RestoreOp: Also collect index impl tables
    RestoreOp->>IndexDiscover: DiscoverAndCreateIndexRestoreOperations
    IndexDiscover->>IndexDiscover: DiscoverIndexesRecursive(backup path)
    IndexDiscover->>IndexDiscover: Find target table mapping
    IndexDiscover->>IndexRestore: CreateSingleIndexRestoreOperation(per index)
    IndexRestore-->>RestoreOp: Index restore ops queued
    RestoreOp->>TableRestore: Create table restore ops (parallel)
    TableRestore-->>RestoreOp: Table restore ops queued
    RestoreOp->>Finalize: Coordinate finalization
    Finalize->>Finalize: TConfigureParts phase:<br/>collect index impl tables
    Finalize->>Finalize: Increment index AlterVersions
    Finalize->>Finalize: SyncIndexSchemaVersions

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing focused review:
    • schemeshard__operation_backup_backup_collection.cpp — CDC stream creation and metadata propagation to copy descriptors.
    • schemeshard__operation_common_cdc_stream.cpp & schemeshard_cdc_stream_common.h — Strategy E (HelpSyncSiblingVersions) correctness and idempotency.
    • schemeshard__operation_incremental_restore_finalize.cpp — TConfigureParts state transitions and SyncIndexSchemaVersions persistence.
    • schemeshard_incremental_restore_scan.cpp — Index discovery and mapping from backup paths to restore targets.
    • Test suite additions — ensure tests exercise race conditions and version-sync paths.

Poem

🐰 I hopped through streams of CDC,
Found index impls and set them free,
Versions helped their siblings grow,
No more races on the go,
Backup, restore — a tidy spree!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50%, which is below the required threshold of 80.00%. | Run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Implement incremental restore for global indexes' clearly and specifically describes the main change: adding incremental restore support for indexes, which aligns with the PR's stated objective and implementation. |
| Description check | ✅ Passed | The description is related to the changeset and provides a clear problem statement, implementation approach, and list of modified files, though it omits details from the extensive documentation and CDC/schema version synchronization changes. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/plan-backup-indexes-011CUsry2xyTV6Y17uNuNUZM



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (22)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (2)

219-236: FindIncrementalBackupDir helper is reasonable

The helper uses DescribePath with ReturnChildren=true and picks the first child ending with _incremental, which fits the test usage where only one incremental backup is expected per collection. Returning "" and asserting at call sites gives clear failures.

If you ever introduce scenarios with multiple incremental backups per collection, consider selecting the latest child (lexicographically max timestamp) instead of the first match.


511-630: SimpleBackupRestoreWithIndex provides strong end-to-end coverage

This test nicely validates:

  • Pre-backup index usability via VIEW idx.
  • Full backup & restore of a table with a global index.
  • Post-restore index behavior (query via view) and the index implementation table row count.

The additional version diagnostics using Ls are harmless and will help debugging if something regresses. The fixed sleeps (5s) are a bit heavy but acceptable in this suite.

ydb/core/tx/datashard/datashard_ut_common_kqp.h (1)

6-6: KqpSimpleExecSuccess helper and test context wiring

The new KqpSimpleExecSuccess wrapper plus #include <ydb/library/ut/ut.h> cleanly centralize “assert SUCCESS + format result” behavior for KQP tests. The TTestContext testCtx parameter is currently not passed into CTX_UNIT_ASSERT_VALUES_EQUAL_C, so this relies on TTestContext RAII semantics rather than an explicit context argument; if CTX_* macros actually expect a context parameter, please thread testCtx through, otherwise consider marking it explicitly unused to avoid warnings.

Also applies to: 186-192

ydb/core/tx/datashard/ut_common/datashard_ut_common.cpp (1)

2098-2104: ExecSQL now explicit about expected status and uses CTX_* assertions

Extending ExecSQL with an explicit Ydb::StatusIds::StatusCode code and switching to CTX_UNIT_ASSERT_VALUES_EQUAL_C makes the expectation clearer and integrates with the new test‑context machinery. As with the KQP helper, TTestContext testCtx is not passed to the macro, so this depends on RAII semantics in ut.h; please confirm that this is intentional and that all ExecSQL call sites have been updated (or consider adding a convenience overload that defaults code to SUCCESS).

Also applies to: 2111-2113

ydb/core/protos/flat_scheme_op.proto (1)

349-351: Proto extensions for index skipping and per‑index CDC look consistent

Adding optional bool OmitIndexes = 47 [default = false]; to TTableDescription and map<string, TCreateCdcStream> IndexImplTableCdcStreams = 9; to TCopyTableConfig is wire‑compatible and matches the incremental backup design (skip impl copies and track CDC streams per index impl table). It would be helpful to keep the semantics of OmitIndexes consistent with the similarly named flags in backup collection configs and to clearly document when IndexImplTableCdcStreams is expected to be filled (e.g., only for incremental backups with CDC).

Also applies to: 1288-1291
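
As a quick illustration (a sketch, not the actual call sites), the generated accessors for the two new fields would be exercised roughly like this:

#include <ydb/core/protos/flat_scheme_op.pb.h>

void Example() {
    // OmitIndexes is wire-compatible: absent on the wire reads as false.
    NKikimrSchemeOp::TTableDescription desc;
    desc.SetOmitIndexes(true);

    // Per-impl-table CDC configs travel in the copy config as a string-keyed map.
    NKikimrSchemeOp::TCopyTableConfig copyCfg;
    auto& streams = *copyCfg.MutableIndexImplTableCdcStreams();
    streams["indexImplTable"].SetTableName("indexImplTable");
}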

ydb/core/tx/schemeshard/schemeshard_cdc_stream_common.h (1)

13-14: Make dependencies for new CDC state helpers explicit

SyncIndexEntityVersion / SyncChildIndexes look fine, but this header now depends on NIceDb::TNiceDb and TPathElement::TPtr without declaring them locally. To reduce reliance on transitive includes, consider either:

  • Adding a lightweight forward declaration for namespace NIceDb { class TNiceDb; } and struct TPathElement;, or
  • Including the minimal headers that define these types (e.g. schemeshard_path_element.h / the TNiceDb declaration header).

This keeps the CDC state helpers robust against include-order changes.

Also applies to: 35-54

INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1)

73-97: Clarify what is implemented vs. exploratory in the progress log

This document is very detailed and useful, but sections like “Next Steps (To Be Implemented)” and “Phase 2 Analysis In Progress” now coexist with actual implementation work in this PR (e.g., index version synchronization on restore). It would help future readers if you:

  • Explicitly mark which parts of the “Proposed Solution: Version Synchronization” are implemented in code vs. still speculative, and
  • Optionally add a short “Current status” summary at the top, reflecting the behavior after this PR.

Also applies to: 98-102

ydb/core/tx/datashard/ut_common/datashard_ut_common.h (1)

15-16: ExecSQL test-context extension looks good

Adding ydb/library/ut/ut.h and extending ExecSQL with an optional TTestContext keeps existing callers working while enabling richer diagnostics in new tests. This is a sensible way to thread context into the helper.

If TTestContext ever becomes heavier to copy, you might switch the parameter to const TTestContext& testCtx = TTestContext() to avoid unnecessary copies, but for current unit-test usage this is purely optional.

Also applies to: 809-815

ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp (2)

171-213: Consider dialing back or gating verbose NOTICE logging in index copy path

The new LOG_NOTICE blocks around:

  • Table-level details (srcPath, dstPath, childrenCount, omitIndexes),
  • Table-index descriptions, and
  • Per-child decisions (deleted / sequence / non-index / index) and impl-table copies

are excellent for debugging incremental backup issues, but they will produce a lot of NOTICE-level chatter in environments that perform many consistent copy operations.

If this code path can be hot in production, consider either:

  • Lowering some of these to DEBUG level, or
  • Guarding the more verbose per-child logs behind a feature flag / dynamic log-component level.

Also applies to: 221-247, 263-267


268-284: Keying IndexImplTableCdcStreams only by impl-table leaf name may limit future flexibility

IndexImplTableCdcStreams is keyed by srcImplTableName (the impl table’s leaf name). This works today because:

  • Each index impl table under a given main table uses the same leaf name (e.g. indexImplTable), and
  • You currently configure identical CDC streams for all such impl tables.

However, if you ever need per-index CDC customization (different stream names or settings per index), this map will collapse multiple indexes to the same entry and you’ll lose the ability to distinguish them. In that case you’d likely want to key either by:

  • The full path under the main table (e.g. idxName/implTableName), or
  • The index path id.

No change is needed now, but it’s worth keeping in mind if the CDC configuration becomes index-specific.
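
If that need ever arises, the change is localized to key construction; a sketch, with a hypothetical idxName variable in scope:

// Hypothetical: qualify the map key with the index name so two indexes whose
// impl tables share the leaf name "indexImplTable" remain distinguishable.
const TString key = TStringBuilder() << idxName << "/" << srcImplTableName;
(*desc.MutableIndexImplTableCdcStreams())[key].CopyFrom(indexCdcStreamOp);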

ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp (1)

211-259: Index impl table CDC diagnostics and PQ-part creation are consistent with main-table flow

The added diagnostics after CDC creation (logging AlterVersions for main tables, indexes, and index impl tables) and the PQ-part creation for index impl table CDC streams mirror the existing behavior for main tables:

  • StreamImpl is created up-front (main and index impl tables),
  • AtTable is driven via CreateConsistentCopyTables using the stored CDC descriptors, and
  • PQ parts are created afterwards using per-table partition boundaries.

This symmetry makes debugging schema-version issues around indexes much easier and seems logically correct.

One minor optional improvement would be to reuse the stored TCreateCdcStream from IndexImplTableCdcStreams when building PQ parts instead of reconstructing a fresh struct with the same fields, to keep all CDC configuration for a given table in one place.

Also applies to: 303-360

ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (5)

238-261: Index impl tables added to finalize target list look functionally correct but stringly-typed

The new loop over Self->PathsById to add indexImplTable paths in CollectTargetTablePaths correctly filters by EPathStateIncomingIncrementalRestore and by parent table prefix, so it should only pull in index impl tables that are part of this restore. However, matching via pathString.Contains("/indexImplTable") and StartsWith(tablePath + "/") is fairly brittle: it relies on the naming convention and flat strings instead of the existing structural metadata (e.g., index path type + parent-child relationships).

If you want to harden this later, consider:

  • Using TPath/TPathElement to check that the path is indeed an index impl table (e.g., parent is EPathTypeTableIndex, child is table) instead of relying on the substring.
  • Optionally precomputing the relevant index impl table PathIds from the Indexes map or from the restore operations, then converting to strings, rather than scanning all PathsById for every finalize.

Functionally this is fine; the suggestion is about robustness and avoiding accidental matches if naming ever changes.
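
A sketch of the structural check, assuming the usual TPathElement predicates and the PathsById map (the helper name is hypothetical):

// An index impl table is a table whose immediate parent path element is a
// table index; no reliance on the "indexImplTable" naming convention.
bool IsIndexImplTablePath(const TPathElement::TPtr& path, TSchemeShard* ss) {
    if (!path->IsTable()) {
        return false;
    }
    const auto* parent = ss->PathsById.FindPtr(path->ParentPathId);
    return parent && (*parent)->IsTableIndex();
}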


626-648: FindTargetTablePath suffix matching is simple but can be ambiguous in deeply nested layouts

FindTargetTablePath maps an accumulated relative path to a full table path using:

if (itemPath == relativeTablePath || itemPath.EndsWith("/" + relativeTablePath))

This works for the common case where relativeTablePath is the DB-relative suffix (e.g. dir/table). However, in pathological layouts with overlapping suffixes (e.g. /Root/A/B/table and /Root/C/A/B/table) this could pick the “wrong” table depending on entry order, since both end with /A/B/table.

If such layouts are possible, consider:

  • Comparing against the precomputed DB-relative part you already derive via TrySplitPathByDb instead of raw itemPath, or
  • Normalizing both sides into split components and comparing vectors rather than string suffixes.

If the schema guarantees uniqueness of these suffixes, this is fine; otherwise it’s a subtle corner case worth double-checking.
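
For illustration, a sketch of the first suggestion (names and signatures approximate): precompute each candidate's DB-relative part once, via TrySplitPathByDb or equivalent, then require exact equality rather than a raw string-suffix match, which removes the overlapping-suffix case.

#include <util/generic/string.h>
#include <util/generic/vector.h>
#include <utility>

// items: (fullPath, dbRelativePath) pairs with the DB-relative part
// precomputed by the caller; returns the unambiguous full path, or empty.
TString FindTargetTablePathUnambiguous(const TVector<std::pair<TString, TString>>& items,
                                       const TString& relativeTablePath) {
    for (const auto& [fullPath, dbRelative] : items) {
        if (dbRelative == relativeTablePath) {
            return fullPath;
        }
    }
    return {};
}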


650-696: Recursive index discovery assumes metadata tree structure; add defensive checks if structure can vary

DiscoverIndexesRecursive walks the __ydb_backup_meta/indexes subtree and:

  • Treats accumulatedRelativePath as a table-relative path and calls FindTargetTablePath on it.
  • On match, assumes currentPath.Base()->GetChildren() are index names.
  • Otherwise, recurses into all children, extending accumulatedRelativePath.

This is correct if the metadata tree mirrors DB-relative table paths and the per-table node directly contains index nodes. If future metadata gains extra layers (e.g. other markers under the table node), CreateSingleIndexRestoreOperation might get invoked on non-index children.

If the layout isn’t strictly guaranteed, it would be safer to:

  • Filter children by expected type or naming before calling CreateSingleIndexRestoreOperation, or
  • Encode index metadata (e.g. a marker file node) and filter on that instead of blindly iterating all children.

Right now this is structurally sound but a bit optimistic about the metadata tree shape.


698-739: Index metadata discovery is tolerant, but lack of index meta is silently treated as “no indexes”

DiscoverAndCreateIndexRestoreOperations:

  • Respects OmitIndexes (good).
  • Resolves {bcPath}/{backupName}_incremental/__ydb_backup_meta/indexes and, if missing, logs and returns.

This means a mismatch between OmitIndexes=false and missing indexes dir is treated equivalently to “no indexes were backed up”. That’s likely fine, but if you want stricter observability you could:

  • Log at LOG_W when omitIndexes == false but indexMetaPath is unresolved, or
  • Optionally track a counter/flag in TIncrementalRestoreState that “indexes metadata missing” for diagnosing misconfigured backups.

Behavior-wise, this is safe; suggestion is only for diagnosability.


741-875: CreateSingleIndexRestoreOperation overall looks solid; only minor invariants worth calling out

The function correctly:

  • Validates the target table resolves and is a table.
  • Locates the index by name under the table, verifies it’s a TableIndex and EIndexTypeGlobal.
  • Enforces a single-child impl table, and gracefully skips unusual structures.
  • Builds a per-index ESchemeOpRestoreMultipleIncrementalBackups op, wires it into IncrementalRestoreOperationToState and TxIdToIncrementalRestore, and populates ExpectedShards/InvolvedShards from the impl table’s partitions.

Minor points to consider:

  1. Assumption that every GetChildren()-listed pathId is present in PathsById and Tables

    • PathsById.at(childPathId) and Tables.at(indexImplTablePathId) will Y_ABORT if metadata is temporarily inconsistent (e.g., index in the middle of drop/create).
    • If incremental restore can overlap with DDL on the same table/index, you may want FindPtr/contains checks and a warning-based early return instead of assuming perfect consistency.
  2. Handling missing table info for impl table

    • When Tables lacks indexImplTablePathId, you still create and send the restore request but never populate ExpectedShards for that op.
    • Depending on how AllShardsComplete() is implemented, this could either treat the op as “already complete” (0 expected shards) or never complete if responses arrive.
    • If such inconsistent states are possible, consider bailing out before creating the op when Tables has no entry for the impl table, or at least logging at LOG_W and not adding the op to InProgressOperations.
  3. Source path construction assumes {indexes}/{relativeTablePath}/{indexName}

    • This matches the new discovery code; if metadata layout ever changes, both must be updated in lockstep.

Functionally, this looks correct for the intended invariants; the above is mostly about making failure modes clearer and avoiding hard aborts.

ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (3)

115-173: SyncImplTableVersion now persists and publishes versions; check for double-persist and invariants

The extended SyncImplTableVersion:

  • Derives targetVersion as max(parentTable.AlterVersion, index.AlterVersion) (when index exists).
  • Adjusts the impl table’s AlterVersion to targetVersion (or +1 if already ahead).
  • Persists the new version via PersistTableAlterVersion and publishes to SchemeBoard.

This is appropriate when invoked from UpdateTableVersion after an actual schema operation (e.g., CDC-related alter) has already gone through datashards.

Two things to double-check:

  1. Double persistence with TProposeAtTable::HandleReply

    • After UpdateTableVersion returns, HandleReply still calls PersistTableAlterVersion(db, pathId, table) unconditionally.
    • For impl tables, that’s harmless but redundant. If PersistTableAlterVersion does any heavier work (indexes, counters), you might consider guarding the second call or documenting the intentional duplication.
  2. Impl table ahead of parent

    • In the currentImplVersion > targetVersion branch, you increment impl version again, so impl stays ahead of parent/index.
    • That maintains monotonicity but violates “impl version >= parent/index version” if you wanted strict alignment.
    • If strict alignment is a requirement, you may want to set table->AlterVersion = currentImplVersion and then sync index entity to that, instead of bumping further.

Semantically this looks safe, but verifying expectations around parent vs impl version ordering would be good.
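
For reference, the catch-up rule as described above, in sketch form (variable names approximate; Max is from util/generic/utility.h):

// Impl table catches up to the max of the parent table's and index entity's
// versions; if it is already ahead, it is bumped once more so the sequence
// stays monotonic (per the review's reading of the code).
const ui64 targetVersion = Max(parentTable->AlterVersion, indexInfo->AlterVersion);
if (table->AlterVersion > targetVersion) {
    table->AlterVersion += 1;             // impl already ahead: keep moving forward
} else {
    table->AlterVersion = targetVersion;  // catch up, never regress
}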


175-248: UpdateTableVersion logic is sound but may call SyncChildIndexes twice for continuous backup streams

UpdateTableVersion now distinguishes two cases:

  • Index impl table that is part of a continuous backup: calls SyncImplTableVersion, then SyncIndexEntityVersion, then SyncChildIndexes on the grandparent (main table).
  • All other cases: increments the table’s AlterVersion, and if the table has a continuous backup stream (HasParentContinuousBackup(versionCtx.PathId, context)), calls SyncChildIndexes to align child indexes.

This is consistent with the intent: keep impl tables, index entities, and sibling indexes in sync when continuous backup is active.

However, in TProposeAtTable::HandleReply you also have:

if (versionCtx.IsContinuousBackupStream && !versionCtx.IsIndexImplTable) {
    NCdcStreamState::SyncChildIndexes(path, table->AlterVersion, OperationId, context, db);
}

For operations on a continuous backup stream:

  • UpdateTableVersion’s “main table with continuous backup” path will call SyncChildIndexes once.
  • The IsContinuousBackupStream branch above will call it a second time with the same targetVersion.

Because SyncIndexEntityVersion is monotonic (if (targetVersion > oldIndexVersion)), the second pass is effectively a no-op but still iterates all children and logs.

If you want to avoid the duplicate work, you can either:

  • Drop the HasParentContinuousBackup branch for the “main table with continuous backup” inside UpdateTableVersion and rely on the explicit IsContinuousBackupStream branch, or
  • Add a flag to the TTableVersionContext indicating that child indexes were already synced during UpdateTableVersion.

Functionally correct as is; this is a small efficiency/clarity nit.


318-368: SyncChildIndexes correctly avoids impl-table bumps; assumes parentPath has all child indexes loaded

The new SyncChildIndexes:

  • Iterates parentPath->GetChildren(), filters to IsTableIndex() and not Dropped().
  • Uses SyncIndexEntityVersion to bump index entity versions to targetVersion.
  • Explicitly does not touch index impl table versions (with a clear NOTE), to avoid SCHEME_CHANGED errors from bumping versions without a datashard schema TX.
  • Logs entry/exit and per-index processing.

This matches the desired semantics: keep index entities in step with their parent’s AlterVersion, while leaving impl tables to be synchronized by actual schema operations (SyncImplTableVersion/alter flows).

Only assumption is that parentPath->GetChildren() and PathsById[childPathId] are always consistent. If there’s any chance of transient inconsistency (e.g. partially dropped index), consider guarding the PathsById.at(childPathId) with a contains check to avoid hard aborts.

ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp (3)

40-173: TConfigureParts correctly prepares per-shard ALTERs for index impl tables; minor robustness nits

TConfigureParts::ProgressState:

  • Collects target index impl tables via CollectIndexImplTables(finalize, ...).
  • If none are found, logs and returns true, making ConfigureParts a no-op.
  • For each impl table, ensures AlterData exists (creating minimal TAlterTableInfo when necessary) and bumps AlterData->AlterVersion by 1.
  • Adds all partitions’ shard indices to txState->Shards as ConfigureParts operations and persists txState.
  • For each shard, finds the owning impl table, builds an ALTER TABLE body, and proposes it to the corresponding datashard, then updates ShardsInProgress.

Overall this is a solid pattern and mirrors other “ConfigureParts” flows in Schemeshard.

Minor improvements to consider:

  1. Shard-to-table mapping loop

    • For every shard you linearly scan all implTablesToUpdate and each table’s partitions to find tablePathId. For many tables/shards this is O(N²).
    • If this state can be large, consider precomputing a THashMap<TShardIdx, TPathId> once before the shard loop.
  2. Table existence checks

    • You correctly guard Tables.contains(tablePathId) and log if missing, but later in the shard loop you call context.SS->Tables.at(pathId) without re-check.
    • Today implTablesToUpdate only contains IDs that passed the earlier Tables.contains check, but if tables can disappear between these stages, consider either:
      • Building and using a local map of implTablePathId -> TTableInfo::TPtr while collecting, or
      • Re-guarding with contains in the shard loop to avoid Y_ABORT.

Functionally, this should work fine for typical sizes; the suggestions are about scaling and robustness.
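
A sketch of the precomputation from point 1, assuming TTableInfo::GetPartitions() exposes ShardIdx the way other schemeshard code uses it:

// Build the shard -> impl-table map once before the shard loop, so each shard
// lookup is O(1) instead of rescanning every table's partition list.
THashMap<TShardIdx, TPathId> shardToImplTable;
for (const TPathId& tablePathId : implTablesToUpdate) {
    const auto& tableInfo = context.SS->Tables.at(tablePathId);
    for (const auto& partition : tableInfo->GetPartitions()) {
        shardToImplTable[partition.ShardIdx] = tablePathId;
    }
}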


175-203: CollectIndexImplTables shares assumptions with scan-side collection; looks consistent

CollectIndexImplTables:

  • Filters finalize.GetTargetTablePaths() by Contains("/indexImplTable").
  • Resolves the path, checks it’s a Table and present in Tables, and then inserts its PathId into implTables.

This matches how CollectTargetTablePaths in the scan code constructs target paths for index impl tables, so the two sides are consistent. If the naming convention for impl tables ever changes, both places will need to be updated together; you might eventually want a shared helper for “is index impl table path” to avoid divergence.


271-338: SyncIndexSchemaVersions finalizes impl-table alters and bumps index entities; consider reusing CDC helpers

SyncIndexSchemaVersions:

  • Iterates finalize.TargetTablePaths, filtering to paths containing /indexImplTable.
  • For each resolved impl table with non-null AlterData:
    • Calls FinishAlter, persists via PersistTableAltered, clears caches, and publishes.
    • Locates the parent index path and unconditionally ++s its AlterVersion, then persists and publishes.

The flow ensures that:

  • Impl tables, for which ConfigureParts prepared AlterData->AlterVersion = old + 1, actually commit that version bump.
  • Corresponding index entities also advance their AlterVersion, keeping metadata in sync.

Two points worth a closer look:

  1. Unconditional index AlterVersion increment

    • This always does oldVersion + 1 regardless of the impl-table’s new version.
    • Elsewhere (NCdcStreamState::SyncIndexEntityVersion), you sync to a target version and avoid regressions.
    • If other operations may have bumped the index version between ConfigureParts and Finalize, you might end up with index AlterVersion != table AlterVersion.
    • To keep semantics consistent, consider using SyncIndexEntityVersion(indexPathId, table->AlterVersion, OperationId, context, db) here instead of manually incrementing.
  2. Missing AlterData on impl table

    • If for some reason an impl table reaches finalization without AlterData, you log a warning and skip it.
    • That leaves its schema version unbumped while its index entity might still get bumped (if handled elsewhere).
    • If that’s an unexpected state, you may want to either:
      • Treat it as a hard error (or at least warn at a higher log level), or
      • Recompute a minimal AlterData similar to TConfigureParts instead of skipping.

No blockers here; the behavior is consistent with the new ConfigureParts stage, but reusing the CDC sync helpers would tighten invariants and reuse the “no downgrade” logic.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd5e49e and 010417d.

📒 Files selected for processing (19)
  • INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1 hunks)
  • INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1 hunks)
  • VERSION_SYNC_PLAN.md (1 hunks)
  • ydb/core/protos/flat_scheme_op.proto (2 hunks)
  • ydb/core/tx/datashard/datashard_ut_common_kqp.h (2 hunks)
  • ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (3 hunks)
  • ydb/core/tx/datashard/ut_common/datashard_ut_common.cpp (1 hunks)
  • ydb/core/tx/datashard/ut_common/datashard_ut_common.h (2 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp (4 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (7 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp (4 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_copy_table.cpp (4 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_create_restore_incremental_backup.cpp (2 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp (5 hunks)
  • ydb/core/tx/schemeshard/schemeshard_cdc_stream_common.h (2 hunks)
  • ydb/core/tx/schemeshard/schemeshard_impl.h (1 hunks)
  • ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (2 hunks)
  • ydb/core/util/source_location.h (1 hunks)
  • ydb/library/ut/ut.h (1 hunks)
🔇 Additional comments (24)
ydb/core/util/source_location.h (1)

34-36: Dummy function_name() looks correct

The fallback TSourceLocation::function_name() mirrors the std::source_location API and unblocks callers like FormatLocation() without changing behavior (empty string in dummy mode). No issues here.

VERSION_SYNC_PLAN.md (1)

1-288: Helpful internal design/diagnostic plan

This document clearly captures the schema-version mismatch investigation and possible fixes; since it’s non-code, it’s safe to land as-is.

INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1)

1-632: Design summary is thorough and non-intrusive

The incremental-backup/index summary is detailed and aligns with the surrounding code, while remaining purely informational. Safe to keep as project documentation.

ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (6)

3091-3161: Improved OmitIndexesIncrementalBackup assertions look good

Switching to FindIncrementalBackupDir plus DescribePath:

  • Ensures the incremental backup table actually exists and has __ydb_incrBackupImpl_changeMetadata.
  • Explicitly verifies that __ydb_backup_meta is absent when OMIT_INDEXES = 'true'.

This makes the test much more robust against layout changes and aligns with the intended semantics of OMIT_INDEXES.


3162-3274: BasicIndexIncrementalRestore correctly validates single-index restore

The test covers:

  • Table with one global index.
  • Full backup + incremental backup (insert + update).
  • Data equality after restore and correct index query behavior.
  • Direct verification of the index implementation table contents and row count.

The expectations (4 final index rows, including updated value) match the described operations. Looks solid.


3276-3395: MultipleIndexesIncrementalRestore gives good multi-index coverage

This scenario exercises three global indexes on one table, with an incremental backup adding a new row. After restore, it verifies:

  • Base table data.
  • Each index via VIEW queries.
  • Each index impl table’s row count and sample contents.

The checks line up with the mutation pattern (4 rows total) and should catch regressions in multi-index restore logic.


3397-3509: IndexDataVerificationIncrementalRestore thoroughly checks index state

This test is valuable because it:

  • Uses a multi-shard table with a global index.
  • Applies mixed updates, inserts, and deletes before the incremental backup.
  • Compares index query results before/after restore and inspects the index impl table contents and row count.

The asserted presence/absence of specific rows (Alice/Charlie vs Eve/Bob/David/Frank) matches the sequence of operations and should strongly guard index correctness across shards.


3511-3632: MultipleIncrementalBackupsWithIndexes validates multi-step history

This test exercises three incremental backups over a table with a global index, mixing inserts, updates, and deletes, then:

  • Compares final table and index view results before/after restore.
  • Asserts that obsolete index entries (100, 200) are gone and only the final (250, 300, 400) remain.
  • Checks index impl row count.

The expectations align with the described mutations and will catch regressions in replaying multiple incremental backups with indexes.


3634-3765: MultipleTablesWithIndexesIncrementalRestore covers multi-table/index restore

For two tables each with their own global index, this test:

  • Runs full + incremental backups.
  • Restores and verifies both tables’ data.
  • Validates index queries on each table.
  • Checks that both index impl tables have the expected row counts and sample data.

This is a good cross-table coverage addition and matches the mutation pattern (3 rows per table).

ydb/library/ut/ut.h (2)

8-38: Test context abstraction is straightforward

NUnitTest::TTestContextBase and TTestContext cleanly wrap source-location into a format string, and now rely on TSourceLocation::function_name() which is available in both real and dummy implementations. No issues here.


41-91: Fix concepts usage and testCtx detection in CTX macros

Two related issues in CTX_UNIT_FAIL_IMPL:

  1. Missing <concepts> include

    The macro uses std::derived_from and std::convertible_to, which are declared in <concepts>. Relying on transitive includes is brittle and may fail in some TUs.

  2. testCtx referenced outside a requires expression

    The condition:

    if constexpr (requires { testCtx; } &&
                  std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase> &&
                  requires { { testCtx.Format() } -> std::convertible_to<TString>; })

    evaluates std::derived_from<decltype(testCtx), ...> in a normal constant-expression context. In any use-site where testCtx is not declared, decltype(testCtx) is ill-formed and the macro will fail to compile, instead of cleanly falling back to __LOCATION__/__PRETTY_FUNCTION__ as intended.

    You can fix both by:

    • Including <concepts>.
    • Moving all uses of testCtx into a single requires-expression so the check is SFINAE-friendly.

A concrete fix could look like:

-#include <library/cpp/testing/unittest/registar.h>
-
-#include <ydb/core/util/source_location.h>
+#include <library/cpp/testing/unittest/registar.h>
+#include <ydb/core/util/source_location.h>
+#include <concepts>

@@
-#define CTX_UNIT_FAIL_IMPL(R, M)                                                                    \
-    do {                                                                                            \
-        ::TStringBuilder locationInfo;                                                              \
-        if constexpr (requires { testCtx; } &&                                                      \
-                      std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase> &&        \
-                      requires { { testCtx.Format() } -> std::convertible_to<TString>; })           \
-        {                                                                                           \
-            locationInfo << testCtx.Format();                                                       \
-        } else {                                                                                    \
-            locationInfo << __LOCATION__ << ", " << __PRETTY_FUNCTION__;                            \
-        }                                                                                           \
-        ::NUnitTest::NPrivate::RaiseError(R, ::TStringBuilder() << R << " at " << locationInfo << ": " << M, true); \
-    } while (false)
+#define CTX_UNIT_FAIL_IMPL(R, M)                                                                    \
+    do {                                                                                            \
+        ::TStringBuilder locationInfo;                                                              \
+        if constexpr (requires {                                                                    \
+                              testCtx;                                                              \
+                              requires std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase>; \
+                              { testCtx.Format() } -> std::convertible_to<TString>;                 \
+                          })                                                                        \
+        {                                                                                           \
+            locationInfo << testCtx.Format();                                                       \
+        } else {                                                                                    \
+            locationInfo << __LOCATION__ << ", " << __PRETTY_FUNCTION__;                            \
+        }                                                                                           \
+        ::NUnitTest::NPrivate::RaiseError(R, ::TStringBuilder() << R << " at " << locationInfo << ": " << M, true); \
+    } while (false)

This keeps the behavior the same when a compatible testCtx is in scope, but ensures that in call sites without testCtx the if constexpr condition simply evaluates to false and the macro falls back to the legacy location info. Please also verify that your toolchain is built with C++20 concepts enabled so <concepts> and these constraints are available.

ydb/core/tx/schemeshard/schemeshard__operation_create_restore_incremental_backup.cpp (1)

420-426: Allow restore into index impl/private tables while keeping normal path hygiene

Conditionally skipping IsCommonSensePath() when dstTablePath.IsInsideTableIndexPath(false) looks correct for enabling restores into index implementation tables while preserving the existing validation for regular user paths (NotEmpty, IsTable, NotUnderDeleting, etc.). Please double‑check that only internal incremental‑restore flows can target these private paths and that tests cover both cases: destination as a regular table and as an index impl table (created both inside and outside the same op).

Also applies to: 566-572

ydb/core/tx/schemeshard/schemeshard__operation_copy_table.cpp (3)

227-231: Persisting source AlterVersion and syncing child indexes for CDC‑backed copies

Bumping srcTable->AlterVersion, persisting it via PersistTableAlterVersion, and then calling NCdcStreamState::SyncChildIndexes when CdcPathId is set ensures that a CDC‑initiated copy keeps the source table and its child indexes in version sync. This looks correct and nicely centralizes the “update all index entities” step; just confirm that SyncChildIndexes is safe (no‑op) for tables without child indexes and that it uses the same db transaction to avoid partial version updates.


399-405: Relaxing source path “same‑operation” checks via AllowUnderSameOperation

Conditioning NotUnderTheSameOperation/NotUnderOperation on Transaction.GetCreateTable().GetAllowUnderSameOperation() allows internal multi‑step flows (like incremental backup/restore or consistent copying) to reuse TCopyTable even when the source path is already involved in the same top‑level tx, while still enforcing the full safety checks for normal users. This is reasonable, but please make sure that all callers that rely on this relaxation explicitly set AllowUnderSameOperation, and that there are no external/user‑visible code paths where a table is both being altered and copied in the same operation without a well‑defined snapshot.


808-810: Propagating OmitIndexes and skipping impl table copy sub‑operations

Passing copying.GetOmitIndexes() into the nested CreateTable schema and then, in the index loop, always creating the index metadata but skipping impl table CreateTable/CopyTable sub‑operations when OmitIndexes is true matches the idea that index data will be handled separately (e.g., via CDC‑driven consistent copies). The structure of the loop remains the same for the OmitIndexes == false case, so existing CopyTable behavior is preserved; please verify that in the OmitIndexes == true flows there is a corresponding path that creates/restores index implementation tables before any reads rely on them.

Also applies to: 866-869

ydb/core/tx/schemeshard/schemeshard_impl.h (1)

1198-1226: Index-restore helper declarations are consistent with existing incremental-restore API

The new helpers for index discovery and restore (DiscoverAndCreateIndexRestoreOperations, DiscoverIndexesRecursive, CreateSingleIndexRestoreOperation, FindTargetTablePath) fit well with the surrounding incremental-restore state and event handlers; signatures and ownership semantics look reasonable.

ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp (1)

23-35: Index handling split between descriptor and inner CreateTable looks correct

Setting operation->SetOmitIndexes(true) in CopyTableTask while still honoring descr.GetOmitIndexes() in CreateConsistentCopyTables is a good separation of concerns:

  • The descriptor’s OmitIndexes continues to control whether indexes are processed at all in the explicit child loop.
  • The inner CreateTable OmitIndexes flag prevents CreateCopyTable from re-walking children and duplicating index work, which is now handled explicitly (including CDC wiring).

This matches the new CDC-aware index flow and shouldn’t affect callers outside CreateConsistentCopyTables, since CopyTableTask is local to this TU.

Also applies to: 134-170

ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp (1)

66-73: Effective omitIndexes flag and main-table CDC setup look correct

  • omitIndexes combining collection-level OmitIndexes with incremental-config OmitIndexes is a sensible “effective” flag for this backup run.
  • Wiring this into desc.SetOmitIndexes(omitIndexes) lets CreateConsistentCopyTables decide whether to process indexes at all, while CopyTableTask ensures CreateTable itself doesn’t recurse into them.
  • For incremental backups, creating a StreamImpl CDC stream on the main table once, and then embedding CreateSrcCdcStream into the descriptor so that the copy phase can perform the AtTable step, matches the intended three-phase CDC strategy.

Also applies to: 87-96, 100-132

ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1)

612-621: Parallel index restore ops are wired into the global incremental restore state as expected

The call to DiscoverAndCreateIndexRestoreOperations from CreateIncrementalRestoreOperation uses the same operationId, backupName, and bcPath as table restores and wires index operations into IncrementalRestoreStates via IncrementalRestoreOperationToState/TxIdToIncrementalRestore in CreateSingleIndexRestoreOperation. That keeps index impl restores tracked uniformly with table restores and allows shard progress/finalization to work without special cases.

No correctness issues spotted in this integration.

ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (2)

253-316: SyncIndexEntityVersion wrapper looks correct; monotonic update prevents version regressions

The new public-style SyncIndexEntityVersion:

  • Checks Indexes.contains(indexPathId) and early-returns if the index doesn’t exist.
  • Only updates AlterVersion if targetVersion > oldIndexVersion, preventing downgrades during races.
  • Persists via PersistTableIndexAlterVersion, clears caches, and publishes to SchemeBoard with the supplied operationId.

This is the right behavior for callers like UpdateTableVersion and index-finalization logic that need to “catch up” index metadata to a table/impl version without risking rollback.

No issues spotted; just ensure the declaration in schemeshard_cdc_stream_common.h matches this signature exactly (including NIceDb::TNiceDb&).


466-471: Version update + SyncChildIndexes sequencing in TProposeAtTable::HandleReply

In TProposeAtTable::HandleReply you now:

  1. Build versionCtx and call UpdateTableVersion(versionCtx, table, OperationId, context, db).
  2. If versionCtx.IsContinuousBackupStream && !versionCtx.IsIndexImplTable, call SyncChildIndexes with the new table->AlterVersion.
  3. Persist table alter version, clear caches, publish, update tx state.

This ordering is sensible: the parent table’s AlterVersion is finalized before child index entities are synced, and all metadata is persisted and published afterwards.

Given that UpdateTableVersion already covers some continuous-backup paths (main table and impl tables), the explicit IsContinuousBackupStream check here primarily covers the CDC-stream-specific case. The only caveat is the potential double SyncChildIndexes discussed earlier; apart from that, the flow looks good.

ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp (3)

13-38: State machine extension to ConfigureParts is coherent

Changing NextState to:

  • Waiting -> ConfigureParts
  • ConfigureParts -> Propose
  • Propose -> Done

and mapping both Waiting and ConfigureParts to TConfigureParts in SelectStateFunc makes sense—the first entry into the operation runs ConfigureParts, and replays from persisted Waiting state will also land there.

This nicely inserts an index-related ALTER stage before finalization without disturbing the rest of the lifecycle.


224-268: Calling SyncIndexSchemaVersions before releasing path states is the right place

In TFinalizationPropose::ProgressState, you now:

  1. Retrieve the finalize proto.
  2. Call SyncIndexSchemaVersions(finalize, context) to finalize impl-table alters and bump index entity versions.
  3. Only then release affected path states and perform cleanup.

This ordering ensures:

  • Data shards have applied the ALTER from the ConfigureParts stage.
  • Schemeshard’s in-memory and persisted metadata (impl tables + indexes) are consistent before paths transition out of EPathStateIncomingIncrementalRestore.

That sequencing looks correct and avoids exposing “half-bumped” schema versions to clients.


374-394: Final cleanup respects shard progress and state; no issues

PerformFinalCleanup:

  • Marks the incremental restore state as Completed (without immediately deleting the in-memory entry).
  • Erases the corresponding entry from LongIncrementalRestoreOps.
  • Calls CleanupMappings to remove all TxIdToIncrementalRestore and IncrementalRestoreOperationToState mappings for this operationId.

Combined with earlier deletion of IncrementalRestoreShardProgress and IncrementalRestoreState rows, this leaves no dangling bookkeeping for the restore while still allowing high-level operation listings to reflect completion as intended.

Looks consistent with the rest of the incremental restore machinery.

Comment on lines +133 to +181
    // Create CDC StreamImpl for index implementation tables (before copying starts)
    // Store CDC info in the descriptor so CreateConsistentCopyTables can create AtTable and PQ parts
    // Only do this for incremental backups
    if (incrBackupEnabled && !omitIndexes) {
        const auto tablePath = sPath;

        // Iterate through table's children to find indexes
        for (const auto& [childName, childPathId] : tablePath.Base()->GetChildren()) {
            auto childPath = context.SS->PathsById.at(childPathId);

            // Skip non-index children (CDC streams, etc.)
            if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex) {
                continue;
            }

            // Skip deleted indexes
            if (childPath->Dropped()) {
                continue;
            }

            // Get index info and filter for global sync only
            auto indexInfo = context.SS->Indexes.at(childPathId);
            if (indexInfo->Type != NKikimrSchemeOp::EIndexTypeGlobal) {
                continue;
            }

            // Get index implementation table (the only child of index)
            auto indexPath = TPath::Init(childPathId, context.SS);
            Y_ABORT_UNLESS(indexPath.Base()->GetChildren().size() == 1);
            auto [implTableName, implTablePathId] = *indexPath.Base()->GetChildren().begin();

            auto indexTablePath = indexPath.Child(implTableName);

            // Create CDC stream on index impl table (before copying starts)
            NKikimrSchemeOp::TCreateCdcStream indexCdcStreamOp;
            // Set table name to just the name since we're passing the full path as tablePath parameter
            indexCdcStreamOp.SetTableName(implTableName);
            auto& indexStreamDescription = *indexCdcStreamOp.MutableStreamDescription();
            indexStreamDescription.SetName(streamName);
            indexStreamDescription.SetMode(NKikimrSchemeOp::ECdcStreamModeUpdate);
            indexStreamDescription.SetFormat(NKikimrSchemeOp::ECdcStreamFormatProto);

            NCdc::DoCreateStreamImpl(result, indexCdcStreamOp, opId, indexTablePath, false, false);

            // Store CDC stream info in the descriptor's map
            // Key is the impl table name, value is the CDC stream config
            (*desc.MutableIndexImplTableCdcStreams())[implTableName].CopyFrom(indexCdcStreamOp);
        }
    }

⚠️ Potential issue | 🟡 Minor

Fix inverted condition when invalidating caches for index impl tables

In the “Also invalidate cache for index impl tables” block:

for (const auto& [childName, childPathId] : sPath.Base()->GetChildren()) {
    auto childPath = context.SS->PathsById.at(childPathId);
    if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {
        auto indexInfo = context.SS->Indexes.find(childPathId);
        ...

the childPath->PathType != EPathTypeTableIndex check prevents the body from running for actual index children, so the subsequent lookup in Indexes is effectively dead. To actually invalidate caches for index impl tables, the condition should be the opposite:

-    if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {
+    if (childPath->PathType == NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {

Without this fix, describe-path caches for index impl tables won’t be cleared or published here, which can delay metadata convergence, even though other paths may eventually refresh them.

Also applies to: 183-203

🤖 Prompt for AI Agents
ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp
around lines 133-181 and 183-203: the loop that should invalidate caches for
index implementation tables uses an inverted condition (checks
childPath->PathType != EPathTypeTableIndex), which skips real index children and
makes the subsequent Indexes lookup dead; change the condition to require
childPath->PathType == NKikimrSchemeOp::EPathTypeTableIndex and ensure you still
skip dropped children (childPath->Dropped()), then proceed to find the index in
context.SS->Indexes by childPathId, get the implementation table child, and
invalidate/publish the describe-path cache for that impl table as originally
intended.

@Enjection force-pushed the claude/plan-backup-indexes-011CUsry2xyTV6Y17uNuNUZM branch from 010417d to f3e9a0b on November 20, 2025 05:53
@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1)

1-260: This appears to be intermediate investigation documentation—clarify purpose and consolidate with related files.

This file thoroughly documents the investigation journey, but it reads as working notes rather than finalized documentation. It includes failed attempts (#1 and #2) and indicates Phase 2 analysis is "in progress." Several points need clarification:

  1. Are the issues documented here (race conditions, version mismatches) actually fixed in this PR's implementation, or are they still pending?
  2. The "Next Steps (To Be Implemented)" section on line 75 proposes version synchronization—has this been implemented in the code, or is it future work?
  3. This file significantly overlaps with VERSION_SYNC_PLAN.md and INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (all cover similar root causes and solution attempts).

Recommendations:

  • Consolidate these three investigation documents into a single IMPLEMENTATION_STATUS.md or INVESTIGATION_NOTES.md with clear sections for: Problem → Root Cause → Attempted Solutions → Final Resolution → Remaining Issues.
  • Add explicit status indicators (✅ Fixed / ⏳ Pending / ❌ Blocked) for each issue.
  • If this PR includes fully working implementation, move detailed failed attempts to a separate INVESTIGATION_HISTORY.md to keep the main documentation focused on what was delivered.
  • Add language specifiers to code blocks (lines 5, 66, 110, 117, 123, 153) to resolve markdown linting warnings.
VERSION_SYNC_PLAN.md (1)

1-280: Clarify whether this is a research plan that was executed or a proposal for future work.

This document presents a detailed 6-phase research and fix plan, but it's unclear whether these steps were actually executed or if they're intended as a roadmap. Key concerns:

  1. Phase Execution Status: Phases 1-3 outline diagnostic logging and code search tasks. Were these completed? If so, include the findings.
  2. Phase 5 Options: Three implementation approaches (A: Update Index Metadata, B: Increment Main Table Version, C: Send Explicit Version Update) are presented, but there's no indication of which was selected or why.
  3. Document Scope: This overlaps significantly with INCREMENTAL_BACKUP_INDEX_PROGRESS.md (both analyze schema version mismatch) and INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (both propose solutions).

Recommendations:

  • If this plan was executed: Update the document to summarize actual findings from each phase rather than presenting hypothetical investigation steps.
  • If this is future work: Move it out of the PR and into an issue/planning document; the final PR should document what was implemented, not what might be done.
  • Add language specifiers to all code blocks (e.g., ```bash, ```cpp) to resolve markdown linting errors.
  • Consolidate with related investigation files to avoid maintaining three parallel documents of similar content.
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (2)

420-512: Consolidate detailed technical summary with other investigation files; consider moving to single implementation summary document.

Lines 420-512 provide an excellent "How It Works" walkthrough of the three-phase CDC creation flow. However, similar explanations appear in other files. Key redundancies:

  • Problem description (lines 3-17): Also in INCREMENTAL_BACKUP_INDEX_PROGRESS.md and VERSION_SYNC_PLAN.md
  • Root cause analysis (lines 18-29): Overlaps with other files
  • Solution approaches and failures (lines 30-150): Similar coverage in INCREMENTAL_BACKUP_INDEX_PROGRESS.md

Recommendations:

  • Create a single IMPLEMENTATION_COMPLETE.md or DESIGN_SUMMARY.md that consolidates:
    • Final problem statement
    • Root cause (single version, not repeated across 3 files)
    • Actual implementation approach that was chosen
    • How it works (the excellent walkthrough in lines 420-512)
    • Verification results
  • Move detailed failure analysis and investigation steps to optional INVESTIGATION_HISTORY.md for maintainer reference
  • This keeps active documentation focused and reduces maintenance burden.

485-485: Fix markdown linting issues: add language specifiers and resolve duplicate headings.

Several markdown linting warnings need resolution:

  1. Missing language specifiers on code blocks: Lines with ``` should specify language (e.g., ```cpp, ```protobuf, ```bash). Affects lines with code examples throughout.
  2. Duplicate heading "Testing": Appears at lines 485 and 563—consolidate or rename one.
  3. Duplicate heading "Status" (implicit): Line 548 has heading, and status is scattered through document.

Fix:

  • Add language identifiers to all code blocks
  • Rename second "Testing" section to "Test Case Details" or "Test Scenarios"
  • Consolidate "Current Status" section (line 548) with the overall "Status" context

Also applies to: 513-513, 548-548, 563-563

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 010417d and f3e9a0b.

📒 Files selected for processing (3)
  • INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1 hunks)
  • INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1 hunks)
  • VERSION_SYNC_PLAN.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
INCREMENTAL_BACKUP_INDEX_PROGRESS.md

[style] ~101-~101: Consider a different adjective to strengthen your wording.
Context: ...ion - Phase 2 Analysis In Progress: Deep investigation of version synchronizatio...

(DEEP_PROFOUND)


[style] ~242-~242: Consider using “incomplete” to avoid wordiness.
Context: ...g**: SyncChildIndexes runs async; maybe not complete before query? 2. Code path: Maybe S...

(NOT_ABLE_PREMIUM)


[style] ~243-~243: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...omplete before query? 2. Code path: Maybe SyncChildIndexes isn't being called for...

(REP_MAYBE)


[style] ~244-~244: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...d for backup CDC streams? 3. Cache: Maybe KQP's cache isn't being invalidated for...

(REP_MAYBE)


[style] ~245-~245: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ted for the index? 4. Wrong target: Maybe sync is updating wrong index or wrong v...

(REP_MAYBE)

🪛 markdownlint-cli2 (0.18.1)
VERSION_SYNC_PLAN.md

5, 66, 110, 117, 123, 153: Fenced code blocks should have a language specified (MD040, fenced-code-language)

INCREMENTAL_BACKUP_INDEX_PROGRESS.md

5, 66, 110, 117, 123, 153: Fenced code blocks should have a language specified (MD040, fenced-code-language)

INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md

7, 45, 73, 169, 180: Fenced code blocks should have a language specified (MD040, fenced-code-language)

513, 548, 563: Multiple headings with the same content (MD024, no-duplicate-heading)

🔇 Additional comments (1)
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1)

1-50: ⚠️ CRITICAL: Documentation claims race condition is unfixed and Approach 3 is not implemented—verification confirms this is accurate but contradicts PR intent.

The document accurately reflects the current codebase state. Verification of the implementation shows:

  • schemeshard__operation_incremental_restore_finalize.cpp: Only uses asynchronous PublishToSchemeBoard() (fire-and-forget), no PublishAndWaitPublication() or TEvCompletePublication handlers present.
  • Barrier mechanism exists elsewhere: Other operations (TAlterExtSubDomain, TDropIndexedTable, TDropTable, TMoveTableIndex, TMoveTable, TMoveSequence) successfully use the PublishAndWaitPublication() barrier pattern with proper TEvCompletePublication handlers and wait logic.
  • Approach 3 is genuinely not implemented: No RegisterWaitPublication(), no CountWaitPublication() checks, no barrier state logic in the incremental restore finalize operation.

This means:

  1. The race condition (lines 1-40) described in the document remains unresolved in this PR.
  2. Document accurately reflects reality—it is not outdated.
  3. Either the PR is incomplete and Approach 3 implementation is missing, or the PR objectives have changed.

Clarification needed: Is this PR intended to deliver only the analysis and recommendations without the actual fix implementation? If the fix is expected to be included, TFinalizationPropose requires the same barrier pattern that already works in other operations: add PublishAndWaitPublication() calls, implement HandleReply(TEvCompletePublication::TPtr&), and check CountWaitPublication() in ProgressState().

@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (14)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md (2)

164-164: Fix grammar: missing hyphen in "operation structure" phrase.

Line 164 mentions "More complex operatio" (cut off in static analysis hint). This appears to be a markdown formatting issue where text wrapped awkwardly. Review the full context around line 164 to ensure proper phrasing and clarity.


255-255: Add missing comma after year in date.

Line 255 should read: **Date:** January 20, 2025, **Status:** (or restructure) per standard date formatting in formal documents.

DATASHARD_VERSION_VALIDATION_ANALYSIS.md (3)

20-44: Excellent code analysis, but add language identifiers to code blocks for clarity.

Lines 20-44 and throughout the document contain bare code fences without language identifiers: the Y_VERIFY_DEBUG_S(...) examples open with a plain ``` fence. Open them with ```cpp instead so the blocks are highlighted as C++.

This is minor markdown hygiene, but fixing it improves readability in documentation systems. The static analysis tool flagged this at multiple locations (lines 155, 226, 235, 367, etc.).


175-210: Query version check analysis is clear but could address one edge case.

The analysis of query execution (section 3) correctly identifies that queries use SchemeBoard, not direct datashard versions. However, consider adding:

  • Behavior when SchemeBoard cache is stale (e.g., new node joining cluster)
  • Whether version checks are transactional or can change mid-query

These edge cases don't seem to affect the "converge later" safety argument, but explicitly addressing them would strengthen the analysis.


270-301: Test coverage analysis identifies real gaps, but recommendations could be more specific.

Section 5.3 recommends adding tests for "version consistency validation after CDC completes" and "concurrent queries during CDC operation". These are good, but consider specifying:

  1. How to trigger concurrent queries (query thread vs. CDC thread timing)
  2. What specific assertions to check (exact version numbers, not just "no errors")
  3. Whether existing test framework supports this (or needs new instrumentation)

Without these specifics, the testing recommendations may be harder for developers to implement.
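
Concretely, an assertion block might look like the sketch below; GetSchemaVersion and RunConcurrentQuery are hypothetical fixture helpers, named here only to make the shape of the test visible:

Y_UNIT_TEST(VersionsConvergeAfterCdc) {
    // ... set up a table with two global indexes, run an incremental backup ...

    // Check exact version numbers, not just "no errors":
    const ui64 mainV   = GetSchemaVersion("/Root/Table");                        // hypothetical helper
    const ui64 index1V = GetSchemaVersion("/Root/Table/ByValue/indexImplTable"); // hypothetical helper
    const ui64 index2V = GetSchemaVersion("/Root/Table/ByName/indexImplTable");  // hypothetical helper
    UNIT_ASSERT_VALUES_EQUAL(index1V, mainV);
    UNIT_ASSERT_VALUES_EQUAL(index2V, mainV);

    // Concurrency: fire a read mid-operation to probe the race window.
    RunConcurrentQuery("SELECT * FROM `/Root/Table` VIEW ByValue"); // hypothetical helper
}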

cdc_version_sync_design.md (3)

1-100: Well-structured design document with clear problem statement and executive summary.

The document systematically presents the race condition problem and evaluates 5 strategies. Strengths:

  • Problem statement with concrete timeline showing race (lines 29-38)
  • Clear explanation of why current sync fails (lines 241-281)
  • Five distinct strategies with pros/cons

Consider: Is this document primarily for architects/decision-makers or for implementers? If implementers, consider adding a quick "Start here" section directing readers to the recommended strategy (Strategy E) rather than requiring them to read all 5 strategies first.


302-406: Add language identifiers to all code fences per markdown best practices.

Lines 312, 330, 336, 367, 376, etc. contain bare code blocks; they should open with ```cpp (or another appropriate language tag) rather than a plain ``` fence.

Also consider: Are these pseudocode or actual C++ that will compile? If pseudocode, consider marking explicitly (e.g., // Pseudocode: comment) to set reader expectations.


1000-1099: Recommendation for Strategy E is clear but could explicitly address the "Why not A?" question.

Section 7 recommends Strategy E but doesn't explicitly state why Strategy A wasn't chosen. The cost/benefit analysis (section 4) favors E on performance, but the decision section could strengthen this by saying something like:

"Strategy A was rejected not due to safety concerns (both are safe) but due to unnecessary complexity. Strategy E achieves the same correctness with 50% less code and zero additional latency. Strategy A would only be reconsidered if testing reveals N² worst-case writes are a practical problem (not expected with typical tables)."

This would help readers understand the decision was optimization-based, not safety-based.

ss2.md (2)

76-92: Example use of "Step 1-3" headers: Use markdown heading instead of bold emphasis.

Lines 76-92 show headers using bold (**Step 1: ConstructParts Factory**) instead of markdown headings (### Step 1: ConstructParts Factory). This is flagged by static analysis at line 76 (MD036). While both render, using proper headings improves document structure and navigation.


251-281: Parts Vector restoration explanation is clear but could add real-world troubleshooting guidance.

The explanation at lines 246-281 clearly shows that Parts vector is NOT restored after crash, only active TTxState entries are restored. This is important for developers to understand.

Consider adding a troubleshooting section: "If your operation restarts after crash and parts don't resume, check that all context is stored in TTxState, not just in part-local memory."

strategy_e_implementation_research.md (4)

1-65: Comprehensive implementation research, but the document needs structure improvements for readability.

This is an excellent, detailed implementation guide for Strategy E lock-free helping coordination. The amount of content (1770 lines) is impressive and thorough. However, consider:

  1. Table of Contents: Add at top for navigation (readers need to know what's covered)
  2. Quick Start: Add pointer for developers who want to just implement (vs. understand every detail)
  3. Prerequisites: Section on what readers should know (lock-free algorithms? YDB internals?)

These additions would make the document more usable without reducing its depth.


30-46: Verification section is valuable but could be more explicit about verification methodology.

Lines 30-46 show "VERIFICATION STATUS: All file locations, classes, and functions verified" but don't explain how verification was done. Was it:

  • Manual code inspection?
  • Automated grep/ast-grep search?
  • Reference to specific commit hashes?

Adding this would help readers assess confidence level and reproduce verification if needed.


238-294: Critical "Why Current Sync Fails" section is excellent for understanding the problem.

Lines 240-294 clearly explain the race condition in existing code. The timeline format (T1-T8) makes the problem concrete and visual. This is excellent documentation that should definitely be preserved in final design doc.

However, one improvement: Add a "This bug manifests as:" section explaining what users see (SCHEME_CHANGED errors? Version mismatches in query results?). Currently the document explains the technical issue but not the user-visible symptoms.


1088-1150: Step-by-step integration guide is extremely helpful, but could be more specific about testing.

Phase 4 (lines 1164-1202) recommends adding diagnostic tests but leaves specifics to the reader. Consider providing:

  1. Actual test code template they can fill in
  2. Specific assertions (not just "verify all at same version")
  3. How to simulate concurrent CDC operations in test

Without these specifics, developers may write tests that don't actually catch race conditions.
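
One way to seed Phase 4 would be a fill-in template along these lines; TriggerIncrementalBackup and GetSchemaVersion are placeholders that would need real equivalents in the test framework:

Y_UNIT_TEST(ConcurrentCdcOnSiblingIndexesConverges) {
    // Arrange: a table with two global indexes, so two impl-table CDC
    // operations and one main-table CDC operation run in the same tx.
    // ... create /Root/TableWithIndexes with indexes ByAge and ByName ...

    // Act: the incremental backup creates all three CDC streams at once.
    TriggerIncrementalBackup("/Root/.backups/collection1"); // placeholder

    // Assert: a lost helping update would leave one sibling behind.
    const ui64 expected = GetSchemaVersion("/Root/TableWithIndexes"); // placeholder
    UNIT_ASSERT_VALUES_EQUAL(GetSchemaVersion("/Root/TableWithIndexes/ByAge/indexImplTable"), expected);
    UNIT_ASSERT_VALUES_EQUAL(GetSchemaVersion("/Root/TableWithIndexes/ByName/indexImplTable"), expected);
}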

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3e9a0b and a94871f.

📒 Files selected for processing (5)
  • CDC_VERSION_SYNC_RESEARCH_SUMMARY.md (1 hunks)
  • DATASHARD_VERSION_VALIDATION_ANALYSIS.md (1 hunks)
  • cdc_version_sync_design.md (1 hunks)
  • ss2.md (1 hunks)
  • strategy_e_implementation_research.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
cdc_version_sync_design.md

[grammar] ~932-~932: Ensure spelling is correct
Context: ...kup ### 6.2 Key Data Structures **TOperation:** File: ydb/core/tx/schemeshard/schemeshard__operation.h Lines: 10-157 struct TOperation { const TTxId TxId; TVector<ISubOperation::TPtr> Parts; // Operation parts (parallel execution) TSet DoneParts; // Completed parts THashMap<TString, TSet> Barriers; // Barrier name -> blocked parts // ... }; **TTableInfo:** File: ydb/core/tx/sche...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

strategy_e_implementation_research.md

[style] ~92-~92: This phrase is redundant. Consider writing “point” or “time”.
Context: ...perties #### Progress Guarantee At any point in time, at least one operation can make progre...

(MOMENT_IN_TIME)


[grammar] ~1320-~1320: Use a hyphen to join words.
Context: ...siblings, normal helping case 3. Self catching up: ``` [DEBUG] HelpSyncSibli...

(QB_NEW_EN_HYPHEN)

CDC_VERSION_SYNC_RESEARCH_SUMMARY.md

[grammar] ~164-~164: Ensure spelling is correct
Context: ...f concerns - ⚠️ Additional latency (~50-100ms for barrier) - ⚠️ More complex operatio...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~255-~255: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ...:** AI Analysis Date: January 20, 2025 Status: ✅ COMPLETE - All findings...

(MISSING_COMMA_AFTER_YEAR)

🪛 markdownlint-cli2 (0.18.1)
DATASHARD_VERSION_VALIDATION_ANALYSIS.md

155, 226, 235: Fenced code blocks should have a language specified (MD040, fenced-code-language)

367: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

ss2.md

76: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

251, 262, 388: Fenced code blocks should have a language specified (MD040, fenced-code-language)

cdc_version_sync_design.md

24, 30, 67, 89, 114, 126, 152, 203, 217, 228, 235, 250, 265, 286, 312, 561, 833, 877, 888, 901, 915, 924, 933, 947, 957, 967, 983, 988, 1148, 1158, 1177, 1189: Fenced code blocks should have a language specified (MD040, fenced-code-language)

1004, 1020: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

strategy_e_implementation_research.md

74, 132, 142, 153, 267, 317, 376, 804, 839, 850, 891, 1302, 1311, 1321, 1390, 1411, 1431, 1451, 1507, 1523: Fenced code blocks should have a language specified (MD040, fenced-code-language)

131, 141, 152, 929, 936, 942, 1090, 1104, 1125, 1149, 1164, 1203, 1211, 1388, 1409, 1463, 1572: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

1752, 1753: Bare URL used (MD034, no-bare-urls)

🔇 Additional comments (9)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md (1)

1-256: Executive summary is well-structured, but the document depends on verification of critical datashard claims.

This document makes definitive safety claims about datashard behavior (e.g., Y_VERIFY_DEBUG_S being DEBUG-only). While section 2.2 of DATASHARD_VERSION_VALIDATION_ANALYSIS.md provides supporting analysis, this document would benefit from either:

  1. Inline citations to the specific datashard code locations (e.g., line 1725 of datashard.cpp)
  2. A "Verified" section showing the evidence for each critical claim

The structure and recommendations (sections 4-7) are clear and actionable. Consider this document ready for circulation to architects as a summary, but verify the datashard behavior claims are accurate before committing to the implementation strategy.

DATASHARD_VERSION_VALIDATION_ANALYSIS.md (1)

86-99: Critical claim about Y_VERIFY_DEBUG_S requires one-time verification.

This section claims that Y_VERIFY_DEBUG_S at line 1725 is a "DEBUG-only check" with no-op in release builds. This is the linchpin of your entire safety argument. Before finalizing the implementation strategy:

  1. Confirm Y_VERIFY_DEBUG_S macro definition in YDB codebase
  2. Verify it uses #ifdef, not runtime flag
  3. Document the macro location for future reference

Once verified, this is excellent evidence. If it's wrong, the entire strategy may need revision.
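
For reference, a compile-time-gated ("#ifdef, not runtime flag") debug assert typically looks like the sketch below. This is an illustrative stand-in, not the actual Y_VERIFY_DEBUG_S definition, which still needs to be located and confirmed in the codebase:

#include <cstdio>
#include <cstdlib>

// Debug builds evaluate the expression and abort on failure;
// release builds (NDEBUG) compile the whole check away.
#ifndef NDEBUG
    #define VERIFY_DEBUG_SKETCH(expr, msg) \
        do { if (!(expr)) { std::fprintf(stderr, "%s\n", msg); std::abort(); } } while (false)
#else
    #define VERIFY_DEBUG_SKETCH(expr, msg) do {} while (false) // no-op in release
#endif

int main() {
    int version = 7;
    VERIFY_DEBUG_SKETCH(version == 7, "schema version mismatch"); // disappears under NDEBUG
}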

cdc_version_sync_design.md (1)

737-818: Strategy E lock-free helping implementation looks sound, but needs one clarification.

The pseudocode at lines 749-817 shows the helping pattern clearly. One question for validation:

At line 787, the code checks if (maxVersion > myVersion) and updates self. But what if another operation just incremented myVersion to match maxVersion between when we computed max and when we try to update?

Idempotency should handle this (writing same value twice is safe), but explicitly stating "idempotent because we use assignment (=) not increment (+=)" would clarify this for readers.
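
The distinction that comment would capture is small enough to show in a self-contained toy (not YDB code):

#include <cassert>
#include <cstdint>

// Two racing helpers apply the same fix; assignment converges, increment overshoots.
int main() {
    uint64_t alterVersion = 5;
    const uint64_t maxVersion = 7;

    alterVersion = maxVersion; // first helper:  5 -> 7
    alterVersion = maxVersion; // second helper: 7 -> 7 (idempotent no-op)
    assert(alterVersion == 7);

    // The same double-apply with += would not be idempotent:
    // alterVersion += 1; alterVersion += 1; // 7 -> 9, overshoot
    return 0;
}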

ss2.md (1)

1-50: Excellent foundational documentation, but unclear if this belongs in the incremental restore PR.

This document appears to be comprehensive SchemeShard architecture documentation. It's well-written and provides valuable context, but the PR objectives focus specifically on "implement incremental restore for global indexes."

Questions:

  1. Is this new documentation being added as supporting material for the PR, or existing documentation being refactored?
  2. Should this be in a separate documentation PR?
  3. Is it necessary for understanding the incremental restore changes, or is it background material?

If essential to the incremental restore changes, it should stay. If background material, consider moving to a separate documentation PR to keep this PR focused.

strategy_e_implementation_research.md (5)

407-492: REPLACEMENT CODE section is well-structured but needs verification before use.

Lines 407-492 show the replacement code for TProposeAtTable::HandleReply. This looks good but should be verified:

  1. Does BuildTableVersionContext still exist at these lines?
  2. Is the IsPartOfContinuousBackup flag available?
  3. Have the persistence/publish functions changed?

Before implementation, compare this code against current YDB codebase to ensure it will compile. Consider adding a code review checkpoint here.


509-722: HelpSyncSiblingVersions implementation is detailed but pseudocode vs production quality unclear.

Lines 509-722 provide extensive pseudocode for the helping synchronization function. The logic is clear and well-commented. However:

  1. Error handling: What happens if PathsById doesn't contain an expected entry? The current code uses at(), which throws; should it use a contains()-style guard instead (see the sketch below)?
  2. Logging overhead: Extensive logging is good for debugging, but 10+ LOG_* calls per operation may impact production performance. Consider making some DEBUG-level or conditional.
  3. Actual implementation: Is this pseudocode meant to be copy-pasted, or is it for understanding the algorithm?

Clarify expectations for developers using this pseudocode.
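
A guard along these lines would make the failure mode explicit; this fragment reuses the find()/LOG_WARN_S patterns already shown elsewhere in this review, and whether skipping (rather than aborting) is the right policy is an assumption for the authors to confirm:

// Instead of: auto indexPath = context.SS->PathsById.at(childPathId); // throws on miss
auto it = context.SS->PathsById.find(childPathId);
if (it == context.SS->PathsById.end()) {
    LOG_WARN_S(context.Ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
               "Sibling path disappeared during helping sync, skipping"
               << ", pathId: " << childPathId
               << ", at schemeshard: " << context.SS->SelfTabletId());
    continue;
}
auto indexPath = it->second;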


1220-1277: Unit test examples provided are valuable but use fictional APIs.

Lines 1226-1276 show unit test pseudocode using TTestEnv, CreateTableWithIndexes, TriggerIncrementalBackup, etc. These look like they might be real YDB test APIs, but without seeing the actual test files, I can't verify they exist.

Before implementation:

  1. Verify these test helper functions exist
  2. Check if any need to be created
  3. Ensure test environment matches actual YDB test framework

These test examples are excellent templates but need validation against actual YDB test infrastructure.


1320-1347: Debugging strategies section is practical and well-organized.

Lines 1320-1347 show realistic log output patterns and how to interpret them. This is excellent production debugging guidance. The three scenarios (successful sync, helping sync, self catching up) are clear and concrete.

This section should be added to internal documentation/runbooks so on-call engineers can use it when troubleshooting version sync issues.


1572-1661: Conclusion clearly states this is NOT optional - it's a required fix for existing race condition.

Lines 1572-1661 make a strong case that Strategy E implementation is necessary (not optional) because:

  1. Existing code has proven race condition
  2. Current SyncImplTableVersion is fundamentally broken
  3. Strategy E is the fix

This framing is important and correct. The conclusion correctly positions this as fixing a bug, not adding a feature. However, verify that the "proven race condition" claim is well-documented in the earlier sections for readers who skip to the conclusion.

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (2)

571-596: Consider making version diagnostics conditional.

The diagnostics block outputs detailed schema version information to stderr unconditionally. While helpful for debugging, this creates verbose test output.

Consider wrapping this in a conditional based on a test verbosity flag or only enabling it on test failure:

-        // Add version diagnostics after backup
-        {
-            Cerr << "========== VERSION DIAGNOSTICS AFTER BACKUP ==========" << Endl;
-            
-            auto describeTable = Ls(runtime, edgeActor, "/Root/TableWithIndex");
-            if (!describeTable->ResultSet.empty() && describeTable->ResultSet[0].TableId) {
-                Cerr << "Main table SchemaVersion: " << describeTable->ResultSet[0].TableId.SchemaVersion << Endl;
+        // Add version diagnostics after backup for debugging
+        if constexpr (false) {  // Enable for debugging
+            Cerr << "========== VERSION DIAGNOSTICS AFTER BACKUP ==========" << Endl;
+            
+            auto describeTable = Ls(runtime, edgeActor, "/Root/TableWithIndex");

3091-3091: Consider explicit CDC stream initialization verification.

The tests use fixed sleep times (5 seconds) after full backups to allow CDC streams to initialize. While SimulateSleep is deterministic in tests, consider whether there are explicit synchronization points that could be used instead of time-based waits.

This is not critical since the current approach works, but for improved clarity and maintainability, consider:

  • Adding a helper function to verify CDC stream creation status
  • Using event-based waiting similar to WaitTxNotification

Example pattern:

// Instead of:
SimulateSleep(server, TDuration::Seconds(5));

// Consider:
WaitForCdcStreamReady(server, edgeActor, "/Root/TableWithIndex");

This would make the tests more self-documenting and robust.

Also applies to: 3329-3329
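
One possible shape for that helper, assuming the Ls()/SimulateSleep() utilities already used in this file; the readiness predicate on the describe result is left as a placeholder because the exact result shape is not verified here:

// Hypothetical helper: poll for stream readiness instead of a fixed 5s sleep.
void WaitForCdcStreamReady(Tests::TServer::TPtr server, const TActorId& edgeActor,
                           const TString& tablePath) {
    auto& runtime = *server->GetRuntime();
    for (int attempt = 0; attempt < 50; ++attempt) {
        auto desc = Ls(runtime, edgeActor, tablePath);
        if (!desc->ResultSet.empty() /* TODO: && CDC stream visible among children */) {
            return;
        }
        SimulateSleep(server, TDuration::MilliSeconds(100));
    }
    UNIT_FAIL("CDC stream did not become ready for " << tablePath);
}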

ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (2)

117-274: Consider breaking down the function for improved maintainability.

HelpSyncSiblingVersions is 159 lines long and performs 6 distinct steps. While the current implementation is clear with step-by-step comments, extracting helper functions for collection (steps 1-2) and update (steps 4-5) logic could improve testability and maintainability.

For example, the function could split into the helpers below (sketched after the list):

  • CollectIndexFamily() - Steps 1-2
  • ComputeMaxVersion() - Step 2
  • UpdateIndexEntities() - Steps 4-5
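
Possible signatures for that split, modeled on the SyncIndexEntityVersion(childPathId, targetVersion, operationId, context, db) call shape that appears in this file; the names are this review's suggestions, not existing functions:

namespace NCdcStreamState {

// Steps 1-2: gather the main table plus all sibling index impl tables.
TVector<TPathId> CollectIndexFamily(const TPathId& implTablePathId,
                                    TOperationContext& context);

// Step 2 (cont.): highest AlterVersion across the family.
ui64 ComputeMaxVersion(const TVector<TPathId>& family,
                       TOperationContext& context);

// Steps 4-5: bump lagging index entities and impl tables up to targetVersion.
void UpdateIndexEntities(const TVector<TPathId>& family, ui64 targetVersion,
                         const TOperationId& operationId,
                         TOperationContext& context, NIceDb::TNiceDb& db);

} // namespace NCdcStreamState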

373-373: Minor: Redundant namespace qualification.

The call NCdcStreamState::SyncIndexEntityVersion() is already within the NCdcStreamState namespace (line 9), so the namespace prefix is redundant. You can call SyncIndexEntityVersion() directly for cleaner code.

-        NCdcStreamState::SyncIndexEntityVersion(childPathId, targetVersion, operationId, context, db);
+        SyncIndexEntityVersion(childPathId, targetVersion, operationId, context, db);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a94871f and 8d41699.

📒 Files selected for processing (2)
  • ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (3 hunks)
  • ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1)
ydb/core/tx/schemeshard/ut_backup_collection/ut_backup_collection.cpp (2)
  • SetupLogging (14-17)
  • SetupLogging (14-14)
🔇 Additional comments (9)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (4)

3111-3160: Well-designed verification of OmitIndexes behavior.

The test properly uses dynamic backup directory resolution and DescribePath verification to confirm that:

  1. The main table incremental backup was created correctly
  2. The index metadata directory does NOT exist when OMIT_INDEXES is set

This approach is more robust than relying on query failures and provides clear verification of the expected behavior.


3162-3274: Comprehensive test coverage for index restore.

The new test suite provides excellent coverage of incremental restore with indexes:

  • Single index (BasicIndexIncrementalRestore)
  • Multiple indexes (MultipleIndexesIncrementalRestore)
  • Cross-shard scenarios (IndexDataVerificationIncrementalRestore)
  • Sequential incremental backups (MultipleIncrementalBackupsWithIndexes)
  • Multiple tables (MultipleTablesWithIndexesIncrementalRestore)

Each test properly verifies:

  1. Table data correctness
  2. Index functionality via VIEW queries
  3. Index implementation table data integrity
  4. Proper handling of updates and deletes

2929-2933: Verify CDC operation counts remain stable.

The test asserts specific CDC operation counts (2 DELETEs, 1 INSERT) parsed from metadata. While appropriate for verifying exact CDC behavior, these assertions are tightly coupled to the current implementation.

If CDC operation generation logic changes in the future (e.g., optimizations that combine operations or different tombstone strategies), these assertions may need updates. Consider documenting why these specific counts are expected:

// Expected counts based on:
// - 2 DELETEs: tombstones for old values at age 25 (Bob's original) and 26 (Bob's intermediate update)
// - 1 INSERT: new value for Alice2
UNIT_ASSERT_EQUAL_C(counts.Deletes, 2, "Should have 2 DELETE operations (tombstones for age 25 and 26)");

3162-3766: Excellent test coverage for index incremental restore feature.

The test suite comprehensively validates the implementation described in the PR objectives:

  1. Index discovery: Tests verify that indexes are properly discovered and backed up in __ydb_backup_meta/indexes/ directory structure
  2. Restore operations: Each test verifies indexes work after restore by:
    • Querying via VIEW to confirm index functionality
    • Directly querying index implementation tables to verify data integrity
    • Checking row counts match expected values
  3. Various scenarios: Coverage includes single indexes, multiple indexes, covering indexes, multi-shard tables, sequential incremental backups, and multiple tables

The tests properly verify both the functional behavior (indexes work) and the internal state (impl tables have correct data), which gives confidence that the restore logic correctly reconstructs the index structures.

ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (5)

2-2: LGTM - Include for new CDC stream common functionality.

The new include provides the necessary forward declarations and API exposure for the Strategy E implementation.


13-114: LGTM - Well-structured helper functions for continuous backup detection.

The helper functions provide clear separation of concerns for detecting continuous backup streams and building context. The logic for traversing parent/grandparent relationships is correct.
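
In rough outline, the traversal in question looks like the fragment below; the member names follow schemeshard conventions (ParentPathId on path elements) but are not verified against this file, and HasContinuousBackupStream is a hypothetical predicate:

// impl table -> parent (index entity) -> grandparent (main table)
auto implTable   = context.SS->PathsById.at(implTablePathId);
auto indexEntity = context.SS->PathsById.at(implTable->ParentPathId);
auto mainTable   = context.SS->PathsById.at(indexEntity->ParentPathId);

const bool isIndexImplUnderBackup =
    indexEntity->PathType == NKikimrSchemeOp::EPathTypeTableIndex
    && HasContinuousBackupStream(*mainTable); // hypothetical predicate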


279-342: LGTM - Enhanced logging for version synchronization.

The extensive logging additions will help diagnose version synchronization issues. The core logic correctly prevents version downgrades with the targetVersion > oldIndexVersion check.


492-546: LGTM - Strategy E implementation with proper flow separation.

The logic correctly separates three paths:

  1. Index impl table CDC (lines 497-531): Increments own version atomically, then uses lock-free helping to sync sibling index entities
  2. Other tables (lines 532-541): Simple version increment
  3. Main table CDC (lines 544-546): Additionally syncs child index entities

The separation ensures that index impl tables handle their own version updates (avoiding datashard schema mismatches), while index entities are synchronized by whoever observes the need.
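
In skeleton form, the three paths reduce to one dispatch. This is illustrative only; names such as versionCtx, HelpSyncSiblingVersions, and SyncChildIndexes are taken from the surrounding discussion, not copied from the file:

if (versionCtx.IsPartOfContinuousBackup && versionCtx.IsIndexImplTable) {
    // Path 1: index impl table under CDC - bump own version, then help siblings.
    table->AlterVersion += 1;
    HelpSyncSiblingVersions(pathId, operationId, context, db);
} else if (!versionCtx.IsPartOfContinuousBackup) {
    // Path 2: ordinary table - simple increment.
    table->AlterVersion += 1;
} else {
    // Path 3: main table under CDC - increment and sync child index entities.
    table->AlterVersion += 1;
    SyncChildIndexes(pathId, table->AlterVersion, operationId, context, db);
}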


115-274: Verify operation execution model to assess race condition validity.

The review comment correctly identifies a theoretical race condition, but the actual risk depends on YDB's operation scheduling model. The code shows:

  1. All operations process through message handlers (HandleReply)
  2. YDB uses an actor model with tablet executors
  3. If operations on the same parent table serialize through the tablet's message queue, no concurrency exists
  4. If operations can interleave, the version comparison alone doesn't prevent lost updates

The version check (if (index->AlterVersion < maxVersion)) prevents downgrades but not lost updates in a concurrent scenario, as the sketch after this list replays:

  • Operation A reads v=5, Operation B reads v=5
  • Both compute maxVersion=6, both write 6
  • A concurrent third operation's version increment gets lost
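
A single-threaded replay of that check-then-act gap (pure toy code, not YDB):

#include <cassert>
#include <cstdint>

int main() {
    uint64_t version = 5;

    // A and B both evaluate the guard against the same stale state:
    const bool aWrites = (version < 6); // true
    const bool bWrites = (version < 6); // true

    version = 7; // a third operation's increment lands in the gap

    if (aWrites) version = 6; // stale decision clobbers 7
    if (bWrites) version = 6;

    assert(version == 6); // the bump to 7 is lost
    return 0;
}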

Questions for verification:

  1. Do all CDC operations on indexes of the same parent table serialize through a single schemeshard tablet's message queue, or can they execute concurrently on different threads?
  2. Are there any async update paths or background tasks that could modify Indexes or Tables structures while HelpSyncSiblingVersions executes?
  3. Does the existing test suite verify concurrent CDC operations on multiple indexes of the same table, and do they pass?

Comment on lines +162 to +163
Y_ABORT_UNLESS(indexPath->GetChildren().size() == 1);
auto [implTableName, implTablePathId] = *indexPath->GetChildren().begin();

⚠️ Potential issue | 🔴 Critical

Replace hard assertion with graceful validation.

The Y_ABORT_UNLESS will crash the schemeshard if an index entity doesn't have exactly one child. This assumption could be violated during concurrent modifications, index deletion, or corrupted state.

Apply this diff to handle the edge case gracefully:

         // Get index impl table (single child of index entity)
         auto indexPath = context.SS->PathsById.at(childPathId);
-        Y_ABORT_UNLESS(indexPath->GetChildren().size() == 1);
+        if (indexPath->GetChildren().size() != 1) {
+            LOG_WARN_S(context.Ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
+                       "Index entity does not have exactly one child, skipping"
+                       << ", indexPathId: " << childPathId
+                       << ", childCount: " << indexPath->GetChildren().size()
+                       << ", at schemeshard: " << context.SS->SelfTabletId());
+            continue;
+        }
         auto [implTableName, implTablePathId] = *indexPath->GetChildren().begin();

Committable suggestion skipped: line range outside the PR's diff.
