-
Notifications
You must be signed in to change notification settings - Fork 0
Implement incremental restore for global indexes #19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement incremental restore for global indexes #19
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds CDC stream creation for main and index-impl tables during incremental backups, extends protobuf with index CDC metadata and OmitIndexes, implements version synchronization strategies (notably lock-free "helping"), introduces index-aware incremental restore and finalization phases, and adds extensive tests and design documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant SchemeShard
participant BackupOp as BackupCollection<br/>Operation
participant CDC as CDC Stream<br/>Creation
participant CopyOp as Copy Table<br/>Operation
participant PQ as PQ Parts<br/>Creation
BackupOp->>CDC: CreateCdcStream(main table)
CDC-->>BackupOp: CDC stream for main table
BackupOp->>CDC: CreateCdcStream(index impl tables)
CDC-->>BackupOp: CDC streams for each index impl table
BackupOp->>BackupOp: Persist CDC metadata to descriptor
BackupOp->>CopyOp: CopyTableTask with OmitIndexes=true
CopyOp->>CopyOp: Set CDC stream on main table copy
CopyOp->>CopyOp: For each index impl table,<br/>set CDC stream from descriptor
CopyOp-->>BackupOp: Copy operations configured
BackupOp->>PQ: CreatePqParts(main CDC stream)
BackupOp->>PQ: CreatePqParts(per-index CDC streams)
PQ-->>BackupOp: PQ parts created
sequenceDiagram
participant DataShard
participant SchemeShard
participant CdcOp as CDC Propose<br/>At Table
participant IndexParent as Parent Index<br/>Entity
participant ImplTable as Index Impl<br/>Table
CdcOp->>DataShard: Propose CDC creation
DataShard->>DataShard: Increment AlterVersion (self)
DataShard->>SchemeShard: TEvProposeTransactionResult
SchemeShard->>CdcOp: HandleReply
CdcOp->>CdcOp: HelpSyncSiblingVersions(myVersion)
CdcOp->>CdcOp: Collect related: index entity + impl table
CdcOp->>CdcOp: Compute max version across all three
CdcOp->>IndexParent: Update if behind
CdcOp->>ImplTable: Update if behind
CdcOp->>SchemeShard: Persist all changes in single txn
CdcOp->>SchemeShard: Publish to SchemeBoard
sequenceDiagram
participant RestoreOp as Incremental<br/>Restore Scan
participant IndexDiscover as Index<br/>Discovery
participant TableRestore as Table Restore<br/>Operation
participant IndexRestore as Index Restore<br/>Operation
participant Finalize as Incremental<br/>Restore Finalize
RestoreOp->>RestoreOp: CollectTargetTablePaths
RestoreOp->>RestoreOp: Also collect index impl tables
RestoreOp->>IndexDiscover: DiscoverAndCreateIndexRestoreOperations
IndexDiscover->>IndexDiscover: DiscoverIndexesRecursive(backup path)
IndexDiscover->>IndexDiscover: Find target table mapping
IndexDiscover->>IndexRestore: CreateSingleIndexRestoreOperation(per index)
IndexRestore-->>RestoreOp: Index restore ops queued
RestoreOp->>TableRestore: Create table restore ops (parallel)
TableRestore-->>RestoreOp: Table restore ops queued
RestoreOp->>Finalize: Coordinate finalization
Finalize->>Finalize: TConfigureParts phase:<br/>collect index impl tables
Finalize->>Finalize: Increment index AlterVersions
Finalize->>Finalize: SyncIndexSchemaVersions
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (22)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (2)
219-236:FindIncrementalBackupDirhelper is reasonableThe helper uses DescribePath with
ReturnChildren=trueand picks the first child ending with_incremental, which fits the test usage where only one incremental backup is expected per collection. Returning""and asserting at call sites gives clear failures.If you ever introduce scenarios with multiple incremental backups per collection, consider selecting the latest child (lexicographically max timestamp) instead of the first match.
511-630:SimpleBackupRestoreWithIndexprovides strong end-to-end coverageThis test nicely validates:
- Pre-backup index usability via
VIEW idx.- Full backup & restore of a table with a global index.
- Post-restore index behavior (query via view) and the index implementation table row count.
The additional version diagnostics using
Lsare harmless and will help debugging if something regresses. The fixed sleeps (5s) are a bit heavy but acceptable in this suite.ydb/core/tx/datashard/datashard_ut_common_kqp.h (1)
6-6: KqpSimpleExecSuccess helper and test context wiringThe new
KqpSimpleExecSuccesswrapper plus#include <ydb/library/ut/ut.h>cleanly centralize “assert SUCCESS + format result” behavior for KQP tests. TheTTestContext testCtxparameter is currently not passed intoCTX_UNIT_ASSERT_VALUES_EQUAL_C, so this relies on TTestContext RAII semantics rather than an explicit context argument; if CTX_* macros actually expect a context parameter, please threadtestCtxthrough, otherwise consider marking it explicitly unused to avoid warnings.Also applies to: 186-192
ydb/core/tx/datashard/ut_common/datashard_ut_common.cpp (1)
2098-2104: ExecSQL now explicit about expected status and uses CTX_ assertions*Extending
ExecSQLwith an explicitYdb::StatusIds::StatusCode codeand switching toCTX_UNIT_ASSERT_VALUES_EQUAL_Cmakes the expectation clearer and integrates with the new test‑context machinery. As with the KQP helper,TTestContext testCtxis not passed to the macro, so this depends on RAII semantics in ut.h; please confirm that this is intentional and that all ExecSQL call sites have been updated (or consider adding a convenience overload that defaultscodetoSUCCESS).Also applies to: 2111-2113
ydb/core/protos/flat_scheme_op.proto (1)
349-351: Proto extensions for index skipping and per‑index CDC look consistentAdding
optional bool OmitIndexes = 47 [default = false];toTTableDescriptionandmap<string, TCreateCdcStream> IndexImplTableCdcStreams = 9;toTCopyTableConfigis wire‑compatible and matches the incremental backup design (skip impl copies and track CDC streams per index impl table). It would be helpful to keep the semantics ofOmitIndexesconsistent with the similarly named flags in backup collection configs and to clearly document whenIndexImplTableCdcStreamsis expected to be filled (e.g., only for incremental backups with CDC).Also applies to: 1288-1291
ydb/core/tx/schemeshard/schemeshard_cdc_stream_common.h (1)
13-14: Make dependencies for new CDC state helpers explicit
SyncIndexEntityVersion/SyncChildIndexeslook fine, but this header now depends onNIceDb::TNiceDbandTPathElement::TPtrwithout declaring them locally. To reduce reliance on transitive includes, consider either:
- Adding a lightweight forward declaration for
namespace NIceDb { class TNiceDb; }andstruct TPathElement;, or- Including the minimal headers that define these types (e.g.
schemeshard_path_element.h/ the TNiceDb declaration header).This keeps the CDC state helpers robust against include-order changes.
Also applies to: 35-54
INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1)
73-97: Clarify what is implemented vs. exploratory in the progress logThis document is very detailed and useful, but sections like “Next Steps (To Be Implemented)” and “Phase 2 Analysis In Progress” now coexist with actual implementation work in this PR (e.g., index version synchronization on restore). It would help future readers if you:
- Explicitly mark which parts of the “Proposed Solution: Version Synchronization” are implemented in code vs. still speculative, and
- Optionally add a short “Current status” summary at the top, reflecting the behavior after this PR.
Also applies to: 98-102
ydb/core/tx/datashard/ut_common/datashard_ut_common.h (1)
15-16: ExecSQL test-context extension looks goodAdding
ydb/library/ut/ut.hand extendingExecSQLwith an optionalTTestContextkeeps existing callers working while enabling richer diagnostics in new tests. This is a sensible way to thread context into the helper.If
TTestContextever becomes heavier to copy, you might switch the parameter toconst TTestContext& testCtx = TTestContext()to avoid unnecessary copies, but for current unit-test usage this is purely optional.Also applies to: 809-815
ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp (2)
171-213: Consider dialing back or gating verbose NOTICE logging in index copy pathThe new LOG_NOTICE blocks around:
- Table-level details (
srcPath,dstPath,childrenCount,omitIndexes),- Table-index descriptions, and
- Per-child decisions (deleted / sequence / non-index / index) and impl-table copies
are excellent for debugging incremental backup issues, but they will produce a lot of NOTICE-level chatter in environments that perform many consistent copy operations.
If this code path can be hot in production, consider either:
- Lowering some of these to DEBUG level, or
- Guarding the more verbose per-child logs behind a feature flag / dynamic log-component level.
Also applies to: 221-247, 263-267
268-284: Keying IndexImplTableCdcStreams only by impl-table leaf name may limit future flexibility
IndexImplTableCdcStreamsis keyed bysrcImplTableName(the impl table’s leaf name). This works today because:
- Each index impl table under a given main table uses the same leaf name (e.g.
indexImplTable), and- You currently configure identical CDC streams for all such impl tables.
However, if you ever need per-index CDC customization (different stream names or settings per index), this map will collapse multiple indexes to the same entry and you’ll lose the ability to distinguish them. In that case you’d likely want to key either by:
- The full path under the main table (e.g.
idxName/implTableName), or- The index path id.
No change is needed now, but it’s worth keeping in mind if the CDC configuration becomes index-specific.
ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp (1)
211-259: Index impl table CDC diagnostics and PQ-part creation are consistent with main-table flowThe added diagnostics after CDC creation (logging AlterVersions for main tables, indexes, and index impl tables) and the PQ-part creation for index impl table CDC streams mirror the existing behavior for main tables:
- StreamImpl is created up-front (main and index impl tables),
- AtTable is driven via
CreateConsistentCopyTablesusing the stored CDC descriptors, and- PQ parts are created afterwards using per-table partition boundaries.
This symmetry makes debugging schema-version issues around indexes much easier and seems logically correct.
One minor optional improvement would be to reuse the stored
TCreateCdcStreamfromIndexImplTableCdcStreamswhen building PQ parts instead of reconstructing a fresh struct with the same fields, to keep all CDC configuration for a given table in one place.Also applies to: 303-360
ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (5)
238-261: Index impl tables added to finalize target list look functionally correct but stringly-typedThe new loop over
Self->PathsByIdto addindexImplTablepaths inCollectTargetTablePathscorrectly filters byEPathStateIncomingIncrementalRestoreand by parent table prefix, so it should only pull in index impl tables that are part of this restore. However, matching viapathString.Contains("/indexImplTable")andStartsWith(tablePath + "/")is fairly brittle: it relies on the naming convention and flat strings instead of the existing structural metadata (e.g., index path type + parent-child relationships).If you want to harden this later, consider:
- Using
TPath/TPathElementto check that the path is indeed an index impl table (e.g., parent isEPathTypeTableIndex, child is table) instead of relying on the substring.- Optionally precomputing the relevant index impl table
PathIds from theIndexesmap or from the restore operations, then converting to strings, rather than scanning allPathsByIdfor every finalize.Functionally this is fine; the suggestion is about robustness and avoiding accidental matches if naming ever changes.
626-648:FindTargetTablePathsuffix matching is simple but can be ambiguous in deeply nested layouts
FindTargetTablePathmaps an accumulated relative path to a full table path using:if (itemPath == relativeTablePath || itemPath.EndsWith("/" + relativeTablePath))This works for the common case where
relativeTablePathis the DB-relative suffix (e.g.dir/table). However, in pathological layouts with overlapping suffixes (e.g./Root/A/B/tableand/Root/C/A/B/table) this could pick the “wrong” table depending on entry order, since both end with/A/B/table.If such layouts are possible, consider:
- Comparing against the precomputed DB-relative part you already derive via
TrySplitPathByDbinstead of rawitemPath, or- Normalizing both sides into split components and comparing vectors rather than string suffixes.
If the schema guarantees uniqueness of these suffixes, this is fine; otherwise it’s a subtle corner case worth double-checking.
650-696: Recursive index discovery assumes metadata tree structure; add defensive checks if structure can vary
DiscoverIndexesRecursivewalks the__ydb_backup_meta/indexessubtree and:
- Treats
accumulatedRelativePathas a table-relative path and callsFindTargetTablePathon it.- On match, assumes
currentPath.Base()->GetChildren()are index names.- Otherwise, recurses into all children, extending
accumulatedRelativePath.This is correct if the metadata tree mirrors DB-relative table paths and the per-table node directly contains index nodes. If future metadata gains extra layers (e.g. other markers under the table node),
CreateSingleIndexRestoreOperationmight get invoked on non-index children.If the layout isn’t strictly guaranteed, it would be safer to:
- Filter children by expected type or naming before calling
CreateSingleIndexRestoreOperation, or- Encode index metadata (e.g. a marker file node) and filter on that instead of blindly iterating all children.
Right now this is structurally sound but a bit optimistic about the metadata tree shape.
698-739: Index metadata discovery is tolerant, but lack of index meta is silently treated as “no indexes”
DiscoverAndCreateIndexRestoreOperations:
- Respects
OmitIndexes(good).- Resolves
{bcPath}/{backupName}_incremental/__ydb_backup_meta/indexesand, if missing, logs and returns.This means a mismatch between
OmitIndexes=falseand missingindexesdir is treated equivalently to “no indexes were backed up”. That’s likely fine, but if you want stricter observability you could:
- Log at
LOG_WwhenomitIndexes == falsebutindexMetaPathis unresolved, or- Optionally track a counter/flag in
TIncrementalRestoreStatethat “indexes metadata missing” for diagnosing misconfigured backups.Behavior-wise, this is safe; suggestion is only for diagnosability.
741-875:CreateSingleIndexRestoreOperationoverall looks solid; only minor invariants worth calling outThe function correctly:
- Validates the target table resolves and is a table.
- Locates the index by name under the table, verifies it’s a
TableIndexandEIndexTypeGlobal.- Enforces a single-child impl table, and gracefully skips unusual structures.
- Builds a per-index
ESchemeOpRestoreMultipleIncrementalBackupsop, wires it intoIncrementalRestoreOperationToStateandTxIdToIncrementalRestore, and populatesExpectedShards/InvolvedShardsfrom the impl table’s partitions.Minor points to consider:
Assumption that every
GetChildren()-listed pathId is present inPathsByIdandTables
PathsById.at(childPathId)andTables.at(indexImplTablePathId)willY_ABORTif metadata is temporarily inconsistent (e.g., index in the middle of drop/create).- If incremental restore can overlap with DDL on the same table/index, you may want
FindPtr/containschecks and a warning-based early return instead of assuming perfect consistency.Handling missing table info for impl table
- When
TableslacksindexImplTablePathId, you still create and send the restore request but never populateExpectedShardsfor that op.- Depending on how
AllShardsComplete()is implemented, this could either treat the op as “already complete” (0 expected shards) or never complete if responses arrive.- If such inconsistent states are possible, consider bailing out before creating the op when
Tableshas no entry for the impl table, or at least logging atLOG_Wand not adding the op toInProgressOperations.Source path construction assumes
{indexes}/{relativeTablePath}/{indexName}
- This matches the new discovery code; if metadata layout ever changes, both must be updated in lockstep.
Functionally, this looks correct for the intended invariants; the above is mostly about making failure modes clearer and avoiding hard aborts.
ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (3)
115-173:SyncImplTableVersionnow persists and publishes versions; check for double-persist and invariantsThe extended
SyncImplTableVersion:
- Derives
targetVersionasmax(parentTable.AlterVersion, index.AlterVersion)(when index exists).- Adjusts the impl table’s
AlterVersiontotargetVersion(or+1if already ahead).- Persists the new version via
PersistTableAlterVersionand publishes to SchemeBoard.This is appropriate when invoked from
UpdateTableVersionafter an actual schema operation (e.g., CDC-related alter) has already gone through datashards.Two things to double-check:
Double persistence with
TProposeAtTable::HandleReply
- After
UpdateTableVersionreturns,HandleReplystill callsPersistTableAlterVersion(db, pathId, table)unconditionally.- For impl tables, that’s harmless but redundant. If
PersistTableAlterVersiondoes any heavier work (indexes, counters), you might consider guarding the second call or documenting the intentional duplication.Impl table ahead of parent
- In the
currentImplVersion > targetVersionbranch, you increment impl version again, so impl stays ahead of parent/index.- That maintains monotonicity but violates “impl version >= parent/index version” if you wanted strict alignment.
- If strict alignment is a requirement, you may want to set
table->AlterVersion = currentImplVersionand then sync index entity to that, instead of bumping further.Semantically this looks safe, but verifying expectations around parent vs impl version ordering would be good.
175-248:UpdateTableVersionlogic is sound but may callSyncChildIndexestwice for continuous backup streams
UpdateTableVersionnow distinguishes two cases:
- Index impl table that is part of a continuous backup: calls
SyncImplTableVersion, thenSyncIndexEntityVersion, thenSyncChildIndexeson the grandparent (main table).- All other cases: increments the table’s
AlterVersion, and if the table has a continuous backup stream (HasParentContinuousBackup(versionCtx.PathId, context)), callsSyncChildIndexesto align child indexes.This is consistent with the intent: keep impl tables, index entities, and sibling indexes in sync when continuous backup is active.
However, in
TProposeAtTable::HandleReplyyou also have:if (versionCtx.IsContinuousBackupStream && !versionCtx.IsIndexImplTable) { NCdcStreamState::SyncChildIndexes(path, table->AlterVersion, OperationId, context, db); }For operations on a continuous backup stream:
UpdateTableVersion’s “main table with continuous backup” path will callSyncChildIndexesonce.- The
IsContinuousBackupStreambranch above will call it a second time with the sametargetVersion.Because
SyncIndexEntityVersionis monotonic (if (targetVersion > oldIndexVersion)), the second pass is effectively a no-op but still iterates all children and logs.If you want to avoid the duplicate work, you can either:
- Drop the
HasParentContinuousBackupbranch for the “main table with continuous backup” insideUpdateTableVersionand rely on the explicitIsContinuousBackupStreambranch, or- Add a flag to the
TTableVersionContextindicating that child indexes were already synced duringUpdateTableVersion.Functionally correct as is; this is a small efficiency/clarity nit.
318-368:SyncChildIndexescorrectly avoids impl-table bumps; assumes parentPath has all child indexes loadedThe new
SyncChildIndexes:
- Iterates
parentPath->GetChildren(), filters toIsTableIndex()and notDropped().- Uses
SyncIndexEntityVersionto bump index entity versions totargetVersion.- Explicitly does not touch index impl table versions (with a clear NOTE), to avoid
SCHEME_CHANGEDerrors from bumping versions without a datashard schema TX.- Logs entry/exit and per-index processing.
This matches the desired semantics: keep index entities in step with their parent’s
AlterVersion, while leaving impl tables to be synchronized by actual schema operations (SyncImplTableVersion/alter flows).Only assumption is that
parentPath->GetChildren()andPathsById[childPathId]are always consistent. If there’s any chance of transient inconsistency (e.g. partially dropped index), consider guarding thePathsById.at(childPathId)with acontainscheck to avoid hard aborts.ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp (3)
40-173:TConfigurePartscorrectly prepares per-shard ALTERs for index impl tables; minor robustness nits
TConfigureParts::ProgressState:
- Collects target index impl tables via
CollectIndexImplTables(finalize, ...).- If none are found, logs and returns
true, making ConfigureParts a no-op.- For each impl table, ensures
AlterDataexists (creating minimalTAlterTableInfowhen necessary) and bumpsAlterData->AlterVersionby 1.- Adds all partitions’ shard indices to
txState->ShardsasConfigurePartsoperations and persiststxState.- For each shard, finds the owning impl table, builds an
ALTER TABLEbody, and proposes it to the corresponding datashard, then updatesShardsInProgress.Overall this is a solid pattern and mirrors other “ConfigureParts” flows in Schemeshard.
Minor improvements to consider:
Shard-to-table mapping loop
- For every shard you linearly scan all
implTablesToUpdateand each table’s partitions to findtablePathId. For many tables/shards this is O(N²).- If this state can be large, consider precomputing a
THashMap<TShardIdx, TPathId>once before the shard loop.Table existence checks
- You correctly guard
Tables.contains(tablePathId)and log if missing, but later in the shard loop you callcontext.SS->Tables.at(pathId)without re-check.- Today
implTablesToUpdateonly contains IDs that passed the earlierTables.containscheck, but if tables can disappear between these stages, consider either:
- Building and using a local map of
implTablePathId -> TTableInfo::TPtrwhile collecting, or- Re-guarding with
containsin the shard loop to avoidY_ABORT.Functionally, this should work fine for typical sizes; the suggestions are about scaling and robustness.
175-203:CollectIndexImplTablesshares assumptions with scan-side collection; looks consistent
CollectIndexImplTables:
- Filters
finalize.GetTargetTablePaths()byContains("/indexImplTable").- Resolves the path, checks it’s a
Tableand present inTables, and then inserts itsPathIdintoimplTables.This matches how
CollectTargetTablePathsin the scan code constructs target paths for index impl tables, so the two sides are consistent. If the naming convention for impl tables ever changes, both places will need to be updated together; you might eventually want a shared helper for “is index impl table path” to avoid divergence.
271-338:SyncIndexSchemaVersionsfinalizes impl-table alters and bumps index entities; consider reusing CDC helpers
SyncIndexSchemaVersions:
- Iterates
finalize.TargetTablePaths, filtering to paths containing/indexImplTable.- For each resolved impl table with non-null
AlterData:
- Calls
FinishAlter, persists viaPersistTableAltered, clears caches, and publishes.- Locates the parent index path and unconditionally
++s itsAlterVersion, then persists and publishes.The flow ensures that:
- Impl tables, for which
ConfigurePartspreparedAlterData->AlterVersion = old + 1, actually commit that version bump.- Corresponding index entities also advance their
AlterVersion, keeping metadata in sync.Two points worth a closer look:
Unconditional index
AlterVersionincrement
- This always does
oldVersion + 1regardless of the impl-table’s new version.- Elsewhere (
NCdcStreamState::SyncIndexEntityVersion), you sync to a target version and avoid regressions.- If other operations may have bumped the index version between ConfigureParts and Finalize, you might end up with index
AlterVersion!= tableAlterVersion.- To keep semantics consistent, consider using
SyncIndexEntityVersion(indexPathId, table->AlterVersion, OperationId, context, db)here instead of manually incrementing.Missing
AlterDataon impl table
- If for some reason an impl table reaches finalization without
AlterData, you log a warning and skip it.- That leaves its schema version unbumped while its index entity might still get bumped (if handled elsewhere).
- If that’s an unexpected state, you may want to either:
- Treat it as a hard error (or at least warn at a higher log level), or
- Recompute a minimal
AlterDatasimilar toTConfigurePartsinstead of skipping.No blockers here; the behavior is consistent with the new ConfigureParts stage, but reusing the CDC sync helpers would tighten invariants and reuse the “no downgrade” logic.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md(1 hunks)INCREMENTAL_BACKUP_INDEX_PROGRESS.md(1 hunks)VERSION_SYNC_PLAN.md(1 hunks)ydb/core/protos/flat_scheme_op.proto(2 hunks)ydb/core/tx/datashard/datashard_ut_common_kqp.h(2 hunks)ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp(3 hunks)ydb/core/tx/datashard/ut_common/datashard_ut_common.cpp(1 hunks)ydb/core/tx/datashard/ut_common/datashard_ut_common.h(2 hunks)ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp(4 hunks)ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp(7 hunks)ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp(4 hunks)ydb/core/tx/schemeshard/schemeshard__operation_copy_table.cpp(4 hunks)ydb/core/tx/schemeshard/schemeshard__operation_create_restore_incremental_backup.cpp(2 hunks)ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp(5 hunks)ydb/core/tx/schemeshard/schemeshard_cdc_stream_common.h(2 hunks)ydb/core/tx/schemeshard/schemeshard_impl.h(1 hunks)ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp(2 hunks)ydb/core/util/source_location.h(1 hunks)ydb/library/ut/ut.h(1 hunks)
🔇 Additional comments (24)
ydb/core/util/source_location.h (1)
34-36: Dummyfunction_name()looks correctThe fallback
TSourceLocation::function_name()mirrors the std::source_location API and unblocks callers likeFormatLocation()without changing behavior (empty string in dummy mode). No issues here.VERSION_SYNC_PLAN.md (1)
1-288: Helpful internal design/diagnostic planThis document clearly captures the schema-version mismatch investigation and possible fixes; since it’s non-code, it’s safe to land as-is.
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1)
1-632: Design summary is thorough and non-intrusiveThe incremental-backup/index summary is detailed and aligns with the surrounding code, while remaining purely informational. Safe to keep as project documentation.
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (6)
3091-3161: ImprovedOmitIndexesIncrementalBackupassertions look goodSwitching to
FindIncrementalBackupDirplus DescribePath:
- Ensures the incremental backup table actually exists and has
__ydb_incrBackupImpl_changeMetadata.- Explicitly verifies that
__ydb_backup_metais absent whenOMIT_INDEXES = 'true'.This makes the test much more robust against layout changes and aligns with the intended semantics of
OMIT_INDEXES.
3162-3274:BasicIndexIncrementalRestorecorrectly validates single-index restoreThe test covers:
- Table with one global index.
- Full backup + incremental backup (insert + update).
- Data equality after restore and correct index query behavior.
- Direct verification of the index implementation table contents and row count.
The expectations (4 final index rows, including updated value) match the described operations. Looks solid.
3276-3395:MultipleIndexesIncrementalRestoregives good multi-index coverageThis scenario exercises three global indexes on one table, with an incremental backup adding a new row. After restore, it verifies:
- Base table data.
- Each index via
VIEWqueries.- Each index impl table’s row count and sample contents.
The checks line up with the mutation pattern (4 rows total) and should catch regressions in multi-index restore logic.
3397-3509:IndexDataVerificationIncrementalRestorethoroughly checks index stateThis test is valuable because it:
- Uses a multi-shard table with a global index.
- Applies mixed updates, inserts, and deletes before the incremental backup.
- Compares index query results before/after restore and inspects the index impl table contents and row count.
The asserted presence/absence of specific rows (Alice/Charlie vs Eve/Bob/David/Frank) matches the sequence of operations and should strongly guard index correctness across shards.
3511-3632:MultipleIncrementalBackupsWithIndexesvalidates multi-step historyThis test exercises three incremental backups over a table with a global index, mixing inserts, updates, and deletes, then:
- Compares final table and index view results before/after restore.
- Asserts that obsolete index entries (100, 200) are gone and only the final (250, 300, 400) remain.
- Checks index impl row count.
The expectations align with the described mutations and will catch regressions in replaying multiple incremental backups with indexes.
3634-3765:MultipleTablesWithIndexesIncrementalRestorecovers multi-table/index restoreFor two tables each with their own global index, this test:
- Runs full + incremental backups.
- Restores and verifies both tables’ data.
- Validates index queries on each table.
- Checks that both index impl tables have the expected row counts and sample data.
This is a good cross-table coverage addition and matches the mutation pattern (3 rows per table).
ydb/library/ut/ut.h (2)
8-38: Test context abstraction is straightforward
NUnitTest::TTestContextBaseandTTestContextcleanly wrap source-location into a format string, and now rely onTSourceLocation::function_name()which is available in both real and dummy implementations. No issues here.
41-91: Fix concepts usage andtestCtxdetection in CTX macrosTwo related issues in
CTX_UNIT_FAIL_IMPL:
Missing
<concepts>includeThe macro uses
std::derived_fromandstd::convertible_to, which are declared in<concepts>. Relying on transitive includes is brittle and may fail in some TUs.
testCtxreferenced outside arequiresexpressionThe condition:
if constexpr (requires { testCtx; } && std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase> && requires { { testCtx.Format() } -> std::convertible_to<TString>; })evaluates
std::derived_from<decltype(testCtx), ...>in a normal constant-expression context. In any use-site wheretestCtxis not declared,decltype(testCtx)is ill-formed and the macro will fail to compile, instead of cleanly falling back to__LOCATION__/__PRETTY_FUNCTION__as intended.You can fix both by:
- Including
<concepts>.- Moving all uses of
testCtxinto a singlerequires-expression so the check is SFINAE-friendly.A concrete fix could look like:
-#include <library/cpp/testing/unittest/registar.h> - -#include <ydb/core/util/source_location.h> +#include <library/cpp/testing/unittest/registar.h> +#include <ydb/core/util/source_location.h> +#include <concepts> @@ -#define CTX_UNIT_FAIL_IMPL(R, M) \ - do { \ - ::TStringBuilder locationInfo; \ - if constexpr (requires { testCtx; } && \ - std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase> && \ - requires { { testCtx.Format() } -> std::convertible_to<TString>; }) \ - { \ - locationInfo << testCtx.Format(); \ - } else { \ - locationInfo << __LOCATION__ << ", " << __PRETTY_FUNCTION__; \ - } \ - ::NUnitTest::NPrivate::RaiseError(R, ::TStringBuilder() << R << " at " << locationInfo << ": " << M, true); \ - } while (false) +#define CTX_UNIT_FAIL_IMPL(R, M) \ + do { \ + ::TStringBuilder locationInfo; \ + if constexpr (requires { \ + testCtx; \ + requires std::derived_from<decltype(testCtx), ::NUnitTest::TTestContextBase>; \ + { testCtx.Format() } -> std::convertible_to<TString>; \ + }) \ + { \ + locationInfo << testCtx.Format(); \ + } else { \ + locationInfo << __LOCATION__ << ", " << __PRETTY_FUNCTION__; \ + } \ + ::NUnitTest::NPrivate::RaiseError(R, ::TStringBuilder() << R << " at " << locationInfo << ": " << M, true); \ + } while (false)This keeps the behavior the same when a compatible
testCtxis in scope, but ensures that in call sites withouttestCtxtheif constexprcondition simply evaluates to false and the macro falls back to the legacy location info. Please also verify that your toolchain is built with C++20 concepts enabled so<concepts>and these constraints are available.ydb/core/tx/schemeshard/schemeshard__operation_create_restore_incremental_backup.cpp (1)
420-426: Allow restore into index impl/private tables while keeping normal path hygieneConditionally skipping
IsCommonSensePath()whendstTablePath.IsInsideTableIndexPath(false)looks correct for enabling restores into index implementation tables while preserving the existing validation for regular user paths (NotEmpty,IsTable,NotUnderDeleting, etc.). Please double‑check that only internal incremental‑restore flows can target these private paths and that tests cover both cases: destination as a regular table and as an index impl table (created both inside and outside the same op).Also applies to: 566-572
ydb/core/tx/schemeshard/schemeshard__operation_copy_table.cpp (3)
227-231: Persisting source AlterVersion and syncing child indexes for CDC‑backed copiesBumping
srcTable->AlterVersion, persisting it viaPersistTableAlterVersion, and then callingNCdcStreamState::SyncChildIndexeswhenCdcPathIdis set ensures that a CDC‑initiated copy keeps the source table and its child indexes in version sync. This looks correct and nicely centralizes the “update all index entities” step; just confirm thatSyncChildIndexesis safe (no‑op) for tables without child indexes and that it uses the samedbtransaction to avoid partial version updates.
399-405: Relaxing source path “same‑operation” checks via AllowUnderSameOperationConditioning
NotUnderTheSameOperation/NotUnderOperationonTransaction.GetCreateTable().GetAllowUnderSameOperation()allows internal multi‑step flows (like incremental backup/restore or consistent copying) to reuseTCopyTableeven when the source path is already involved in the same top‑level tx, while still enforcing the full safety checks for normal users. This is reasonable, but please make sure that all callers that rely on this relaxation explicitly setAllowUnderSameOperation, and that there are no external/user‑visible code paths where a table is both being altered and copied in the same operation without a well‑defined snapshot.
808-810: Propagating OmitIndexes and skipping impl table copy sub‑operationsPassing
copying.GetOmitIndexes()into the nestedCreateTableschema and then, in the index loop, always creating the index metadata but skipping impl tableCreateTable/CopyTablesub‑operations whenOmitIndexesis true matches the idea that index data will be handled separately (e.g., via CDC‑driven consistent copies). The structure of the loop remains the same for theOmitIndexes == falsecase, so existing CopyTable behavior is preserved; please verify that in theOmitIndexes == trueflows there is a corresponding path that creates/restores index implementation tables before any reads rely on them.Also applies to: 866-869
ydb/core/tx/schemeshard/schemeshard_impl.h (1)
1198-1226: Index-restore helper declarations are consistent with existing incremental-restore APIThe new helpers for index discovery and restore (
DiscoverAndCreateIndexRestoreOperations,DiscoverIndexesRecursive,CreateSingleIndexRestoreOperation,FindTargetTablePath) fit well with the surrounding incremental-restore state and event handlers; signatures and ownership semantics look reasonable.ydb/core/tx/schemeshard/schemeshard__operation_consistent_copy_tables.cpp (1)
23-35: Index handling split between descriptor and inner CreateTable looks correctSetting
operation->SetOmitIndexes(true)inCopyTableTaskwhile still honoringdescr.GetOmitIndexes()inCreateConsistentCopyTablesis a good separation of concerns:
- The descriptor’s
OmitIndexescontinues to control whether indexes are processed at all in the explicit child loop.- The inner
CreateTableOmitIndexes flag preventsCreateCopyTablefrom re-walking children and duplicating index work, which is now handled explicitly (including CDC wiring).This matches the new CDC-aware index flow and shouldn’t affect callers outside
CreateConsistentCopyTables, sinceCopyTableTaskis local to this TU.Also applies to: 134-170
ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp (1)
66-73: Effective omitIndexes flag and main-table CDC setup look correct
omitIndexescombining collection-levelOmitIndexeswith incremental-configOmitIndexesis a sensible “effective” flag for this backup run.- Wiring this into
desc.SetOmitIndexes(omitIndexes)letsCreateConsistentCopyTablesdecide whether to process indexes at all, whileCopyTableTaskensures CreateTable itself doesn’t recurse into them.- For incremental backups, creating a StreamImpl CDC stream on the main table once, and then embedding
CreateSrcCdcStreaminto the descriptor so that the copy phase can perform the AtTable step, matches the intended three-phase CDC strategy.Also applies to: 87-96, 100-132
ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1)
612-621: Parallel index restore ops are wired into the global incremental restore state as expectedThe call to
DiscoverAndCreateIndexRestoreOperationsfromCreateIncrementalRestoreOperationuses the sameoperationId,backupName, andbcPathas table restores and wires index operations intoIncrementalRestoreStatesviaIncrementalRestoreOperationToState/TxIdToIncrementalRestoreinCreateSingleIndexRestoreOperation. That keeps index impl restores tracked uniformly with table restores and allows shard progress/finalization to work without special cases.No correctness issues spotted in this integration.
ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (2)
253-316:SyncIndexEntityVersionwrapper looks correct; monotonic update prevents version regressionsThe new public-style
SyncIndexEntityVersion:
- Checks
Indexes.contains(indexPathId)and early-returns if the index doesn’t exist.- Only updates
AlterVersioniftargetVersion > oldIndexVersion, preventing downgrades during races.- Persists via
PersistTableIndexAlterVersion, clears caches, and publishes to SchemeBoard with the suppliedoperationId.This is the right behavior for callers like
UpdateTableVersionand index-finalization logic that need to “catch up” index metadata to a table/impl version without risking rollback.No issues spotted; just ensure the declaration in
schemeshard_cdc_stream_common.hmatches this signature exactly (includingNIceDb::TNiceDb&).
466-471: Version update + SyncChildIndexes sequencing inTProposeAtTable::HandleReplyIn
TProposeAtTable::HandleReplyyou now:
- Build
versionCtxand callUpdateTableVersion(versionCtx, table, OperationId, context, db).- If
versionCtx.IsContinuousBackupStream && !versionCtx.IsIndexImplTable, callSyncChildIndexeswith the newtable->AlterVersion.- Persist table alter version, clear caches, publish, update tx state.
This ordering is sensible: the parent table’s
AlterVersionis finalized before child index entities are synced, and all metadata is persisted and published afterwards.Given that
UpdateTableVersionalready covers some continuous-backup paths (main table and impl tables), the explicitIsContinuousBackupStreamcheck here primarily covers the CDC-stream-specific case. The only caveat is the potential doubleSyncChildIndexesdiscussed earlier; apart from that, the flow looks good.ydb/core/tx/schemeshard/schemeshard__operation_incremental_restore_finalize.cpp (3)
13-38: State machine extension toConfigurePartsis coherentChanging
NextStateto:
Waiting -> ConfigurePartsConfigureParts -> ProposePropose -> Doneand mapping both
WaitingandConfigurePartstoTConfigurePartsinSelectStateFuncmakes sense—the first entry into the operation runsConfigureParts, and replays from persistedWaitingstate will also land there.This nicely inserts an index-related ALTER stage before finalization without disturbing the rest of the lifecycle.
224-268: CallingSyncIndexSchemaVersionsbefore releasing path states is the right placeIn
TFinalizationPropose::ProgressState, you now:
- Retrieve the finalize proto.
- Call
SyncIndexSchemaVersions(finalize, context)to finalize impl-table alters and bump index entity versions.- Only then release affected path states and perform cleanup.
This ordering ensures:
- Data shards have applied the
ALTERfrom theConfigurePartsstage.- Schemeshard’s in-memory and persisted metadata (impl tables + indexes) are consistent before paths transition out of
EPathStateIncomingIncrementalRestore.That sequencing looks correct and avoids exposing “half-bumped” schema versions to clients.
374-394: Final cleanup respects shard progress and state; no issues
PerformFinalCleanup:
- Marks the incremental restore state as
Completed(without immediately deleting the in-memory entry).- Erases the corresponding entry from
LongIncrementalRestoreOps.- Calls
CleanupMappingsto remove allTxIdToIncrementalRestoreandIncrementalRestoreOperationToStatemappings for thisoperationId.Combined with earlier deletion of
IncrementalRestoreShardProgressandIncrementalRestoreStaterows, this leaves no dangling bookkeeping for the restore while still allowing high-level operation listings to reflect completion as intended.Looks consistent with the rest of the incremental restore machinery.
| // Create CDC StreamImpl for index implementation tables (before copying starts) | ||
| // Store CDC info in the descriptor so CreateConsistentCopyTables can create AtTable and PQ parts | ||
| // Only do this for incremental backups | ||
| if (incrBackupEnabled && !omitIndexes) { | ||
| const auto tablePath = sPath; | ||
|
|
||
| // Iterate through table's children to find indexes | ||
| for (const auto& [childName, childPathId] : tablePath.Base()->GetChildren()) { | ||
| auto childPath = context.SS->PathsById.at(childPathId); | ||
|
|
||
| // Skip non-index children (CDC streams, etc.) | ||
| if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex) { | ||
| continue; | ||
| } | ||
|
|
||
| // Skip deleted indexes | ||
| if (childPath->Dropped()) { | ||
| continue; | ||
| } | ||
|
|
||
| // Get index info and filter for global sync only | ||
| auto indexInfo = context.SS->Indexes.at(childPathId); | ||
| if (indexInfo->Type != NKikimrSchemeOp::EIndexTypeGlobal) { | ||
| continue; | ||
| } | ||
|
|
||
| // Get index implementation table (the only child of index) | ||
| auto indexPath = TPath::Init(childPathId, context.SS); | ||
| Y_ABORT_UNLESS(indexPath.Base()->GetChildren().size() == 1); | ||
| auto [implTableName, implTablePathId] = *indexPath.Base()->GetChildren().begin(); | ||
|
|
||
| auto indexTablePath = indexPath.Child(implTableName); | ||
|
|
||
| // Create CDC stream on index impl table (before copying starts) | ||
| NKikimrSchemeOp::TCreateCdcStream indexCdcStreamOp; | ||
| // Set table name to just the name since we're passing the full path as tablePath parameter | ||
| indexCdcStreamOp.SetTableName(implTableName); | ||
| auto& indexStreamDescription = *indexCdcStreamOp.MutableStreamDescription(); | ||
| indexStreamDescription.SetName(streamName); | ||
| indexStreamDescription.SetMode(NKikimrSchemeOp::ECdcStreamModeUpdate); | ||
| indexStreamDescription.SetFormat(NKikimrSchemeOp::ECdcStreamFormatProto); | ||
|
|
||
| NCdc::DoCreateStreamImpl(result, indexCdcStreamOp, opId, indexTablePath, false, false); | ||
|
|
||
| // Store CDC stream info in the descriptor's map | ||
| // Key is the impl table name, value is the CDC stream config | ||
| (*desc.MutableIndexImplTableCdcStreams())[implTableName].CopyFrom(indexCdcStreamOp); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix inverted condition when invalidating caches for index impl tables
In the “Also invalidate cache for index impl tables” block:
for (const auto& [childName, childPathId] : sPath.Base()->GetChildren()) {
auto childPath = context.SS->PathsById.at(childPathId);
if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {
auto indexInfo = context.SS->Indexes.find(childPathId);
...the childPath->PathType != EPathTypeTableIndex check prevents the body from running for actual index children, so the subsequent lookup in Indexes is effectively dead. To actually invalidate caches for index impl tables, the condition should be the opposite:
- if (childPath->PathType != NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {
+ if (childPath->PathType == NKikimrSchemeOp::EPathTypeTableIndex && !childPath->Dropped()) {Without this fix, describe-path caches for index impl tables won’t be cleared or published here, which can delay metadata convergence, even though other paths may eventually refresh them.
Also applies to: 183-203
🤖 Prompt for AI Agents
ydb/core/tx/schemeshard/schemeshard__operation_backup_backup_collection.cpp
around lines 133-181 and 183-203: the loop that should invalidate caches for
index implementation tables uses an inverted condition (checks
childPath->PathType != EPathTypeTableIndex), which skips real index children and
makes the subsequent Indexes lookup dead; change the condition to require
childPath->PathType == NKikimrSchemeOp::EPathTypeTableIndex and ensure you still
skip dropped children (childPath->Dropped()), then proceed to find the index in
context.SS->Indexes by childPathId, get the implementation table child, and
invalidate/publish the describe-path cache for that impl table as originally
intended.
010417d to
f3e9a0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
INCREMENTAL_BACKUP_INDEX_PROGRESS.md (1)
1-260: This appears to be intermediate investigation documentation—clarify purpose and consolidate with related files.This file thoroughly documents the investigation journey, but it reads as working notes rather than finalized documentation. It includes failed attempts (#1 and #2) and indicates Phase 2 analysis is "in progress." Several points need clarification:
- Are the issues documented here (race conditions, version mismatches) actually fixed in this PR's implementation, or are they still pending?
- The "Next Steps (To Be Implemented)" section on line 75 proposes version synchronization—has this been implemented in the code, or is it future work?
- This file significantly overlaps with
VERSION_SYNC_PLAN.mdandINCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md(all cover similar root causes and solution attempts).Recommendations:
- Consolidate these three investigation documents into a single
IMPLEMENTATION_STATUS.mdorINVESTIGATION_NOTES.mdwith clear sections for: Problem → Root Cause → Attempted Solutions → Final Resolution → Remaining Issues.- Add explicit status indicators (✅ Fixed / ⏳ Pending / ❌ Blocked) for each issue.
- If this PR includes fully working implementation, move detailed failed attempts to a separate
INVESTIGATION_HISTORY.mdto keep the main documentation focused on what was delivered.- Add language specifiers to code blocks (lines 5, 66, 110, 117, 123, 153) to resolve markdown linting warnings.
VERSION_SYNC_PLAN.md (1)
1-280: Clarify whether this is a research plan that was executed or a proposal for future work.This document presents a detailed 6-phase research and fix plan, but it's unclear whether these steps were actually executed or if they're intended as a roadmap. Key concerns:
- Phase Execution Status: Phases 1-3 outline diagnostic logging and code search tasks. Were these completed? If so, include the findings.
- Phase 5 Options: Three implementation approaches (A: Update Index Metadata, B: Increment Main Table Version, C: Send Explicit Version Update) are presented, but there's no indication of which was selected or why.
- Document Scope: This overlaps significantly with
INCREMENTAL_BACKUP_INDEX_PROGRESS.md(both analyze schema version mismatch) andINCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md(both propose solutions).Recommendations:
- If this plan was executed: Update the document to summarize actual findings from each phase rather than presenting hypothetical investigation steps.
- If this is future work: Move it out of the PR and into an issue/planning document; the final PR should document what was implemented, not what might be done.
- Add language specifiers to all code blocks (e.g.,
```bash,```cpp) to resolve markdown linting errors.- Consolidate with related investigation files to avoid maintaining three parallel documents of similar content.
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (2)
420-512: Consolidate detailed technical summary with other investigation files; consider moving to single implementation summary document.Lines 420-512 provide an excellent "How It Works" walkthrough of the three-phase CDC creation flow. However, similar explanations appear in other files. Key redundancies:
- Problem description (lines 3-17): Also in INCREMENTAL_BACKUP_INDEX_PROGRESS.md and VERSION_SYNC_PLAN.md
- Root cause analysis (lines 18-29): Overlaps with other files
- Solution approaches and failures (lines 30-150): Similar coverage in INCREMENTAL_BACKUP_INDEX_PROGRESS.md
Recommendations:
- Create a single
IMPLEMENTATION_COMPLETE.mdorDESIGN_SUMMARY.mdthat consolidates:
- Final problem statement
- Root cause (single version, not repeated across 3 files)
- Actual implementation approach that was chosen
- How it works (the excellent walkthrough in lines 420-512)
- Verification results
- Move detailed failure analysis and investigation steps to optional
INVESTIGATION_HISTORY.mdfor maintainer reference- This keeps active documentation focused and reduces maintenance burden.
485-485: Fix markdown linting issues: add language specifiers and resolve duplicate headings.Several markdown linting warnings need resolution:
- Missing language specifiers on code blocks: Lines with
```should specify language (e.g.,```cpp,```protobuf,```bash). Affects lines with code examples throughout.- Duplicate heading "Testing": Appears at lines 485 and 563—consolidate or rename one.
- Duplicate heading "Status" (implicit): Line 548 has heading, and status is scattered through document.
Fix:
- Add language identifiers to all code blocks
- Rename second "Testing" section to "Test Case Details" or "Test Scenarios"
- Consolidate "Current Status" section (line 548) with the overall "Status" context
Also applies to: 513-513, 548-548, 563-563
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md(1 hunks)INCREMENTAL_BACKUP_INDEX_PROGRESS.md(1 hunks)VERSION_SYNC_PLAN.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
INCREMENTAL_BACKUP_INDEX_PROGRESS.md
[style] ~101-~101: Consider a different adjective to strengthen your wording.
Context: ...ion - Phase 2 Analysis In Progress: Deep investigation of version synchronizatio...
(DEEP_PROFOUND)
[style] ~242-~242: Consider using “incomplete” to avoid wordiness.
Context: ...g**: SyncChildIndexes runs async; maybe not complete before query? 2. Code path: Maybe S...
(NOT_ABLE_PREMIUM)
[style] ~243-~243: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...omplete before query? 2. Code path: Maybe SyncChildIndexes isn't being called for...
(REP_MAYBE)
[style] ~244-~244: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...d for backup CDC streams? 3. Cache: Maybe KQP's cache isn't being invalidated for...
(REP_MAYBE)
[style] ~245-~245: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ted for the index? 4. Wrong target: Maybe sync is updating wrong index or wrong v...
(REP_MAYBE)
🪛 markdownlint-cli2 (0.18.1)
VERSION_SYNC_PLAN.md
5-5: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
110-110: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
117-117: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
123-123: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
153-153: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
INCREMENTAL_BACKUP_INDEX_PROGRESS.md
5-5: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
110-110: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
117-117: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
123-123: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
153-153: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md
7-7: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
45-45: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
73-73: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
169-169: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
180-180: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
513-513: Multiple headings with the same content
(MD024, no-duplicate-heading)
548-548: Multiple headings with the same content
(MD024, no-duplicate-heading)
563-563: Multiple headings with the same content
(MD024, no-duplicate-heading)
🔇 Additional comments (1)
INCREMENTAL_BACKUP_INDEX_FIX_SUMMARY.md (1)
1-50:⚠️ CRITICAL: Documentation claims race condition is unfixed and Approach 3 is not implemented—verification confirms this is accurate but contradicts PR intent.The document accurately reflects the current codebase state. Verification of the implementation shows:
schemeshard__operation_incremental_restore_finalize.cpp: Only uses asynchronousPublishToSchemeBoard()(fire-and-forget), noPublishAndWaitPublication()orTEvCompletePublicationhandlers present.- Barrier mechanism exists elsewhere: Other operations (
TAlterExtSubDomain,TDropIndexedTable,TDropTable,TMoveTableIndex,TMoveTable,TMoveSequence) successfully use thePublishAndWaitPublication()barrier pattern with properTEvCompletePublicationhandlers and wait logic.- Approach 3 is genuinely not implemented: No
RegisterWaitPublication(), noCountWaitPublication()checks, no barrier state logic in the incremental restore finalize operation.This means:
- The race condition (lines 1-40) described in the document remains unresolved in this PR.
- Document accurately reflects reality—it is not outdated.
- Either the PR is incomplete and Approach 3 implementation is missing, or the PR objectives have changed.
Clarification needed: Is this PR intended to deliver only the analysis and recommendations without the actual fix implementation? If the fix is expected to be included,
TFinalizationProposerequires the same barrier pattern that already works in other operations: addPublishAndWaitPublication()calls, implementHandleReply(TEvCompletePublication::TPtr&), and checkCountWaitPublication()inProgressState().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (14)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md (2)
164-164: Fix grammar: missing hyphen in "operation structure" phrase.Line 164 mentions "More complex operatio" (cut off in static analysis hint). This appears to be a markdown formatting issue where text wrapped awkwardly. Review the full context around line 164 to ensure proper phrasing and clarity.
255-255: Add missing comma after year in date.Line 255 should read:
**Date:** January 20, 2025, **Status:**(or restructure) per standard date formatting in formal documents.DATASHARD_VERSION_VALIDATION_ANALYSIS.md (3)
20-44: Excellent code analysis, but add language identifiers to code blocks for clarity.Lines 20-44 and throughout the document contain bare code fences without language identifiers. Change:
Y_VERIFY_DEBUG_S(...)to:
Y_VERIFY_DEBUG_S(...)This is a minor markdown hygiene issue that improves readability in documentation systems. The static analysis tool flagged this at multiple locations (lines 155, 226, 235, 367, etc.).
175-210: Query version check analysis is clear but could address one edge case.The analysis of query execution (section 3) correctly identifies that queries use SchemeBoard, not direct datashard versions. However, consider adding:
- Behavior when SchemeBoard cache is stale (e.g., new node joining cluster)
- Whether version checks are transactional or can change mid-query
These edge cases don't seem to affect the "converge later" safety argument, but explicitly addressing them would strengthen the analysis.
270-301: Test coverage analysis identifies real gaps, but recommendations could be more specific.Section 5.3 recommends adding tests for "version consistency validation after CDC completes" and "concurrent queries during CDC operation". These are good, but consider specifying:
- How to trigger concurrent queries (query thread vs. CDC thread timing)
- What specific assertions to check (exact version numbers, not just "no errors")
- Whether existing test framework supports this (or needs new instrumentation)
Without these specifics, the testing recommendations may be harder for developers to implement.
cdc_version_sync_design.md (3)
1-100: Well-structured design document with clear problem statement and executive summary.The document systematically presents the race condition problem and evaluates 5 strategies. Strengths:
- Problem statement with concrete timeline showing race (lines 29-38)
- Clear explanation of why current sync fails (lines 241-281)
- Five distinct strategies with pros/cons
Consider: Is this document primarily for architects/decision-makers or for implementers? If implementers, consider adding a quick "Start here" section directing readers to the recommended strategy (Strategy E) rather than requiring them to read all 5 strategies first.
302-406: Add language identifiers to all code fences per markdown best practices.Lines 312, 330, 336, 367, 376, etc. contain bare code blocks. Should be:
// Example codeAlso consider: Are these pseudocode or actual C++ that will compile? If pseudocode, consider marking explicitly (e.g.,
// Pseudocode:comment) to set reader expectations.
1000-1099: Recommendation for Strategy E is clear but could explicitly address the "Why not A?" question.Section 7 recommends Strategy E but doesn't explicitly state why Strategy A wasn't chosen. The cost/benefit analysis (section 4) favors E on performance, but the decision section could strengthen this by saying something like:
"Strategy A was rejected not due to safety concerns (both are safe) but due to unnecessary complexity. Strategy E achieves the same correctness with 50% less code and zero additional latency. Strategy A would only be reconsidered if testing reveals N² worst-case writes are a practical problem (not expected with typical tables)."
This would help readers understand the decision was optimization-based, not safety-based.
ss2.md (2)
76-92: Example use of "Step 1-3" headers: Use markdown heading instead of bold emphasis.Lines 76-92 show headers using bold (
**Step 1: ConstructParts Factory**) instead of markdown headings (### Step 1: ConstructParts Factory). This is flagged by static analysis at line 76 (MD036). While both render, using proper headings improves document structure and navigation.
251-281: Parts Vector restoration explanation is clear but could add real-world troubleshooting guidance.The explanation at lines 246-281 clearly shows that Parts vector is NOT restored after crash, only active TTxState entries are restored. This is important for developers to understand.
Consider adding a troubleshooting section: "If your operation restarts after crash and parts don't resume, check that all context is stored in TTxState, not just in part-local memory."
strategy_e_implementation_research.md (4)
1-65: Comprehensive implementation research, but documents needs structure improvements for readability.This is an excellent, detailed implementation guide for Strategy E lock-free helping coordination. The amount of content (1770 lines) is impressive and thorough. However, consider:
- Table of Contents: Add at top for navigation (readers need to know what's covered)
- Quick Start: Add pointer for developers who want to just implement (vs. understand every detail)
- Prerequisites: Section on what readers should know (lock-free algorithms? YDB internals?)
These additions would make the document more usable without reducing its depth.
30-46: Verification section is valuable but could be more explicit about verification methodology.Lines 30-46 show "VERIFICATION STATUS: All file locations, classes, and functions verified" but doesn't explain how verification was done. Was this:
- Manual code inspection?
- Automated grep/ast-grep search?
- Reference to specific commit hashes?
Adding this would help readers assess confidence level and reproduce verification if needed.
238-294: Critical "Why Current Sync Fails" section is excellent for understanding the problem.Lines 240-294 clearly explain the race condition in existing code. The timeline format (T1-T8) makes the problem concrete and visual. This is excellent documentation that should definitely be preserved in final design doc.
However, one improvement: Add a "This bug manifests as:" section explaining what users see (SCHEME_CHANGED errors? Version mismatches in query results?). Currently the document explains the technical issue but not the user-visible symptoms.
1088-1150: Step-by-step integration guide is extremely helpful, but could be more specific about testing.Phase 4 (lines 1164-1202) recommends adding diagnostic tests but leaves specifics to the reader. Consider providing:
- Actual test code template they can fill in
- Specific assertions (not just "verify all at same version")
- How to simulate concurrent CDC operations in test
Without these specifics, developers may write tests that don't actually catch race conditions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md(1 hunks)DATASHARD_VERSION_VALIDATION_ANALYSIS.md(1 hunks)cdc_version_sync_design.md(1 hunks)ss2.md(1 hunks)strategy_e_implementation_research.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
cdc_version_sync_design.md
[grammar] ~932-~932: Ensure spelling is correct
Context: ...kup ### 6.2 Key Data Structures **TOperation:** File: ydb/core/tx/schemeshard/schemeshard__operation.h Lines: 10-157 struct TOperation { const TTxId TxId; TVectorISubOperation::TPtr Parts; // Operation parts (parallel execution) TSet DoneParts; // Completed parts THashMap<TString, TSet> Barriers; // Barrier name -> blocked parts // ... }; **TTableInfo:** File: ydb/core/tx/sche...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
strategy_e_implementation_research.md
[style] ~92-~92: This phrase is redundant. Consider writing “point” or “time”.
Context: ...perties #### Progress Guarantee At any point in time, at least one operation can make progre...
(MOMENT_IN_TIME)
[grammar] ~1320-~1320: Use a hyphen to join words.
Context: ...siblings, normal helping case 3. Self catching up: ``` [DEBUG] HelpSyncSibli...
(QB_NEW_EN_HYPHEN)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md
[grammar] ~164-~164: Ensure spelling is correct
Context: ...f concerns -
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~255-~255: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ...:** AI Analysis Date: January 20, 2025 Status: ✅ COMPLETE - All findings...
(MISSING_COMMA_AFTER_YEAR)
🪛 markdownlint-cli2 (0.18.1)
DATASHARD_VERSION_VALIDATION_ANALYSIS.md
155-155: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
226-226: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
235-235: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
367-367: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
ss2.md
76-76: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
251-251: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
262-262: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
388-388: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
cdc_version_sync_design.md
24-24: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
30-30: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
67-67: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
89-89: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
114-114: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
126-126: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
152-152: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
203-203: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
217-217: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
228-228: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
235-235: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
250-250: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
265-265: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
286-286: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
312-312: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
561-561: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
833-833: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
877-877: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
888-888: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
901-901: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
915-915: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
924-924: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
933-933: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
947-947: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
957-957: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
967-967: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
983-983: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
988-988: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1004-1004: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1020-1020: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1148-1148: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1158-1158: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1177-1177: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1189-1189: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
strategy_e_implementation_research.md
74-74: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
131-131: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
132-132: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
141-141: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
142-142: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
152-152: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
153-153: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
267-267: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
317-317: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
376-376: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
804-804: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
839-839: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
850-850: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
891-891: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
929-929: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
936-936: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
942-942: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1090-1090: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1104-1104: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1125-1125: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1149-1149: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1164-1164: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1203-1203: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1211-1211: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1302-1302: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1311-1311: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1321-1321: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1388-1388: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1390-1390: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1409-1409: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1411-1411: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1431-1431: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1451-1451: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1463-1463: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1507-1507: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1523-1523: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1572-1572: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1752-1752: Bare URL used
(MD034, no-bare-urls)
1753-1753: Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (9)
CDC_VERSION_SYNC_RESEARCH_SUMMARY.md (1)
1-256: Executive summary is well-structured but document depends on verification of critical datashard claims.This document makes definitive safety claims about datashard behavior (e.g.,
Y_VERIFY_DEBUG_Sbeing DEBUG-only). While section 2.2 of DATASHARD_VERSION_VALIDATION_ANALYSIS.md provides supporting analysis, this document would benefit from either:
- Inline citations to the specific datashard code locations (e.g., line 1725 of datashard.cpp)
- A "Verified" section showing the evidence for each critical claim
The structure and recommendations (sections 4-7) are clear and actionable. Consider this document ready for circulation to architects as a summary, but verify the datashard behavior claims are accurate before committing to the implementation strategy.
DATASHARD_VERSION_VALIDATION_ANALYSIS.md (1)
86-99: Critical claim about Y_VERIFY_DEBUG_S requires one-time verification.This section claims that
Y_VERIFY_DEBUG_Sat line 1725 is a "DEBUG-only check" with no-op in release builds. This is the linchpin of your entire safety argument. Before finalizing the implementation strategy:
- Confirm Y_VERIFY_DEBUG_S macro definition in YDB codebase
- Verify it uses
#ifdef, not runtime flag- Document the macro location for future reference
Once verified, this is excellent evidence. If it's wrong, the entire strategy may need revision.
cdc_version_sync_design.md (1)
737-818: Strategy E lock-free helping implementation looks sound, but needs one clarification.The pseudocode at line 749-817 shows the helping pattern clearly. One question for validation:
At line 787, the code checks
if (maxVersion > myVersion)and updates self. But what if another operation just incremented myVersion to match maxVersion between when we computed max and when we try to update?Idempotency should handle this (writing same value twice is safe), but explicitly stating "idempotent because we use assignment (=) not increment (+=)" would clarify this for readers.
ss2.md (1)
1-50: Excellent foundational documentation, but unclear if this belongs in the incremental restore PR.This document appears to be comprehensive SchemeShard architecture documentation. It's well-written and provides valuable context, but the PR objectives focus specifically on "implement incremental restore for global indexes."
Questions:
- Is this new documentation being added as supporting material for the PR, or existing documentation being refactored?
- Should this be in a separate documentation PR?
- Is it necessary for understanding the incremental restore changes, or is it background material?
If essential to the incremental restore changes, it should stay. If background material, consider moving to a separate documentation PR to keep this PR focused.
strategy_e_implementation_research.md (5)
407-492: REPLACEMENT CODE section is well-structured but needs verification before use.Lines 407-492 show the replacement code for TProposeAtTable::HandleReply. This looks good but should be verified:
- Does
BuildTableVersionContextstill exist at these lines?- Is the
IsPartOfContinuousBackupflag available?- Have the persistence/publish functions changed?
Before implementation, compare this code against current YDB codebase to ensure it will compile. Consider adding a code review checkpoint here.
509-722: HelpSyncSiblingVersions implementation is detailed but pseudocode vs production quality unclear.Lines 509-722 provide extensive pseudocode for the helping synchronization function. The logic is clear and well-commented. However:
- Error handling: What happens if
PathsByIddoesn't contain an expected entry? Current code hasat()which throws, but maybe should usecontains()check?- Logging overhead: Extensive logging is good for debugging, but 10+ LOG_* calls per operation may impact production performance. Consider making some DEBUG-level or conditional.
- Actual implementation: Is this pseudocode meant to be copy-pasted, or is it for understanding the algorithm?
Clarify expectations for developers using this pseudocode.
1220-1277: Unit test examples provided are valuable but use fictional APIs.Lines 1226-1276 show unit test pseudocode using
TTestEnv,CreateTableWithIndexes,TriggerIncrementalBackup, etc. These look like they might be real YDB test APIs, but without seeing the actual test files, I can't verify they exist.Before implementation:
- Verify these test helper functions exist
- Check if any need to be created
- Ensure test environment matches actual YDB test framework
These test examples are excellent templates but need validation against actual YDB test infrastructure.
1320-1347: Debugging strategies section is practical and well-organized.Lines 1320-1347 show realistic log output patterns and how to interpret them. This is excellent production debugging guidance. The three scenarios (successful sync, helping sync, self catching up) are clear and concrete.
This section should be added to internal documentation/runbooks so on-call engineers can use it when troubleshooting version sync issues.
1572-1661: Conclusion clearly states this is NOT optional - it's a required fix for existing race condition.Lines 1572-1661 make a strong case that Strategy E implementation is necessary (not optional) because:
- Existing code has proven race condition
- Current
SyncImplTableVersionis fundamentally broken- Strategy E is the fix
This framing is important and correct. The conclusion correctly positions this as fixing a bug, not adding a feature. However, verify that the "proven race condition" claim is well-documented in the earlier sections for readers who skip to the conclusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (2)
571-596: Consider making version diagnostics conditional.The diagnostics block outputs detailed schema version information to stderr unconditionally. While helpful for debugging, this creates verbose test output.
Consider wrapping this in a conditional based on a test verbosity flag or only enabling it on test failure:
- // Add version diagnostics after backup - { - Cerr << "========== VERSION DIAGNOSTICS AFTER BACKUP ==========" << Endl; - - auto describeTable = Ls(runtime, edgeActor, "/Root/TableWithIndex"); - if (!describeTable->ResultSet.empty() && describeTable->ResultSet[0].TableId) { - Cerr << "Main table SchemaVersion: " << describeTable->ResultSet[0].TableId.SchemaVersion << Endl; + // Add version diagnostics after backup for debugging + if constexpr (false) { // Enable for debugging + Cerr << "========== VERSION DIAGNOSTICS AFTER BACKUP ==========" << Endl; + + auto describeTable = Ls(runtime, edgeActor, "/Root/TableWithIndex");
3091-3091: Consider explicit CDC stream initialization verification.The tests use fixed sleep times (5 seconds) after full backups to allow CDC streams to initialize. While
SimulateSleepis deterministic in tests, consider whether there are explicit synchronization points that could be used instead of time-based waits.This is not critical since the current approach works, but for improved clarity and maintainability, consider:
- Adding a helper function to verify CDC stream creation status
- Using event-based waiting similar to
WaitTxNotificationExample pattern:
// Instead of: SimulateSleep(server, TDuration::Seconds(5)); // Consider: WaitForCdcStreamReady(server, edgeActor, "/Root/TableWithIndex");This would make the tests more self-documenting and robust.
Also applies to: 3329-3329
ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (2)
117-274: Consider breaking down the function for improved maintainability.
HelpSyncSiblingVersionsis 159 lines long and performs 6 distinct steps. While the current implementation is clear with step-by-step comments, extracting helper functions for collection (steps 1-2) and update (steps 4-5) logic could improve testability and maintainability.For example:
CollectIndexFamily()- Steps 1-2ComputeMaxVersion()- Step 2UpdateIndexEntities()- Steps 4-5
373-373: Minor: Redundant namespace qualification.The call
NCdcStreamState::SyncIndexEntityVersion()is already within theNCdcStreamStatenamespace (line 9), so the namespace prefix is redundant. You can callSyncIndexEntityVersion()directly for cleaner code.- NCdcStreamState::SyncIndexEntityVersion(childPathId, targetVersion, operationId, context, db); + SyncIndexEntityVersion(childPathId, targetVersion, operationId, context, db);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp(3 hunks)ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1)
ydb/core/tx/schemeshard/ut_backup_collection/ut_backup_collection.cpp (2)
SetupLogging(14-17)SetupLogging(14-14)
🔇 Additional comments (9)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (4)
3111-3160: Well-designed verification of OmitIndexes behavior.The test properly uses dynamic backup directory resolution and DescribePath verification to confirm that:
- The main table incremental backup was created correctly
- The index metadata directory does NOT exist when OMIT_INDEXES is set
This approach is more robust than relying on query failures and provides clear verification of the expected behavior.
3162-3274: Comprehensive test coverage for index restore.The new test suite provides excellent coverage of incremental restore with indexes:
- Single index (BasicIndexIncrementalRestore)
- Multiple indexes (MultipleIndexesIncrementalRestore)
- Cross-shard scenarios (IndexDataVerificationIncrementalRestore)
- Sequential incremental backups (MultipleIncrementalBackupsWithIndexes)
- Multiple tables (MultipleTablesWithIndexesIncrementalRestore)
Each test properly verifies:
- Table data correctness
- Index functionality via VIEW queries
- Index implementation table data integrity
- Proper handling of updates and deletes
2929-2933: Verify CDC operation counts remain stable.The test asserts specific CDC operation counts (2 DELETEs, 1 INSERT) parsed from metadata. While appropriate for verifying exact CDC behavior, these assertions are tightly coupled to the current implementation.
If CDC operation generation logic changes in the future (e.g., optimizations that combine operations or different tombstone strategies), these assertions may need updates. Consider documenting why these specific counts are expected:
// Expected counts based on: // - 2 DELETEs: tombstones for old values at age 25 (Bob's original) and 26 (Bob's intermediate update) // - 1 INSERT: new value for Alice2 UNIT_ASSERT_EQUAL_C(counts.Deletes, 2, "Should have 2 DELETE operations (tombstones for age 25 and 26)");
3162-3766: Excellent test coverage for index incremental restore feature.The test suite comprehensively validates the implementation described in the PR objectives:
- Index discovery: Tests verify that indexes are properly discovered and backed up in
__ydb_backup_meta/indexes/directory structure- Restore operations: Each test verifies indexes work after restore by:
- Querying via VIEW to confirm index functionality
- Directly querying index implementation tables to verify data integrity
- Checking row counts match expected values
- Various scenarios: Coverage includes single indexes, multiple indexes, covering indexes, multi-shard tables, sequential incremental backups, and multiple tables
The tests properly verify both the functional behavior (indexes work) and the internal state (impl tables have correct data), which gives confidence that the restore logic correctly reconstructs the index structures.
ydb/core/tx/schemeshard/schemeshard__operation_common_cdc_stream.cpp (5)
2-2: LGTM - Include for new CDC stream common functionality.The new include provides the necessary forward declarations and API exposure for the Strategy E implementation.
13-114: LGTM - Well-structured helper functions for continuous backup detection.The helper functions provide clear separation of concerns for detecting continuous backup streams and building context. The logic for traversing parent/grandparent relationships is correct.
279-342: LGTM - Enhanced logging for version synchronization.The extensive logging additions will help diagnose version synchronization issues. The core logic correctly prevents version downgrades with the
targetVersion > oldIndexVersioncheck.
492-546: LGTM - Strategy E implementation with proper flow separation.The logic correctly separates three paths:
- Index impl table CDC (lines 497-531): Increments own version atomically, then uses lock-free helping to sync sibling index entities
- Other tables (lines 532-541): Simple version increment
- Main table CDC (lines 544-546): Additionally syncs child index entities
The separation ensures that index impl tables handle their own version updates (avoiding datashard schema mismatches), while index entities are synchronized by whoever observes the need.
115-274: Verify operation execution model to assess race condition validity.The review comment correctly identifies a theoretical race condition, but the actual risk depends on YDB's operation scheduling model. The code shows:
- All operations process through message handlers (HandleReply)
- YDB uses an actor model with tablet executors
- If operations on the same parent table serialize through the tablet's message queue, no concurrency exists
- If operations can interleave, the version comparison alone doesn't prevent lost updates
The version check (
if (index->AlterVersion < maxVersion)) prevents downgrades but not lost updates in a concurrent scenario:
- Operation A reads
v=5, Operation B readsv=5- Both compute
maxVersion=6, both write6- A concurrent third operation's version increment gets lost
Questions for verification:
- Do all CDC operations on indexes of the same parent table serialize through a single schemeshard tablet's message queue, or can they execute concurrently on different threads?
- Are there any async update paths or background tasks that could modify
IndexesorTablesstructures whileHelpSyncSiblingVersionsexecutes?- Does the existing test suite verify concurrent CDC operations on multiple indexes of the same table, and do they pass?
| Y_ABORT_UNLESS(indexPath->GetChildren().size() == 1); | ||
| auto [implTableName, implTablePathId] = *indexPath->GetChildren().begin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace hard assertion with graceful validation.
The Y_ABORT_UNLESS will crash the schemeshard if an index entity doesn't have exactly one child. This assumption could be violated during concurrent modifications, index deletion, or corrupted state.
Apply this diff to handle the edge case gracefully:
// Get index impl table (single child of index entity)
auto indexPath = context.SS->PathsById.at(childPathId);
- Y_ABORT_UNLESS(indexPath->GetChildren().size() == 1);
+ if (indexPath->GetChildren().size() != 1) {
+ LOG_WARN_S(context.Ctx, NKikimrServices::FLAT_TX_SCHEMESHARD,
+ "Index entity does not have exactly one child, skipping"
+ << ", indexPathId: " << childPathId
+ << ", childCount: " << indexPath->GetChildren().size()
+ << ", at schemeshard: " << context.SS->SelfTabletId());
+ continue;
+ }
auto [implTableName, implTablePathId] = *indexPath->GetChildren().begin();Committable suggestion skipped: line range outside the PR's diff.
Problem:
Implementation:
Index Discovery and Restore:
Schema Version Synchronization:
Testing:
Files Modified:
This ensures indexes are properly restored and remain queryable.### Changelog entry
...
Changelog category
Description for reviewers
...