-
Notifications
You must be signed in to change notification settings - Fork 0
WIP #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
WIP #17
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the 📝 WalkthroughSummary by CodeRabbitRelease Notes
WalkthroughAdds index incremental-restore support: discovers backed-up global indexes, maps backup-relative index paths to restore targets, creates per-index restore operations in parallel with table restores, registers index ops in unified in-progress tracking, and finalizes activation atomically once all ops complete. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant TSchemeShard
participant CreateInc as CreateIncrementalRestoreOperation
participant DiscoverIdx as DiscoverAndCreateIndexRestoreOperations
participant CreateIdx as CreateSingleIndexRestoreOperation
participant Tracking as InProgressOperations
User->>TSchemeShard: Request Incremental Restore
TSchemeShard->>CreateInc: Prepare table restore operations
CreateInc->>Tracking: Register table ops
Note over CreateInc: Parallel index discovery & creation (new)
CreateInc->>DiscoverIdx: Discover backed-up indexes
DiscoverIdx->>DiscoverIdx: Scan indexes metadata recursively
loop For each discovered index
DiscoverIdx->>CreateIdx: Create per-index restore op
CreateIdx->>CreateIdx: Validate target, locate indexImplTable, form restore request
CreateIdx->>Tracking: Register index op
end
rect rgba(200,220,255,0.25)
Note over Tracking: Unified tracking for table & index operations
end
Tracking->>TSchemeShard: All ops complete → Finalize (atomic activation)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
INCREMENTAL_RESTORE_INDEXES_PLAN.md(1 hunks)ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp(1 hunks)ydb/core/tx/schemeshard/schemeshard_impl.h(1 hunks)ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp(1 hunks)
ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp
Outdated
Show resolved
Hide resolved
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1)
2522-3124: Consider factoring common restore scaffoldingThe five new
Y_UNIT_TESTblocks all replay the same backup/restore choreography (create tables + indexes → run full backup → run one or more incremental backups → drop tables → RESTORE → repeat the same verification queries), with only the table shape or number of indexes varying. Could we pull the shared setup/restore/assert steps into helpers (e.g. aRunIncrementalIndexScenariothat takes table/index definitions and expected rows)? That would collapse a few hundred duplicated lines and make future scenarios cheaper to add.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
INCREMENTAL_RESTORE_INDEXES_PLAN.md(1 hunks)ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp(1 hunks)ydb/core/tx/schemeshard/schemeshard_impl.h(1 hunks)ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp(1 hunks)
🔇 Additional comments (3)
ydb/core/tx/schemeshard/schemeshard_impl.h (1)
1197-1225: Header stays in lockstepThanks for keeping the private declarations in the header in sync with the new helpers — that keeps navigation between declaration/definition straightforward.
INCREMENTAL_RESTORE_INDEXES_PLAN.md (1)
43-208: Implementation plan hits the right checkpointsAppreciate that the plan calls out the phased rollout (“discover → helper → progress → finalize → tests”). That matches the code changes and makes the review a lot easier to follow.
ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1)
611-860: Great to see indexes wired into incremental restore flowReally happy to see
DiscoverAndCreateIndexRestoreOperationssharing the same state accounting as the table path (especially adding the index shard set intoExpectedShards). That lets the existing progress tracking/finalization machinery work without special-casing index jobs.
This document describes the architecture and implementation approach for extending incremental restore functionality to support global indexes, complementing the backup support already implemented in branch feature/incr-backup/indexes-support-003.
This commit adds support for restoring global indexes during incremental restore operations, extending the existing table-only restore functionality. Key changes: 1. Index Discovery (DiscoverAndCreateIndexRestoreOperations): - Scans __ydb_backup_meta/indexes directory in backup - Discovers index backups for each table - Validates OmitIndexes flag from backup config 2. Index Restore Operations (CreateSingleIndexRestoreOperation): - Creates parallel restore operations for index implementation tables - Validates indexes exist on target tables - Filters for EIndexTypeGlobal indexes only (matches backup behavior) - Tracks operations same as table operations for unified completion 3. Path Mapping (FindTargetTablePath): - Maps relative backup paths to target table paths - Uses backup collection's ExplicitEntryList for mapping Architecture: - Tables and indexes restore IN PARALLEL (no artificial dependencies) - Both tracked in same InProgressOperations set - Atomic activation when ALL operations complete (via existing finalization) - Gracefully handles missing indexes and OmitIndexes flag The implementation reuses existing restore operation types and tracking mechanisms, requiring no changes to proto definitions or finalization logic. Related to: feature/incr-backup/indexes-support-003
This commit adds 5 comprehensive test cases covering all aspects of incremental restore for global indexes: 1. BasicIndexIncrementalRestore: - Tests single global index restore - Verifies index functionality after restore - Tests querying via index after restoration 2. MultipleIndexesIncrementalRestore: - Tests table with 3 global indexes - Verifies all indexes restore correctly - Validates parallel index restore functionality 3. IndexDataVerificationIncrementalRestore: - Multi-shard table with index - Tests INSERT, UPDATE, DELETE operations across shards - Verifies index data integrity after restore - Compares before/after restore index query results 4. MultipleIncrementalBackupsWithIndexes: - Tests sequence of 3 incremental backups - Each backup has different operations (add, update, delete) - Verifies indexes restored correctly across all increments 5. MultipleTablesWithIndexesIncrementalRestore: - Tests backup collection with 2 tables - Each table has different global index - Verifies parallel restore of multiple table+index combinations Test Coverage: - Basic index restore ✓ - Multiple indexes per table ✓ - Multi-shard index tables ✓ - Index data verification ✓ - Multiple incremental sequences ✓ - Multiple tables with indexes ✓ - Parallel completion ✓ (implicitly tested in all tests) - Index queries after restore ✓ Total: 506 lines of test code added
Enhanced all 5 index restore tests to directly query index implementation
tables, providing stronger verification that indexes are correctly restored.
Changes per test:
1. BasicIndexIncrementalRestore:
- Query indexImplTable directly to verify structure
- Count rows in index impl table (should be 4)
- Verify specific indexed_col values (100, 250, 300, 400)
2. MultipleIndexesIncrementalRestore:
- Query all 3 index impl tables (index1, index2, index3)
- Verify row counts for each (all should have 4 rows)
- Spot-check index3 impl table data
3. IndexDataVerificationIncrementalRestore:
- Query age_index impl table directly
- Verify correct data after INSERT/UPDATE/DELETE operations
- Confirm deleted records NOT in index impl table
- Verify row count (should be 4 after deletions)
4. MultipleIncrementalBackupsWithIndexes:
- Query idx impl table after 3 incremental backups
- Verify old values removed (100, 200)
- Verify new/updated values present (250, 300, 400)
- Confirm final row count (should be 3)
5. MultipleTablesWithIndexesIncrementalRestore:
- Query both idx1 and idx2 impl tables
- Verify row counts for both (should be 3 each)
- Spot-check data in both index impl tables
Benefits:
- Stronger verification that index tables were actually restored
- Tests verify index table structure is intact
- Validates index data matches expected state
- Ensures index impl tables are queryable after restore
- Confirms parallel restore of multiple indexes
Pattern used: `/Root/{table}/{index}/indexImplTable`
Key insight: Index columns appear first, then main table key columns
Total: +98 lines of enhanced test verification
Addresses two critical PR review comments:
1. Fix nested table path discovery (CRITICAL BUG):
- Previous: Only looked 1 level deep in __ydb_backup_meta/indexes/
- Impact: Tables in subdirectories (e.g. /Root/Dir/Foo) had indexes silently skipped
- Fix: Implemented recursive DiscoverIndexesRecursive() that:
* Traverses the full directory tree under indexes/
* Accumulates relative path as it descends (Root -> Root/Dir -> Root/Dir/Foo)
* Matches against target tables at each level
* Creates index operations when match found
Example:
Backup structure: __ydb_backup_meta/indexes/Root/Dir/Foo/index1
Old behavior: Only saw "Root", failed to match, skipped all indexes
New behavior: Descends to "Root/Dir/Foo", matches table, restores indexes
2. Fix path matching to prevent false positives:
- Previous: itemPath.EndsWith(relativeTablePath)
- Problem: /Root/FooBar matched "Bar", /Root/users_table matched "table"
- Fix: Only accept exact match OR suffix preceded by "/"
* itemPath == relativeTablePath
* itemPath.EndsWith("/" + relativeTablePath)
Changes:
- Added DiscoverIndexesRecursive() helper for recursive directory traversal
- Updated FindTargetTablePath() to use exact matching logic
- Updated DiscoverAndCreateIndexRestoreOperations() to call recursive helper
Files modified:
- schemeshard_impl.h: Added DiscoverIndexesRecursive() declaration
- schemeshard_incremental_restore_scan.cpp: Implemented recursive discovery
Both issues validated and confirmed as correct feedback.
d520a69 to
0197f82
Compare
Problem: - After restoring index impl tables, schema version mismatch occurred - Index metadata expected version X but impl table had version X+1 - Tests failed with: "schema version mismatch during metadata loading" Root Cause: - Index impl table restore incremented the impl table's schema version - Parent index metadata AlterVersion was not updated - This caused a mismatch when querying the restored table Solution: - Added AlterTableIndex operation after each index impl table restore - Mirrors CDC stream pattern for avoiding additional alterVersion - AlterTableIndex operation syncs index metadata with impl table version - Sets index state to Ready, which increments the index AlterVersion The fix creates two operations per index restore: 1. Restore impl table (increments impl table schema version) 2. AlterTableIndex at parent table (syncs index metadata AlterVersion) Reference pattern from schemeshard__operation_create_cdc_stream.cpp: When working on index impl tables, create AlterTableIndex operation to avoid schema version mismatch between index and impl table.
d298e46 to
252b8a1
Compare
Problem: - Index impl table restore incremented impl table's AlterVersion - Parent index metadata AlterVersion was not synchronized - Caused "schema version mismatch" errors during table queries Root Cause: - Index restore operations were created independently via Send() - No coordination between impl table restore and index metadata update - AlterTableIndex operations executed in wrong order or not at all Solution: - Added AlterTableIndex to result vector in CreateRestoreMultipleIncrementalBackups - Check if destination table parent IsTableIndex() (index impl table pattern) - If yes, add coordinated AlterTableIndex operation to result - Both operations execute as part of same parent operation This mirrors the CDC backup pattern where: 1. Operations are added to result vector for coordination 2. Index impl table operations include corresponding index metadata updates 3. Pattern: check workingDirPath.IsTableIndex(), add AlterTableIndex For restore, we check dstTablePath.Parent().IsTableIndex() since we're targeting the impl table directly, not the index path. Files changed: - schemeshard__operation_create_restore_incremental_backup.cpp: Added AlterTableIndex to result vector - schemeshard_incremental_restore_scan.cpp: Removed incorrect Send-based approach
Problem: - Index impl table restore incremented impl table's AlterVersion - Parent index metadata AlterVersion was not synchronized - Caused 'schema version mismatch' errors during table queries Root Cause: - Used Parent().IsTableIndex() which returned false for index impl tables - For index impl table path /Root/Table/Index/indexImplTable: - Parent().IsTableIndex() checks if parent PathType is TableIndex - This was incorrectly returning false Solution: - Changed check to dstTablePath.IsInsideTableIndexPath() - IsInsideTableIndexPath() correctly detects tables inside index structures - It skips the impl table level and checks if ancestor is a TableIndex - Now correctly identifies index impl tables and creates AlterTableIndex This ensures index metadata AlterVersion stays in sync with impl table.
This document describes the architecture and implementation approach for
extending incremental restore functionality to support global indexes,
complementing the backup support already implemented in branch
feature/incr-backup/indexes-support-003.### Changelog entry
...
Changelog category
Description for reviewers
...