fix: prevent duplicate rows during MSSQL CDC backfill by waiting for the async capture agent to catch up #843
Open
vishalm0509 wants to merge 13 commits into staging
…ix/mssql_cdc_cursor
…ix/mssql_cdc_cursor
vikaxsh
reviewed
Mar 7, 2026
vikaxsh
reviewed
Mar 14, 2026
    if !hasPermission {
        logger.Warnf("VIEW DATABASE STATE permission not granted; LSN may be lagging behind the transaction log")
        return m.currentMaxLSN(ctx)
Collaborator
Can we consolidate this split fallback logic in one place?
Co-authored-by: vishal-datazip <vishal@datazip.io>
…/olake into fix/mssql_cdc_cursor
Description
sys.fn_cdc_get_max_lsn() can lag behind the transaction log at the start of a sync. This causes the backfill to read rows that later appear again in the CDC change stream, producing duplicates.
The fix waits on sys.dm_cdc_log_scan_sessions until the capture agent completes a non-throttled scan (tran_count < maxtrans), ensuring the CDC max LSN reflects all committed transactions.
If the VIEW DATABASE STATE / VIEW DATABASE PERFORMANCE STATE permission is missing, the sync falls back to the previous behavior with a warning.
Fixes # (issue)
Type of change
How Has This Been Tested?
Screenshots or Recordings
Documentation
Related PR's (If Any):