Skip to content

Conversation

@rasifr
Copy link
Member

@rasifr rasifr commented Feb 4, 2026

The SpockApplyProgress struct written to WAL at commit time was not including remote_insert_lsn, causing it to default to 0. After crash recovery, WAL replay would restore progress with remote_insert_lsn = 0, breaking lag tracking in spock.lag_tracker.

The fix includes the current remote_insert_lsn value from shared memory when building the WAL record. This value was already updated by UpdateWorkerStats() earlier in the same transaction.

Clean shutdowns were not affected as resource.dat saves the complete shared memory state.

Added TAP test 016_crash_recovery_progress.pl to verify remote_insert_lsn survives crash recovery.

@coderabbitai
Copy link

coderabbitai bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

COMMIT handling now fills remote_insert_lsn into the per-transaction SpockApplyProgress before WAL write and shmem update so it is durable across crashes. A new TAP test validates that remote_insert_lsn persists through a forced crash and subsequent recovery.

Changes

Cohort / File(s) Summary
Progress Tracking Enhancement
src/spock_apply.c
Populate remote_insert_lsn into the SpockApplyProgress during per-commit progress updates so it is written to WAL/shmem for crash-recovery persistence.
Crash Recovery Test
tests/tap/t/016_crash_recovery_progress.pl
New TAP test that creates a 2-node cluster, replicates rows, captures remote_insert_lsn, kills and restarts the subscriber, and asserts the LSN persists after recovery.
Test Schedule Update
tests/tap/schedule
Added 016_crash_recovery_progress to the test schedule so the new test is executed with the suite.

Poem

🐰
I nibble logs by moonlit stream,
I guard each LSN and dream.
When sparks and crashes wake the night,
My progress holds — steady and light.
Hooray for tests that prove it right!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main fix: ensuring remote_insert_lsn persists through crash recovery, which directly aligns with the core changeset.
Description check ✅ Passed The description is directly related to the changeset, explaining the bug, the fix, and test coverage with appropriate technical detail.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch task/SPOC-421/fix-remote-insert-lsn-crash-recovery

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

The SpockApplyProgress struct written to WAL at commit time was not
including remote_insert_lsn, causing it to default to 0. After crash
recovery, WAL replay would restore progress with remote_insert_lsn = 0,
breaking lag tracking in spock.lag_tracker.

The fix includes the current remote_insert_lsn value from shared memory
when building the WAL record. This value was already updated by
UpdateWorkerStats() earlier in the same transaction.

Clean shutdowns were not affected as resource.dat saves the complete
shared memory state.

Added TAP test 016_crash_recovery_progress.pl to verify remote_insert_lsn
survives crash recovery.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@rasifr rasifr force-pushed the task/SPOC-421/fix-remote-insert-lsn-crash-recovery branch from 24172e9 to 60497f0 Compare February 5, 2026 15:38
@mason-sharp mason-sharp merged commit a43f4b8 into main Feb 5, 2026
10 checks passed
@mason-sharp mason-sharp deleted the task/SPOC-421/fix-remote-insert-lsn-crash-recovery branch February 5, 2026 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants