Conversation
benbjohnson left a comment
@corylanou I'm not sure we can rely on mtime as its granularity may be too large. You could have a WAL write, then a sync, and then another WAL write all within the window of the mtime granularity (such that the mtime before the sync and after the sync are the same).
Are we implementing a file watcher on the WAL? I can't remember if that ever made it in there.
Implemented WAL change detection without relying on filesystem mtime granularity. Instead of mtime, DB.monitor() now compares the WAL-index (-shm) mxFrame value (plus WAL size and WAL header). mxFrame advances on each commit even when SQLite reuses WAL space, so writes are not missed on filesystems with coarse mtime granularity.

Also fixed a related restore/compaction edge case exposed by the integration suite: when SQLite bumps the post-commit DB size by multiple pages but only some of the new pages appear in the WAL, we now encode zero-filled pages for the missing newly-added page numbers so compaction can always rebuild a valid snapshot.

Re: your question: there still isn't an fsnotify-based watcher on the WAL itself (only directory watching in cmd/litestream/directory_watcher.go); this remains polling-based via the monitor loop.
Summary
Use WAL mtime in the monitor loop’s cheap change detection and add a regression test for mtime-only WAL changes.
Problem
Issue #1037 reports replication stalling when WAL size and header remain unchanged even though writes continue. SQLite can reuse WAL space, so size/header can stay constant while WAL mtime changes on each write. The current optimization skips Sync() in that case, so replication silently stalls.
Solution
Track WAL mtime alongside size and header. Skip Sync() only if size, header, and mtime are unchanged.
Scope
In scope:
Not in scope:
Test Plan
go test -run TestDB_Monitor_ -v ./...

Related