-
Notifications
You must be signed in to change notification settings - Fork 4
fsync where it matters #593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds explicit file synchronization (Sync) and defensive size verification after writing/copying c1z files in both local and S3 managers; errors are wrapped with context when sync/stat/size checks fail. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/dotc1z/manager/s3/s3.go (1)
66-79: Consider removing redundant explicit Close() calls.The explicit
_ = f.Close()calls on error paths (lines 66, 73, 77) are redundant sincedefer f.Close()at line 51 will execute when the function returns. WhileClose()is idempotent and this doesn't cause bugs, it adds confusion about resource management strategy.If the goal is to release resources immediately on error, the Go idiom is:
- if err := f.Sync(); err != nil { - _ = f.Close() - return fmt.Errorf("failed to sync temp file: %w", err) - } + if err := f.Sync(); err != nil { + return fmt.Errorf("failed to sync temp file: %w", err) + }Apply similar changes to lines 73 and 77. The defer will handle cleanup in all cases.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/dotc1z/manager/local/local.go(2 hunks)pkg/dotc1z/manager/s3/s3.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: go-test (1.25.2, ubuntu-latest)
- GitHub Check: go-test (1.25.2, windows-latest)
- GitHub Check: go-lint
🔇 Additional comments (3)
pkg/dotc1z/manager/local/local.go (2)
66-72: LGTM! Sync ensures data durability before use.The Sync() call correctly ensures that buffered data is flushed to disk before the temporary file is read. This prevents the ZFS ARC race condition described in the PR objectives.
157-161: LGTM! Defensive Sync prevents incomplete writes.The Sync() call ensures the destination file is fully written before SaveC1Z returns, preventing downstream code from reading incomplete data.
Note: Unlike s3.go (lines 70-79), this doesn't include a defensive size verification comparing the bytes written to the actual file size. While local file operations are typically more reliable than network transfers, consider whether adding similar verification here would be beneficial for consistency.
pkg/dotc1z/manager/s3/s3.go (1)
56-68: Excellent! Sync and size verification enhance data integrity.The changes correctly:
- Capture bytes written for verification
- Sync to flush buffers before subsequent reads
- Wrap errors with clear context
The defensive size check at lines 70-79 is particularly valuable for detecting buffered write issues in network transfers.
|
|
||
| if r != nil { | ||
| _, err = io.Copy(f, r) | ||
| written, err := io.Copy(f, r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call io.copy() in local manager too. Should we error if the bytes written in there don't match? I think we should either check the size vs bytes written and error in both cases, or error in neither case, not just the s3 case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/dotc1z/manager/local/local.go (1)
176-180: Sync is correct; consider adding size verification for consistency.The sync call properly addresses the ZFS ARC race condition. However,
copyFileToTmpincludes defensive size verification (comparing source, written, and destination sizes), while this function only syncs without verifying the bytes written match the source.For consistency and equal protection, consider adding size verification here as well.
Apply this diff to add defensive size verification:
tmpFile, err := os.Open(l.tmpPath) if err != nil { return err } defer tmpFile.Close() + + // Get source file size for verification + tmpStat, err := tmpFile.Stat() + if err != nil { + return fmt.Errorf("failed to stat temp file: %w", err) + } + expectedSize := tmpStat.Size() dstFile, err := os.Create(l.filePath) if err != nil { return err } defer dstFile.Close() size, err := io.Copy(dstFile, tmpFile) if err != nil { return err } // CRITICAL: Sync to ensure data is written before function returns. // This is especially important on ZFS ARC where writes may be cached. if err := dstFile.Sync(); err != nil { return fmt.Errorf("failed to sync destination file: %w", err) } + + // Verify file size matches what we wrote (defensive check) + dstStat, err := dstFile.Stat() + if err != nil { + return fmt.Errorf("failed to stat destination file: %w", err) + } + if dstStat.Size() != size { + return fmt.Errorf("file size mismatch: wrote %d bytes but file is %d bytes", size, dstStat.Size()) + } + if size != expectedSize { + return fmt.Errorf("copy size mismatch: expected %d bytes from temp but copied %d bytes", expectedSize, size) + }
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/dotc1z/manager/local/local.go(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: go-test (1.25.2, windows-latest)
- GitHub Check: go-test (1.25.2, ubuntu-latest)
🔇 Additional comments (1)
pkg/dotc1z/manager/local/local.go (1)
62-91: Excellent defensive I/O with three-way size verification.The implementation correctly:
- Captures source size before copy
- Syncs before verification (ensuring write buffers are flushed)
- Verifies temp file size matches both io.Copy's return value and the source size
This three-way check (expectedSize == written == stat.Size()) provides strong guarantees against data corruption or incomplete writes, which is critical for the ZFS ARC race condition described in the PR.
Add file syncs to prevent ZFS ARC race condition by making sure that files are fully written before being read.
Add Sync() after io.Copy() in copyToTempFile() (s3 manager) - required for immediate reads
Add Sync() after io.Copy() in copyFileToTmp() (local manager) - required for immediate reads
Add Sync() after io.Copy() in SaveC1Z() (local manager) - defensive
On ZFS ARC, files read immediately after write can see incomplete data from write buffers. Explicit sync ensures data is flushed before reads.
Guessing this will adds ~1-5ms latency per file operation, negligible compared to network/compression overhead.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.