Add filesystem zero copy during pause via reflink (XFS, Btrfs)#2249
Add filesystem zero copy during pause via reflink (XFS, Btrfs)#2249
Conversation
PR SummaryMedium Risk Overview Written by Cursor Bugbot for commit 0af2e06. This will update automatically on new commits. Configure here. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb3920427b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
I also added measurements of how long parts of the rootfs pause (flush, copy, etc.) take to the trace attributes. |
matthewlouisbrockman
left a comment
There was a problem hiding this comment.
ran tests to make sure no degredation #2252 that all pass
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: No fallback when range copy unsupported
- Added fallback to regular I/O (io.CopyN) when CopyFileRange fails, allowing diff export to work across different filesystems and mounts.
- ✅ Fixed: Hard failure on unsupported sync syscall
- Made SyncFileRange non-fatal by ignoring its errors since it's an optimization, not a correctness requirement.
Or push these changes by commenting:
@cursor push d89a067423
Preview (d89a067423)
diff --git a/packages/orchestrator/pkg/sandbox/block/cache.go b/packages/orchestrator/pkg/sandbox/block/cache.go
--- a/packages/orchestrator/pkg/sandbox/block/cache.go
+++ b/packages/orchestrator/pkg/sandbox/block/cache.go
@@ -4,6 +4,7 @@
"context"
"errors"
"fmt"
+ "io"
"math"
"math/rand"
"os"
@@ -129,10 +130,8 @@
// Explicit mmap flush is not necessary, because the kernel will handle that as part of the copy_file_range syscall.
// Calling sync_file_range marks the range for writeback and starts it early.
- err = unix.SyncFileRange(src, 0, c.size, unix.SYNC_FILE_RANGE_WRITE)
- if err != nil {
- return nil, fmt.Errorf("error syncing file: %w", err)
- }
+ // This is an optimization and not required for correctness, so we ignore errors.
+ _ = unix.SyncFileRange(src, 0, c.size, unix.SYNC_FILE_RANGE_WRITE)
buildStart := time.Now()
builder := header.NewDiffMetadataBuilder(c.size, c.blockSize)
@@ -150,6 +149,7 @@
dst := int(out.Fd())
var writeOffset int64
var totalRanges int64
+ useFallback := false
copyStart := time.Now()
for r := range BitsetRanges(diffMetadata.Dirty, diffMetadata.BlockSize) {
@@ -157,27 +157,46 @@
remaining := int(r.Size)
readOffset := r.Start
- // The kernel may return short writes (e.g. capped at MAX_RW_COUNT on non-reflink filesystems),
- // so we loop until the full range is copied. The offset pointers are advanced by the kernel.
- for remaining > 0 {
- // On XFS this uses reflink automatically.
- n, err := unix.CopyFileRange(
- src,
- &readOffset,
- dst,
- &writeOffset,
- remaining,
- 0,
- )
- if err != nil {
- return nil, fmt.Errorf("error copying file range: %w", err)
+ if !useFallback {
+ // The kernel may return short writes (e.g. capped at MAX_RW_COUNT on non-reflink filesystems),
+ // so we loop until the full range is copied. The offset pointers are advanced by the kernel.
+ for remaining > 0 {
+ // On XFS this uses reflink automatically.
+ n, err := unix.CopyFileRange(
+ src,
+ &readOffset,
+ dst,
+ &writeOffset,
+ remaining,
+ 0,
+ )
+ if err != nil {
+ // Fall back to regular I/O if copy_file_range fails (e.g., cross-filesystem or unsupported).
+ useFallback = true
+ break
+ }
+
+ if n == 0 {
+ return nil, fmt.Errorf("copy_file_range returned 0 with %d bytes remaining", remaining)
+ }
+
+ remaining -= n
}
+ }
- if n == 0 {
- return nil, fmt.Errorf("copy_file_range returned 0 with %d bytes remaining", remaining)
+ if useFallback && remaining > 0 {
+ // Use regular I/O as fallback
+ if _, err := f.Seek(readOffset, io.SeekStart); err != nil {
+ return nil, fmt.Errorf("error seeking source file: %w", err)
}
-
- remaining -= n
+ if _, err := out.Seek(writeOffset, io.SeekStart); err != nil {
+ return nil, fmt.Errorf("error seeking destination file: %w", err)
+ }
+ n, err := io.CopyN(out, f, int64(remaining))
+ if err != nil {
+ return nil, fmt.Errorf("error copying data: %w", err)
+ }
+ writeOffset += n
}
}This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
0ca57c5 to
d80183b
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
In nbd.ExportDiff() we were returning early in case something was going wrong downstream without calling cache.Close(). This causes a leak of mmap'ed memory. Make sure we call cache.Close() in those cases. cache.Close() itself, might fail. In these cases, log a warning and return the initial error. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
SyncFileRange is an optimization to let the kernel start wrting back data from the mmap to the disk on XFS. If that fails, we can rely on the CopyFileRange to actually performs the operation. Just log a warning on SyncFileRange failures and hard fail on CopyFileRange. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
When copy_file_range fails with EXDEV, EOPNOTSUPP, or ENOSYS (e.g. cross-device copies or filesystems that don't support the syscall), fall back to a regular io.Copy for the remainder of the export. Other errors still hard-fail. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
mmap.MapRegion expects the library's own constants (mmap.RDWR), not raw unix PROT_* flags. Passing unix.PROT_READ|unix.PROT_WRITE (=3) happened to work due to numerical coincidence, but is fragile. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
|
I've rebased the tests from here: #2252 and let them re-run. Tests are passing. |


Check/add tests or code for the following things:
CopyFileRangefails (due to copying across disks or because the system call is not supported), we fall back to normal copy