18 changes: 17 additions & 1 deletion packages/envd/internal/api/init.go
@@ -9,6 +9,7 @@ import (
"net/http"
"net/netip"
"os/exec"
"strings"
"sync"
"time"

@@ -231,10 +232,25 @@ func (a *API) SetData(ctx context.Context, logger zerolog.Logger, data PostInitJ
return nil
}

var nfsOptions = strings.Join([]string{
// wait for data to be sent to proxy server before returning.
// async might cause issues if the sandbox is shut down suddenly.
"sync",

NFS sync option not removed despite PR discussion agreement

Medium Severity

The sync NFS mount option appears to still be present despite the author agreeing in the PR discussion to remove it ("Despite the performance gains, I'll remove that one"). The sync option forces synchronous writes, which can significantly degrade NFS throughput (2-3x slower than the default async), partially negating the performance gains from the rsize/wsize buffer increases. This contradicts the PR's stated goal of improving performance by 5-10%.


Contributor Author

lol.

  • claude, locally: "we're missing async, that'll make it faster"
  • claude, in pr: "async will cause data issues, use sync"
  • cursor, in pr: "you still have sync, I thought you wanted to switch to async"

data corruption overrules speed, we're done here.


"rsize=1048576", // 1 MB read buffer
"wsize=1048576", // 1 MB write buffer
"mountproto=tcp", // nfs proxy only supports tcp
"mountport=2049", // nfs proxy only supports mounting on port 2049
"proto=tcp", // nfs proxy only supports tcp
"port=2049", // nfs proxy only supports mounting on port 2049
"nfsvers=3", // nfs proxy is nfs version 3
"noacl", // no reason for acl in the sandbox
}, ",")

func (a *API) setupNfs(ctx context.Context, nfsTarget, path string) {
commands := [][]string{
{"mkdir", "-p", path},
-	{"mount", "-v", "-t", "nfs", "-o", "fg,hard,mountproto=tcp,mountport=2049,proto=tcp,port=2049,nfsvers=3,noacl", nfsTarget, path},
+	{"mount", "-v", "-t", "nfs", "-o", "fg,hard," + nfsOptions, nfsTarget, path},
}

for _, command := range commands {
Comment on lines +235 to 256

🔴 This PR introduces sync on a hard-mounted NFS volume — a dangerous combination where every write blocks until the NFS proxy acknowledges it and hard means there is no timeout. If the NFS proxy becomes unreachable after the mount is established (crash, network partition, restart), any sandbox process writing to the volume will hang indefinitely in uninterruptible sleep (D state) with no way to kill it. A safer approach that still preserves write durability is sync + soft + timeo=N + retrans=M, which allows the kernel to eventually return an error to the application instead of blocking forever.

Extended reasoning...

Bug: hard + sync NFS mount creates indefinite I/O hang risk

What the bug is and how it manifests

Before this PR, the mount command used fg,hard,mountproto=tcp,... with no explicit async/sync flag, defaulting to NFS async mode. With async, writes complete as soon as they are buffered in the client page cache; the kernel retries delivery to the server in the background. This PR introduces an explicit sync option, which changes write semantics entirely: every write() syscall now blocks until the NFS proxy sends an acknowledgement. Combined with the pre-existing hard mount option (which means the client retries indefinitely and never returns an error to the application), any process performing write I/O on the mount point will block indefinitely if the NFS proxy becomes unavailable.

The specific code path

In packages/envd/internal/api/init.go (lines 235-256), nfsOptions is constructed with "sync" as the first entry. The setupNfs function then passes "fg,hard," + nfsOptions to the mount command. The hard flag was present before this PR; the sync flag is new. Together they produce a mount with both hard and sync active.

Why existing code does not prevent it

There is no timeout, watchdog, or sandbox-level health check that would detect and recover from a process stuck in D state. The NFS hard option is explicitly designed to prevent the kernel from ever returning EIO or ETIMEDOUT to the application. Adding sync on top means the kernel cannot complete the write optimistically by buffering; it must wait for the server. There is no timeo, retrans, or intr option to bound the wait.

Impact

If the NFS proxy crashes, is restarted, or a network partition occurs after a volume is mounted — all plausible in a sandbox environment — every sandbox process that subsequently attempts a write will enter uninterruptible sleep (D state). These processes cannot be killed with SIGKILL. The only recovery is a full VM termination. This turns a transient infrastructure hiccup into a permanent sandbox freeze.

Step-by-step proof of the hang

  1. Sandbox starts; setupNfs mounts the proxy volume with fg,hard,sync,...
  2. A sandbox user process opens a file on the mounted path and calls write().
  3. The Linux NFS client sends a WRITE RPC to the proxy and blocks the process, waiting for the NFS ACK (sync requires server acknowledgment before returning to userspace).
  4. At this moment, the NFS proxy crashes or becomes unreachable.
  5. The NFS client retries the WRITE RPC repeatedly per the hard semantics — indefinitely, with no timeout.
  6. The user process remains stuck in D state, unkillable. Any other process that subsequently writes to the same mount also gets stuck.
  7. The only recovery is to terminate the VM.

How to fix it

Replace hard with soft and add explicit timeo and retrans values: e.g., sync,soft,timeo=100,retrans=3,rsize=1048576,wsize=1048576,.... (timeo is expressed in tenths of a second, so timeo=100 gives each RPC attempt a 10-second timeout.) With soft, after retrans failed retries the kernel returns EIO to the application instead of retrying forever. sync can be kept to preserve the data-durability intent. The soft + timeo + retrans combination is the most portable and operationally safe choice.
