Skip to content

Commit 7897cdd

Browse files
Fix foreground commands blocking on background processes (#185)
* Fix foreground commands blocking on background processes Foreground commands (exec, listFiles) would hang until background processes completed. This was caused by wait without arguments in the foreground pattern, which waits for ALL shell child processes. Bash waits for process substitutions automatically, so the explicit wait was unnecessary. Removing it fixes the hang. * Add changeset * Fix race condition in foreground exec pattern Process substitutions write asynchronously - bash returns when substitution processes close, but writes may still be buffered. With large output (base64-encoded files), the log could be incomplete when read, causing flaky CI failures. Replace process substitutions with direct file redirects. Bash waits for redirects to complete, ensuring logs are fully written before exit codes are published.
1 parent 17d2a4d commit 7897cdd

File tree

4 files changed

+82
-19
lines changed

4 files changed

+82
-19
lines changed

.changeset/neat-drinks-own.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
---
2+
"@cloudflare/sandbox": patch
3+
---
4+
5+
Fix foreground commands blocking on background processes

docs/SESSION_EXECUTION.md

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -14,22 +14,20 @@ This document explains how the container session executes commands reliably whil
1414
### Foreground (`exec`)
1515

1616
- Runs in the main bash shell so state persists across commands.
17-
- Uses bash process substitution to prefix stdout/stderr inline and append to the per-command log file.
18-
- After the command returns, we `wait` to ensure process-substitution consumers finish writing to the log before we write the exit code file.
19-
- Why process substitution (not FIFOs):
20-
- Foreground previously used FIFOs + background labelers, which can race on silent commands because FIFO open/close ordering depends on cross-process scheduling.
21-
- Process substitution keeps execution local to the main shell and avoids FIFO semantics entirely.
17+
- Writes stdout/stderr to temporary files, then prefixes and merges them into the log.
18+
- Bash waits for file redirects to complete before continuing, ensuring the log is fully written before the exit code is published.
19+
- This avoids race conditions from process substitution buffering where log reads could happen before writes complete.
2220

2321
Pseudo:
2422

2523
```
2624
# Foreground
27-
{ command; } \
28-
> >(while read; printf "\x01\x01\x01%s\n" "$REPLY" >> "$log") \
29-
2> >(while read; printf "\x02\x02\x02%s\n" "$REPLY" >> "$log")
25+
{ command; } > "$log.stdout" 2> "$log.stderr"
3026
EXIT_CODE=$?
31-
# Ensure consumers have drained
32-
wait 2>/dev/null
27+
# Prefix and merge into main log
28+
(while read line; do printf "\x01\x01\x01%s\n" "$line"; done < "$log.stdout") >> "$log"
29+
(while read line; do printf "\x02\x02\x02%s\n" "$line"; done < "$log.stderr") >> "$log"
30+
rm -f "$log.stdout" "$log.stderr"
3331
# Atomically publish exit code
3432
echo "$EXIT_CODE" > "$exit.tmp" && mv "$exit.tmp" "$exit"
3533
```
@@ -97,3 +95,5 @@ mkfifo "$sp" "$ep"
9795
- Why not tee? Tee doesn’t split stdout/stderr into separate channels with stable ordering without extra plumbing; our prefixes are simple and explicit.
9896
- Is process substitution portable?
9997
- It is supported by bash (we spawn bash with `--norc`). The container environment supports it; if portability constraints change, we can revisit.
98+
- Why use temp files instead of process substitution for foreground?
99+
Process substitutions run asynchronously - bash returns when the substitution processes close, but their writes to the log file may still be buffered. With large output (e.g., base64-encoded files), the log file can be incomplete when we try to read it. Using direct file redirects ensures bash waits for all writes to complete before continuing, eliminating this race condition.

packages/sandbox-container/src/session.ts

Lines changed: 12 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -722,17 +722,17 @@ export class Session {
722722
// FOREGROUND PATTERN (for exec)
723723
// Command runs in main shell, state persists!
724724

725-
// FOREGROUND: Avoid FIFOs to eliminate race conditions.
726-
// Use bash process substitution to prefix stdout/stderr while keeping
727-
// execution in the main shell so session state persists across commands.
725+
// FOREGROUND: Write stdout/stderr to temp files, then prefix and merge.
726+
// This ensures bash waits for all writes to complete before continuing,
727+
// avoiding race conditions when reading the log file.
728728

729729
if (cwd) {
730730
const safeCwd = this.escapeShellPath(cwd);
731731
script += ` # Save and change directory\n`;
732732
script += ` PREV_DIR=$(pwd)\n`;
733733
script += ` if cd ${safeCwd}; then\n`;
734-
script += ` # Execute command with prefixed streaming via process substitution\n`;
735-
script += ` { ${command}; } < /dev/null > >(while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x01\\x01\\x01%s\\n' "$line"; done >> "$log") 2> >(while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x02\\x02\\x02%s\\n' "$line"; done >> "$log")\n`;
734+
script += ` # Execute command, redirect to temp files\n`;
735+
script += ` { ${command}; } < /dev/null > "$log.stdout" 2> "$log.stderr"\n`;
736736
script += ` EXIT_CODE=$?\n`;
737737
script += ` # Restore directory\n`;
738738
script += ` cd "$PREV_DIR"\n`;
@@ -741,13 +741,16 @@ export class Session {
741741
script += ` EXIT_CODE=1\n`;
742742
script += ` fi\n`;
743743
} else {
744-
script += ` # Execute command with prefixed streaming via process substitution\n`;
745-
script += ` { ${command}; } < /dev/null > >(while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x01\\x01\\x01%s\\n' "$line"; done >> "$log") 2> >(while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x02\\x02\\x02%s\\n' "$line"; done >> "$log")\n`;
744+
script += ` # Execute command, redirect to temp files\n`;
745+
script += ` { ${command}; } < /dev/null > "$log.stdout" 2> "$log.stderr"\n`;
746746
script += ` EXIT_CODE=$?\n`;
747747
}
748748

749-
// Ensure process-substitution consumers complete before writing exit code
750-
script += ` wait 2>/dev/null\n`;
749+
script += ` \n`;
750+
script += ` # Prefix and merge stdout/stderr into main log\n`;
751+
script += ` (while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x01\\x01\\x01%s\\n' "$line"; done < "$log.stdout" >> "$log") 2>/dev/null\n`;
752+
script += ` (while IFS= read -r line || [[ -n "$line" ]]; do printf '\\x02\\x02\\x02%s\\n' "$line"; done < "$log.stderr" >> "$log") 2>/dev/null\n`;
753+
script += ` rm -f "$log.stdout" "$log.stderr"\n`;
751754
script += ` \n`;
752755
script += ` # Write exit code\n`;
753756
script += ` echo "$EXIT_CODE" > ${safeExitCodeFile}.tmp\n`;

tests/e2e/process-lifecycle-workflow.test.ts

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -186,6 +186,61 @@ describe('Process Lifecycle Workflow', () => {
186186
});
187187
}, 90000);
188188

189+
test('should not block foreground operations when background processes are running', async () => {
190+
const sandboxId = createSandboxId();
191+
const headers = createTestHeaders(sandboxId);
192+
193+
// Start a long-running background process
194+
const startResponse = await vi.waitFor(
195+
async () =>
196+
fetchWithStartup(`${workerUrl}/api/process/start`, {
197+
method: 'POST',
198+
headers,
199+
body: JSON.stringify({
200+
command: 'sleep 60'
201+
})
202+
}),
203+
{ timeout: 90000, interval: 2000 }
204+
);
205+
206+
const startData = await startResponse.json();
207+
const processId = startData.id;
208+
209+
// Immediately run a foreground command - should complete quickly
210+
const execStart = Date.now();
211+
const execResponse = await fetch(`${workerUrl}/api/execute`, {
212+
method: 'POST',
213+
headers,
214+
body: JSON.stringify({
215+
command: 'echo "test"'
216+
})
217+
});
218+
const execDuration = Date.now() - execStart;
219+
220+
expect(execResponse.status).toBe(200);
221+
expect(execDuration).toBeLessThan(2000); // Should complete in <2s, not wait for sleep
222+
223+
// Test listFiles as well - it uses the same foreground execution path
224+
const listStart = Date.now();
225+
const listResponse = await fetch(`${workerUrl}/api/list-files`, {
226+
method: 'POST',
227+
headers,
228+
body: JSON.stringify({
229+
path: '/workspace'
230+
})
231+
});
232+
const listDuration = Date.now() - listStart;
233+
234+
expect(listResponse.status).toBe(200);
235+
expect(listDuration).toBeLessThan(2000); // Should complete quickly
236+
237+
// Cleanup
238+
await fetch(`${workerUrl}/api/process/${processId}`, {
239+
method: 'DELETE',
240+
headers
241+
});
242+
}, 90000);
243+
189244
test('should get process logs after execution', async () => {
190245
const sandboxId = createSandboxId();
191246
const headers = createTestHeaders(sandboxId);

0 commit comments

Comments
 (0)