fix(install): pass security scanner package data via stdin pipe#27717
fix(install): pass security scanner package data via stdin pipe#27717
Conversation
The security scanner subprocess received the entire package manifest JSON embedded in the `-e` command-line argument. With ~750+ packages, this exceeded Linux's MAX_ARG_STRLEN limit (128KB per argument), causing `posix_spawn` to fail with E2BIG. The error was silently swallowed, making `bun install` exit with code 1 and no error message, lockfile, or node_modules. Fix: Pass the package JSON via a stdin pipe instead of inlining it in the -e argument. Also add an error message for unhandled scanner errors so failures are no longer silent. Closes #27716 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WalkthroughThese changes address security scanner failures during large-scale package installations by hardening error handling, transitioning from command-line to stdin-based JSON data transfer to bypass argument buffer limits, and implementing runtime package input parsing instead of static placeholders. A regression test validates the fix. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/install/PackageManager/security_scanner.zig`:
- Around line 716-727: The writeAllToPipe function currently swallows errors
from bun.sys.write and returns early on write errors or zero bytes written, so
modify writeAllToPipe to return an error or bool (e.g., !void or bool) instead
of void, propagate any .err from bun.sys.write and treat written == 0 as a
failure, and update the caller spawn to check the returned success/error and
handle/propagate the failure (cleanup, close pipe, and surface an error) so
callers can detect incomplete writes; reference the writeAllToPipe function and
the spawn caller when making these changes.
In `@test/regression/issue/27716.test.ts`:
- Around line 15-53: Replace manual beforeAll/afterAll cleanup with a
using-based resource scope for Bun.serve: instead of storing server in the
top-level server variable and calling server?.stop(true) in afterAll, acquire
the server via an await using (or using) so the Bun.serve return value is
disposed automatically; update the setup so registryUrl still reads server.port
after creation and remove the afterAll teardown. Reference the Bun.serve call
and the server variable to locate where to switch to using/await using.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
src/install/PackageManager/install_with_manager.zigsrc/install/PackageManager/scanner-entry.tssrc/install/PackageManager/security_scanner.zigtest/regression/issue/27716.test.ts
| fn writeAllToPipe(fd: bun.FileDescriptor, data: []const u8) void { | ||
| var remaining = data; | ||
| while (remaining.len > 0) { | ||
| switch (bun.sys.write(fd, remaining)) { | ||
| .result => |written| { | ||
| if (written == 0) return; | ||
| remaining = remaining[written..]; | ||
| }, | ||
| .err => return, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
writeAllToPipe silently ignores write errors.
The function returns without indication when bun.sys.write fails. This could mask issues during large data transfer (e.g., broken pipe, disk quota). The caller (spawn) has no way to detect that the child received incomplete data.
Consider returning a boolean or error indicator so the caller can handle failures appropriately.
🔧 Proposed fix to return success/failure
-fn writeAllToPipe(fd: bun.FileDescriptor, data: []const u8) void {
+fn writeAllToPipe(fd: bun.FileDescriptor, data: []const u8) bool {
var remaining = data;
while (remaining.len > 0) {
switch (bun.sys.write(fd, remaining)) {
.result => |written| {
- if (written == 0) return;
+ if (written == 0) return false;
remaining = remaining[written..];
},
- .err => return,
+ .err => return false,
}
}
+ return true;
}Then in spawn:
- writeAllToPipe(stdin_pipe_fds[1], this.json_data);
+ if (!writeAllToPipe(stdin_pipe_fds[1], this.json_data)) {
+ stdin_pipe_fds[1].close();
+ return error.StdinWriteFailed;
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/install/PackageManager/security_scanner.zig` around lines 716 - 727, The
writeAllToPipe function currently swallows errors from bun.sys.write and returns
early on write errors or zero bytes written, so modify writeAllToPipe to return
an error or bool (e.g., !void or bool) instead of void, propagate any .err from
bun.sys.write and treat written == 0 as a failure, and update the caller spawn
to check the returned success/error and handle/propagate the failure (cleanup,
close pipe, and surface an error) so callers can detect incomplete writes;
reference the writeAllToPipe function and the spawn caller when making these
changes.
| let server: ReturnType<typeof Bun.serve>; | ||
| let registryUrl: string; | ||
|
|
||
| beforeAll(() => { | ||
| server = Bun.serve({ | ||
| port: 0, | ||
| async fetch(req) { | ||
| const url = new URL(req.url); | ||
| const path = url.pathname; | ||
|
|
||
| if (path.endsWith(".tgz")) { | ||
| return new Response(tgzData); | ||
| } | ||
|
|
||
| // Package metadata request | ||
| const name = decodeURIComponent(path.slice(1)); | ||
| return new Response( | ||
| JSON.stringify({ | ||
| name, | ||
| versions: { | ||
| "1.0.0": { | ||
| name, | ||
| version: "1.0.0", | ||
| dist: { | ||
| tarball: `${registryUrl}${name}-1.0.0.tgz`, | ||
| }, | ||
| }, | ||
| }, | ||
| "dist-tags": { latest: "1.0.0" }, | ||
| }), | ||
| ); | ||
| }, | ||
| }); | ||
| registryUrl = `http://localhost:${server.port}/`; | ||
| }); | ||
|
|
||
| afterAll(() => { | ||
| server?.stop(true); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider using using for server cleanup to ensure proper resource disposal.
The coding guidelines recommend using using or await using for APIs like Bun.serve to ensure proper resource cleanup. The current afterAll approach may miss cleanup if a test fails before reaching afterAll.
♻️ Proposed refactor using `using` pattern
-let server: ReturnType<typeof Bun.serve>;
-let registryUrl: string;
-
-beforeAll(() => {
- server = Bun.serve({
+test("security scanner works with many packages", async () => {
+ using server = Bun.serve({
port: 0,
async fetch(req) {
// ... existing fetch logic
},
});
- registryUrl = `http://localhost:${server.port}/`;
-});
-
-afterAll(() => {
- server?.stop(true);
-});
+ const registryUrl = `http://localhost:${server.port}/`;
-test("security scanner works with many packages", async () => {
using dir = tempDir("issue-27716", {
// ... rest of test
});
+ // ... rest of test using registryUrl
+}, 60_000);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/regression/issue/27716.test.ts` around lines 15 - 53, Replace manual
beforeAll/afterAll cleanup with a using-based resource scope for Bun.serve:
instead of storing server in the top-level server variable and calling
server?.stop(true) in afterAll, acquire the server via an await using (or using)
so the Bun.serve return value is disposed automatically; update the setup so
registryUrl still reads server.port after creation and remove the afterAll
teardown. Reference the Bun.serve call and the server variable to locate where
to switch to using/await using.
|
We cannot use Stdin because security scanners read from it to do interactive setups. I have an existing PR for this that I will get an agent to revisit and work on again. The existing PR worked on posix but had problems on windows. |
| const stdin_pipe_result = bun.sys.pipe(); | ||
| const stdin_pipe_fds = switch (stdin_pipe_result) { | ||
| .err => { | ||
| return error.StdinPipeFailed; | ||
| }, | ||
| .result => |fds| fds, |
There was a problem hiding this comment.
🟡 Two minor error-handling issues in the new stdin pipe code: (1) If pipe() for stdin fails at line 760, the already-created IPC pipe_fds are leaked since there's no errdefer to close them. Similarly, if spawnProcess fails, both pipe pairs (4 FDs) leak. (2) writeAllToPipe silently returns void on write errors, so a broken pipe produces a confusing downstream JSON parse error instead of a clear pipe write failure message. Both are mitigated by the process exiting shortly after, but adding cleanup and at least a debug log would improve robustness.
Extended reasoning...
FD leak on stdin pipe creation failure
In SecurityScanSubprocess.spawn(), the IPC pipe is created at lines 749-755 via bun.sys.pipe(). If the second pipe() call for stdin at line 760 fails, the function returns error.StdinPipeFailed without closing pipe_fds[0] and pipe_fds[1]. There are no errdefer statements anywhere in the function to handle this cleanup. The same pattern applies if spawnProcess fails at line 795 — all four file descriptors (both pipe_fds and stdin_pipe_fds) would leak.
The idiomatic Zig fix would be to add errdefer statements after each successful pipe() call:
const pipe_fds = switch (pipe_result) { ... };
errdefer { pipe_fds[0].close(); pipe_fds[1].close(); }And similarly for stdin_pipe_fds.
Concrete example of the FD leak
Step-by-step: (1) bun.sys.pipe() succeeds for IPC, creating pipe_fds[0] and pipe_fds[1]. (2) bun.sys.pipe() for stdin fails (e.g., system is near the FD limit). (3) The function returns error.StdinPipeFailed. (4) pipe_fds[0] and pipe_fds[1] are never closed — they are leaked until the process exits. In practice, the caller catches the error and calls Global.exit(1), so the OS reclaims the FDs almost immediately, but it's still not correct resource management.
Silent write errors in writeAllToPipe
The writeAllToPipe function (line 713-727) returns void and silently discards write errors. If the child process dies before reading all stdin data (e.g., crashes during startup), write() will return EPIPE or similar, and the function silently returns. The child then receives truncated JSON, fails to parse it, and reports a generic SCAN_FAILED error via IPC (or exits with a non-zero code). The parent sees "Security scanner sent invalid JSON" or "Security scanner failed" rather than "pipe write to child stdin failed". This is ironic given that the PR description notes the old code "silently swallowed" errors.
Impact assessment
Both issues are low-impact in practice: (1) pipe() failing is extremely rare and only happens at the OS FD limit, and the process exits immediately after. (2) Write failures are still detected indirectly through the child's exit status or IPC error, so the system doesn't silently succeed — it just produces a less helpful error message. The fix for both is straightforward: add errdefer for FD cleanup, and have writeAllToPipe return an error (or at least emit a debug log) so the root cause is visible.
Pre-existing pattern note
The IPC pipe already had a similar leak pattern before this PR (if spawnProcess failed, pipe_fds would leak). However, this PR introduces a new leak path: stdin pipe failure after IPC pipe success. The writeAllToPipe issue is entirely new to this PR.
| expect(existsSync(join(String(dir), "bun.lock"))).toBe(true); | ||
| expect(existsSync(join(String(dir), "node_modules"))).toBe(true); | ||
| expect(exitCode).toBe(0); | ||
| }, 60_000); |
There was a problem hiding this comment.
🟡 The test sets an explicit 60,000ms timeout at line 95 (}, 60_000);), which violates the test/CLAUDE.md rule: "CRITICAL: Do not set a timeout on tests. Bun already has timeouts." The timeout should be removed.
Extended reasoning...
What the bug is
The newly introduced regression test at test/regression/issue/27716.test.ts line 95 passes a second argument of 60_000 to the test() call, setting an explicit 60-second timeout on the test case.
The rule it violates
The test/CLAUDE.md file explicitly states under its testing guidelines: "CRITICAL: Do not set a timeout on tests. Bun already has timeouts." This is a clear, unambiguous convention for the repository. The project relies on Bun's built-in test timeout mechanism rather than per-test explicit timeouts.
Step-by-step proof
Looking at the test file, line 95 reads:
}, 60_000);
This is the closing of the test("security scanner works with many packages", async () => { ... }, 60_000) call. The 60_000 is the explicit timeout parameter passed to bun:test's test() function, overriding whatever default timeout Bun provides.
Impact
This is a style/convention violation rather than a functional bug. The test will still work correctly with or without the explicit timeout. However, it sets a precedent that could lead to inconsistent timeout practices across the test suite, which is exactly what the CLAUDE.md rule is designed to prevent.
How to fix
Simply remove the , 60_000 argument from line 95, changing:
}, 60_000);to:
});Bun's built-in test timeout will handle the case where the test runs too long.
|
|
||
| const scannerModuleName = "__SCANNER_MODULE__"; | ||
| const packages = __PACKAGES_JSON__; | ||
| const packages = JSON.parse(fs.readFileSync(0, "utf-8")); |
There was a problem hiding this comment.
🟡 Nit: scanner-entry-globals.d.ts still declares __PACKAGES_JSON__ on line 1, but this PR removed all usage of that global from scanner-entry.ts (replaced with fs.readFileSync(0)) and removed the placeholder substitution from security_scanner.zig. The stale type declaration should be removed to keep the .d.ts file consistent with the actual code.
Extended reasoning...
What the bug is
The file src/install/PackageManager/scanner-entry-globals.d.ts line 1 declares:
declare const __PACKAGES_JSON__: Bun.Security.Package[];This global type declaration was used to provide type information for the __PACKAGES_JSON__ placeholder that was previously embedded in scanner-entry.ts. The PR replaced this mechanism with reading package data from stdin via fs.readFileSync(0, "utf-8"), making the declaration stale.
The specific code path
Previously, security_scanner.zig had code that found the __PACKAGES_JSON__ placeholder in the embedded scanner-entry.ts source and replaced it with the actual JSON data at runtime. The .d.ts file existed so TypeScript tooling could type-check scanner-entry.ts during development, providing the type for the injected global.
This PR removed: (1) the __PACKAGES_JSON__ usage from scanner-entry.ts line 4 (now JSON.parse(fs.readFileSync(0, "utf-8"))), and (2) the placeholder replacement block from security_scanner.zig. However, scanner-entry-globals.d.ts was not updated.
Why existing code does not prevent it
The .d.ts file is only used for development-time type checking and is not included in the build (the .ts source is embedded via @embedFile). So there is no build error or runtime failure from the stale declaration — it simply goes unnoticed.
Impact
This is a minor consistency issue. Future developers reading scanner-entry-globals.d.ts would see __PACKAGES_JSON__ declared as a global and might be confused about where it comes from, since it is no longer used anywhere in the codebase. It could also lead to someone mistakenly thinking the global injection mechanism still exists.
How to fix
Remove line 1 from scanner-entry-globals.d.ts:
-declare const __PACKAGES_JSON__: Bun.Security.Package[];
declare const __SUPPRESS_ERROR__: boolean;Step-by-step proof
scanner-entry.tsline 4 now reads:const packages = JSON.parse(fs.readFileSync(0, "utf-8"));— no reference to__PACKAGES_JSON__.- The
security_scanner.zigdiff shows the__PACKAGES_JSON__placeholder replacement block (lines 676-685 in the old file) was deleted entirely. scanner-entry-globals.d.tsline 1 still declaresdeclare const __PACKAGES_JSON__: Bun.Security.Package[];.- Searching the codebase confirms no remaining references to
__PACKAGES_JSON__other than this stale declaration.
Summary
bun installsilently failing (exit 1, no error, no lockfile) when a security scanner is configured and the project has ~750+ packages-ecommand-line argument. With many packages, this exceeded Linux'sMAX_ARG_STRLENlimit (128KB per argument), causingposix_spawnto fail withE2BIGelse => {}branch, so users saw no error messageChanges
security_scanner.zig: Pass package JSON via a stdin pipe instead of inlining it in the-eargument. The child process reads from fd 0 (stdin) instead.scanner-entry.ts: Read packages from stdin (fs.readFileSync(0, "utf-8")) instead of an inline__PACKAGES_JSON__placeholder.install_with_manager.zig: Add error message for unhandled scanner errors so failures are no longer silent.Test plan
test/regression/issue/27716.test.ts— installs 850 packages with a security scanner configuredbun-install-security-provider.test.ts)bun-security-scanner-workspaces.test.ts)bun-update-security-provider.test.ts,bun-update-security-simple.test.ts)Closes #27716
🤖 Generated with Claude Code