fix: insight generation auth failure and test isolation by wesm · Pull Request #21 · wesm/agentsview

wesm · 2026-02-24T04:57:54Z

Summary

Strip ANTHROPIC_API_KEY and CLAUDECODE from the environment when invoking agent CLIs (claude, codex, gemini) for insight generation, forcing subscription-based auth. Also sets CLAUDE_NO_SOUND=1.
Fix test isolation: TestGenerateInsight_DefaultAgent was spawning a real claude process, which created session files that the file watcher synced into the production database. Introduce insight.GenerateFunc and Server.generateFunc (injectable via WithGenerateFunc option) so tests use a stub.
Show the agent name on in-progress insight tasks in the sidebar, matching completed entries.

Test plan

CGO_ENABLED=1 go test -tags fts5 ./internal/insight/ passes
CGO_ENABLED=1 go test -tags fts5 ./internal/server/ passes
go vet ./... clean
Manual: run insight generation with ANTHROPIC_API_KEY set in shell, verify Claude uses subscription auth
Manual: verify in-progress insight tasks show agent name in sidebar

🤖 Generated with Claude Code

Insight generation fails silently when ANTHROPIC_API_KEY is set because agent CLIs use the env var key instead of subscription auth. Add cleanEnv() that strips ANTHROPIC_API_KEY and CLAUDECODE from the environment and sets CLAUDE_NO_SOUND=1, applied to all three generator functions (claude, codex, gemini). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The in-progress task items in the insights list were missing the agent badge that completed entries display. Add a task-agent span between the task body and dismiss button, styled consistently with row-agent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

TestGenerateInsight_DefaultAgent was calling insight.Generate directly, which invokes `claude -p` if the CLI is on PATH. The spawned claude process creates a session file that the file watcher syncs into the production database. Add insight.GenerateFunc type and Server.generateFunc field (defaulting to insight.Generate) so tests can inject a stub. The test now uses a stub that returns an error, matching the existing assertion without spawning any subprocess. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- TestCleanEnv: use exact key/value map lookups instead of prefix matching to avoid false positives (e.g. CLAUDE_NO_SOUND=10) - WithGenerateFunc: guard against nil to prevent runtime panics - TestGenerateInsight_DefaultAgent: assert agent == "claude" to verify default-agent assignment is not silently regressed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T05:01:50Z

roborev: Combined Review (`fd3d133`)

Verdict: The changes introduce useful functionality but contain medium-severity security flaws related to environment variable leakage to child processes.

Medium

1. Credential Exposure via Environment Variable Denylist

File: internal/insight/generate.go:64-78

Description: cleanEnv() uses a denylist approach, copying almost the entire host process environment (os.Environ()) and explicitly filtering out only specific keys like ANTHROPIC_API_KEY and CLAUDECODE. This means all other environment variables—which may include sensitive data like cloud credentials
, database passwords, or other API keys—are blindly passed down to the AI agent child processes (claude, codex, gemini). Since these agents process potentially untrusted user input, an attacker could craft a malicious prompt instructing the AI to read and output its environment variables, leading to secret exfiltration.

Suggested Remediation: Adopt a strict allowlist approach instead of a denylist. Only pass the strictly necessary environment variables to the child processes (e.g., PATH, HOME, USER, LANG) along with any specific flags like CLAUDE_NO_SOUND=1.

2.
Case-Sensitive Environment Filtering Bypass on Windows

File: internal/insight/generate.go:70
Description: The secret stripping logic uses case-sensitive exact matches. On platforms with case-insensitive environment variables (notably Windows), a differently-cased key (e.g
., anthropic_api_key) can bypass the filtering and still be visible to the subprocesses.
Suggested Remediation: Normalize environment keys before comparison (e.g., strings.ToUpper(k)) or use case-insensitive matching (strings.EqualFold) when filtering sensitive variables.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

handleContextError was writing a 504 response on DeadlineExceeded, racing with http.TimeoutHandler's buffered 503 response. On Windows CI, Go's select randomly picked the handler's 504 over the middleware's 503, causing TestMiddleware_Timeout/GetMinimap to fail. Change handleContextError to return true without writing — same as it already does for Canceled — letting the withTimeout middleware handle the response consistently. Also consolidate search.go's inline context error handling to use handleContextError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The denylist approach leaked the entire host environment (cloud credentials, database passwords, other API keys) to agent CLI subprocesses. Switch to an allowlist of system vars the CLIs need (PATH, HOME, LANG, proxy settings, etc.) so only safe variables are passed through. This also eliminates the case-sensitivity bypass on Windows since the allowlist matches by known prefixes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T05:10:49Z

roborev: Combined Review (`17ae11b`)

Summary Verdict: The PR introduces a medium-severity security
risk due to incomplete environment variable scrubbing for spawned external CLIs.

Medium

Credential Exposure via Incomplete Environment Blocklist

File: internal/insight/generate.go, lines 65-78 (cleanEnv function)
Description: The cleanEnv function uses
a blocklist approach to remove ANTHROPIC_API_KEY and CLAUDECODE from the environment before spawning child processes (claude, codex, gemini). However, it inherits and passes all other environment variables from the parent process (os.Environ()) to these external CLIs
.
This has two negative impacts:
1. Credential Leakage: If the server environment contains sensitive variables (such as database credentials, internal server tokens, or API keys for other services), they are needlessly passed to third-party binaries. These external CLIs could potentially log them, include them in crash
  /telemetry reports, or leak them if the CLI itself is exploited.
2. Incomplete Fix: The goal is to ensure "CLIs use subscription auth, not env keys." By only blocklisting Anthropic's keys, the server will still leak OPENAI_API_KEY to the
  codex CLI and GEMINI_API_KEY to the gemini CLI if they are set on the host, defeating the purpose of the fix for those specific agents.
Suggested Remediation: The most secure practice for subprocesses is to use an allowlist instead of a blocklist.
Only pass the environment variables that are strictly necessary for the CLIs to execute (e.g., PATH, HOME, USER, and the newly added CLAUDE_NO_SOUND=1). If an allowlist approach breaks CLI functionality, significantly expand the blocklist to cover all relevant sensitive keys (e
.g., OPENAI_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, OPENAI_ORG_ID, etc.).

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

Windows emits env keys with mixed casing (Path, ComSpec) that didn't match the uppercase allowlist prefixes. Parse the key separately and compare case-insensitively via strings.ToUpper. Prefix entries ending with _ (LC_, XDG_, SSL_CERT_) match any key starting with that prefix; all others require exact match. Add TestEnvKeyAllowed with table-driven cases covering Windows-style casing and unknown var exclusion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

review_guidelines was wrapped in a [review_guidelines] section header, making it a TOML table instead of a top-level string. The threat_model key inside it was also wrong — the field is review_guidelines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T05:19:57Z

roborev: Combined Review (`6e43d2c`)

Review complete: 1 Medium severity issue identified regarding environment variable filtering; no security vulnerabilities detected.

Medium

Regression risk (agent auth via env vars may break)
*
Files: internal/insight/generate.go:69, [internal/insight/generate.go:101](/home/roborev/.rob
orev/clones/wesm/agentsview/internal/insight/generate.go:101)*

cleanEnv() now drops all non-allowlisted vars, including common credential vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY , etc.). That improves secret isolation, but it can break users who authenticate CLIs via env vars.
Suggested fix: Add per-agent credential passthrough (only for the selected agent), or make credential passthrough explicitly configurable.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

The previous approach of suppressing writes in handleContextError could produce implicit 200s if a handler were ever called without withTimeout. Restore the 504 write for DeadlineExceeded so handlers always produce a concrete error response. The middleware test race (503 vs 504) was caused by using an already-expired context, making Go's select pick randomly between the handler and middleware channels. Fix by accepting either 503 or 504 in middleware tests, since both are valid timeout responses. The internal test (no middleware) deterministically expects 504. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T05:27:35Z

roborev: Combined Review (`9ef6293`)

Verdict: The changeset
is secure and generally clean, with one medium-severity regression risk identified.

Medium

internal/insight/generate.go#L69
- Problem: The env allowlist omits common Windows profile/config vars like APPDATA and LOCALAPPDATA. Some CLIs resolve auth/config/cache paths via those vars, so insight generation can fail on Windows even though casing is handled.
- Suggested fix: Add
  APPDATA and LOCALAPPDATA (optionally PROGRAMDATA) to allowedKeyPrefixes, and add tests for those keys in TestEnvKeyAllowed.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

Windows CLIs resolve auth/config/cache paths via these vars. Without them, insight generation can fail on Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add assertTimeoutRace helper that checks status code, Content-Type, and error body based on which path won: 503 expects "request timed out", 504 expects "gateway timeout" with application/json. Replaces bare status-code-only checks in TestMiddleware_Timeout and TestSearch_DeadlineExceeded. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T12:40:37Z

roborev: Combined Review (`c7aca52`)

Synthesis unavailable. Showing raw review outputs.

Agent: codex | Type: security | Status: done

Reviewing the 11-commit combined diff with a security-first pass. I’ll map each potential issue to exact file/line references in the current tree and then report only security-relevant findings by severity.
I found the working tree is at a different code snapshot than the provided diff, so I’m validating directly against the commit range via git history to preserve accurate line references for this review.
Summary:

Reviewed the combined diff across all 11 commits for injection, authz/authn, credential exposure, path traversal, unsafe execution/deserialization patterns, dependency/CI risk, data leakage, concurrency-related security impact, and error-handling leakage.
No new exploitable security vulnerabilities were identified in the changes shown.

No issues found.

Agent: gemini | Type: security | Status: failed

Error: Review failed. Check CI logs for details.

Agent: codex | Type: default | Status: done

Reviewing this commit range as a code review pass focused on correctness, security, regressions, and test coverage. I’ll inspect the modified server and insight code paths first, then validate whether the new tests actually protect the intended behavior.
The repo layout differs from the paths in the diff, so I’m locating the actual file locations before continuing the review to avoid missing context.
I’ve narrowed in on the timeout/error-handling and injectable insight-generation changes now. Next I’m reading the timeout middleware and related handlers to confirm whether the new “503 or 504” test contract can mask real regressions.
I found a likely testing-coverage concern around the new timeout race assertion; I’m checking whether other middleware tests still enforce the 503 JSON contract so I can confirm if this is a real gap or already covered elsewhere.
I confirmed the referenced commits exist in history, so I’m now reading the exact range directly from git to verify there aren’t additional changes outside the pasted diff before finalizing findings.
I’ve gathered exact line references from the final commit state. I’m doing one last pass for concrete, non-speculative issues before I return the findings list.
Summary: This range hardens insight subprocess execution by switching to an env allowlist, adds test-time injection for insight generation to avoid invoking real CLIs, updates timeout-path behavior/tests for 503 vs 504 races, fixes .roborev.toml schema shape, and adds agent labels in the insights UI.

Issues found:

Severity: medium
File: internal/insight/generate.go:69
The new env allowlist is likely incomplete for Windows subprocess behavior (PATHEXT and WINDIR are not passed). Some CLIs or child process launches on Windows rely on these, so this can cause agent CLI failures/regressions in Windows environments.
Suggested fix: include essential Windows runtime vars (PATHEXT, WINDIR, and potentially HOMEDRIVE/HOMEPATH) in the allowlist, then add a Windows-specific test covering CLI invocation under sanitized env.
Severity: low
File: internal/server/server_test.go:365
assertTimeoutRace validates JSON Content-Type only for 504, not for 503. That weakens contract coverage for timeout middleware responses and can miss regressions where 503 returns non-JSON.
Suggested fix: in the 503 branch, also assert Content-Type == application/json and validate/parsing of {"error":"request timed out"}.
Severity: low
File: internal/insight/generate_test.go:172
TestCleanEnv asserts exact-case keys (PATH, HOME, etc.) from os.Environ(). On Windows, env keys are case-insensitive and often returned with different casing (Path), so this test can be brittle or misleading cross-platform.
Suggested fix: normalize env map keys to upper-case before assertions, and/or assert with case-insensitive lookup.

Agent: gemini | Type: default | Status: failed

Error: Review failed. Check CI logs for details.

…est keys - Add PATHEXT, WINDIR, HOMEDRIVE, HOMEPATH to env allowlist so agent CLIs can resolve executables and config paths on Windows - assertTimeoutRace now checks Content-Type for both 503 and 504 - TestCleanEnv normalizes env map keys to uppercase for cross-platform correctness (Windows returns Path not PATH) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T12:50:50Z

roborev: Combined Review (`67e31a4`)

Synthesis unavailable. Showing raw review outputs.

Agent: codex | Type: security | Status: done

Summary: I reviewed the full diff across the 12 commits with focus on injection, auth/access control, credential handling, path traversal, unsafe patterns, CI/CD risk, data exposure, concurrency, and error handling. The key security-relevant change (internal/insight/generate.go) moves from env denylisting to allowlisting and reduces secret leakage to agent subprocesses; I did not identify a new exploitable vulnerability in the introduced code.

No issues found.

Agent: gemini | Type: security | Status: failed

Error: Review failed. Check CI logs for details.

Agent: codex | Type: default | Status: done

Reviewing the provided commit-range diff directly for correctness, security, regressions, and test coverage. I’ll inspect the touched server/insight paths for behavior changes and then report findings with severity and file references.
I hit path mismatches in this checkout, so I’m locating the corresponding files before continuing the diff review to ensure line references are accurate.
I’m now checking for subtle regressions from the env allowlist and timeout-race handling, since those are the highest-risk behavior changes in this diff.
Brief summary: these commits harden agent subprocess environment handling (denylist -> allowlist, Windows casing/runtime vars), make insight generation injectable for tests, adjust timeout/deadline handling to tolerate 503/504 race behavior, surface agent name in the insights UI, and fix .roborev.toml schema shape.

Medium - Potential auth regression from strict env allowlist
File: internal/insight/generate.go:61, internal/insight/generate.go:109, internal/insight/generate.go:123
cleanEnv() now strips all non-allowlisted vars before launching claude/codex/gemini. That also strips common env-based auth vars (for example OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY), which can break users who authenticate CLIs via environment instead of local credential stores.
Suggested fix: allow explicit pass-through for agent-auth env vars (or an opt-in config), and document the behavior.
Low - API error-message contract changed for search timeouts
File: internal/server/search.go:64, internal/server/response.go:30
handleSearch moved from endpoint-specific "search timed out" to shared "gateway timeout" via handleContextError. Status is still correct, but clients/tests keyed to message text may regress.
Suggested fix: keep endpoint-specific timeout text in handleSearch, or codify that clients must rely on HTTP status (not error text).
Low (testing gap) - Missing behavioral test around real generator env requirements
File: internal/insight/generate_test.go:158
Tests validate key filtering logic, but there’s no integration-style check that each generator path (generateClaude, generateCodex, generateGemini) still has all required runtime env on each OS after filtering.
Suggested fix: add generator-level tests (with exec stubs) asserting expected env composition per command, plus at least one Windows CI validation path.

Agent: gemini | Type: default | Status: failed

Error: Review failed. Check CI logs for details.

Remove cmd.Env = cleanEnv() from all three generator functions. The cleanEnv helper and its tests are retained for potential future use but no longer applied to agent CLI subprocesses. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T13:02:48Z

roborev: Combined Review (`363de8d`)

Summary Verdict: The codebase has one medium severity issue regarding unused environment variable filtering that exposes subprocesses to sensitive
credentials.

Medium

Env allowlist is implemented but not enforced in subprocess execution

Location: internal/insight/generate.go (lines 64, 103, 122, 164, 293)
Description: The
cleanEnv function and its associated allowlist (allowedKeyPrefixes, envKeyAllowed) are defined to prevent sensitive credentials (e.g., ANTHROPIC_API_KEY) from being leaked to agent CLI subprocesses. However, these functions are never used to sanitize the environment. Because the exec. Cmd objects constructed to invoke the agent CLIs do not have their Env field set, they default to inheriting the entire unrestricted environment of the parent process. This defeats the intended security control, exposing all environment variables to third-party tools, and leaves dead code in the file.
Suggested Remediation:
Apply cleanEnv() to the Env field of the exec.Cmd when constructing the commands for the agent CLIs (e.g., add cmd.Env = cleanEnv()). Alternatively, if full-environment inheritance is intentional, remove the unused functions and their associated tests.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

wesm and others added 4 commits February 23, 2026 22:49

wesm and others added 2 commits February 23, 2026 23:06

wesm and others added 2 commits February 23, 2026 23:14

wesm and others added 2 commits February 24, 2026 06:34

fix: add APPDATA, LOCALAPPDATA, PROGRAMDATA to env allowlist

8f5b272

Windows CLIs resolve auth/config/cache paths via these vars. Without them, insight generation can fail on Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

wesm merged commit d17bb69 into main Feb 24, 2026
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: insight generation auth failure and test isolation#21

fix: insight generation auth failure and test isolation#21
wesm merged 13 commits intomainfrom
insights-claude-failure

wesm commented Feb 24, 2026 •

edited

Loading

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wesm commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (fd3d133)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (17ae11b)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (6e43d2c)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (9ef6293)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (c7aca52)

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (67e31a4)

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (363de8d)

Medium

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

wesm commented Feb 24, 2026 •

edited

Loading

roborev: Combined Review (`fd3d133`)

roborev: Combined Review (`17ae11b`)

roborev: Combined Review (`6e43d2c`)

roborev: Combined Review (`9ef6293`)

roborev: Combined Review (`c7aca52`)

roborev: Combined Review (`67e31a4`)

roborev: Combined Review (`363de8d`)