Skip to content

fix: insight generation auth failure and test isolation#21

Merged
wesm merged 13 commits intomainfrom
insights-claude-failure
Feb 24, 2026
Merged

fix: insight generation auth failure and test isolation#21
wesm merged 13 commits intomainfrom
insights-claude-failure

Conversation

@wesm
Copy link
Copy Markdown
Owner

@wesm wesm commented Feb 24, 2026

Summary

  • Strip ANTHROPIC_API_KEY and CLAUDECODE from the environment when invoking agent CLIs (claude, codex, gemini) for insight generation, forcing subscription-based auth. Also sets CLAUDE_NO_SOUND=1.
  • Fix test isolation: TestGenerateInsight_DefaultAgent was spawning a real claude process, which created session files that the file watcher synced into the production database. Introduce insight.GenerateFunc and Server.generateFunc (injectable via WithGenerateFunc option) so tests use a stub.
  • Show the agent name on in-progress insight tasks in the sidebar, matching completed entries.

Test plan

  • CGO_ENABLED=1 go test -tags fts5 ./internal/insight/ passes
  • CGO_ENABLED=1 go test -tags fts5 ./internal/server/ passes
  • go vet ./... clean
  • Manual: run insight generation with ANTHROPIC_API_KEY set in shell, verify Claude uses subscription auth
  • Manual: verify in-progress insight tasks show agent name in sidebar

🤖 Generated with Claude Code

wesm and others added 4 commits February 23, 2026 22:49
Insight generation fails silently when ANTHROPIC_API_KEY is set because
agent CLIs use the env var key instead of subscription auth. Add
cleanEnv() that strips ANTHROPIC_API_KEY and CLAUDECODE from the
environment and sets CLAUDE_NO_SOUND=1, applied to all three generator
functions (claude, codex, gemini).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The in-progress task items in the insights list were missing the agent
badge that completed entries display. Add a task-agent span between the
task body and dismiss button, styled consistently with row-agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TestGenerateInsight_DefaultAgent was calling insight.Generate directly,
which invokes `claude -p` if the CLI is on PATH. The spawned claude
process creates a session file that the file watcher syncs into the
production database.

Add insight.GenerateFunc type and Server.generateFunc field (defaulting
to insight.Generate) so tests can inject a stub. The test now uses a
stub that returns an error, matching the existing assertion without
spawning any subprocess.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- TestCleanEnv: use exact key/value map lookups instead of prefix
  matching to avoid false positives (e.g. CLAUDE_NO_SOUND=10)
- WithGenerateFunc: guard against nil to prevent runtime panics
- TestGenerateInsight_DefaultAgent: assert agent == "claude" to
  verify default-agent assignment is not silently regressed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (fd3d133)

Verdict: The changes introduce useful functionality but contain medium-severity security flaws related to environment variable leakage to child processes.

Medium

1. Credential Exposure via Environment Variable Denylist

  • File: internal/insight/generate.go:64-78

Description: cleanEnv() uses a denylist approach, copying almost the entire host process environment (os.Environ()) and explicitly filtering out only specific keys like ANTHROPIC_API_KEY and CLAUDECODE. This means all other environment variables—which may include sensitive data like cloud credentials
, database passwords, or other API keys—are blindly passed down to the AI agent child processes (claude, codex, gemini). Since these agents process potentially untrusted user input, an attacker could craft a malicious prompt instructing the AI to read and output its environment variables, leading to secret exfiltration.

  • Suggested Remediation: Adopt a strict allowlist approach instead of a denylist. Only pass the strictly necessary environment variables to the child processes (e.g., PATH, HOME, USER, LANG) along with any specific flags like CLAUDE_NO_SOUND=1.

2.
Case-Sensitive Environment Filtering Bypass on Windows

  • File: internal/insight/generate.go:70
  • Description: The secret stripping logic uses case-sensitive exact matches. On platforms with case-insensitive environment variables (notably Windows), a differently-cased key (e.g
    ., anthropic_api_key) can bypass the filtering and still be visible to the subprocesses.
  • Suggested Remediation: Normalize environment keys before comparison (e.g., strings.ToUpper(k)) or use case-insensitive matching (strings.EqualFold) when filtering sensitive variables.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

wesm and others added 2 commits February 23, 2026 23:06
handleContextError was writing a 504 response on DeadlineExceeded,
racing with http.TimeoutHandler's buffered 503 response. On Windows
CI, Go's select randomly picked the handler's 504 over the
middleware's 503, causing TestMiddleware_Timeout/GetMinimap to fail.

Change handleContextError to return true without writing — same as it
already does for Canceled — letting the withTimeout middleware handle
the response consistently. Also consolidate search.go's inline context
error handling to use handleContextError.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The denylist approach leaked the entire host environment (cloud
credentials, database passwords, other API keys) to agent CLI
subprocesses. Switch to an allowlist of system vars the CLIs need
(PATH, HOME, LANG, proxy settings, etc.) so only safe variables are
passed through. This also eliminates the case-sensitivity bypass on
Windows since the allowlist matches by known prefixes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (17ae11b)

Summary Verdict: The PR introduces a medium-severity security
risk due to incomplete environment variable scrubbing for spawned external CLIs.

Medium

Credential Exposure via Incomplete Environment Blocklist

  • File: internal/insight/generate.go, lines 65-78 (cleanEnv function)
  • Description: The cleanEnv function uses
    a blocklist approach to remove ANTHROPIC_API_KEY and CLAUDECODE from the environment before spawning child processes (claude, codex, gemini). However, it inherits and passes all other environment variables from the parent process (os.Environ()) to these external CLIs
    .
    This has two negative impacts:
    1. Credential Leakage: If the server environment contains sensitive variables (such as database credentials, internal server tokens, or API keys for other services), they are needlessly passed to third-party binaries. These external CLIs could potentially log them, include them in crash
      /telemetry reports, or leak them if the CLI itself is exploited.
    2. Incomplete Fix: The goal is to ensure "CLIs use subscription auth, not env keys." By only blocklisting Anthropic's keys, the server will still leak OPENAI_API_KEY to the
      codex CLI and GEMINI_API_KEY to the gemini CLI if they are set on the host, defeating the purpose of the fix for those specific agents.
  • Suggested Remediation: The most secure practice for subprocesses is to use an allowlist instead of a blocklist.
    Only pass the environment variables that are strictly necessary for the CLIs to execute (e.g., PATH, HOME, USER, and the newly added CLAUDE_NO_SOUND=1). If an allowlist approach breaks CLI functionality, significantly expand the blocklist to cover all relevant sensitive keys (e
    .g., OPENAI_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, OPENAI_ORG_ID, etc.).

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

wesm and others added 2 commits February 23, 2026 23:14
Windows emits env keys with mixed casing (Path, ComSpec) that
didn't match the uppercase allowlist prefixes. Parse the key
separately and compare case-insensitively via strings.ToUpper.
Prefix entries ending with _ (LC_, XDG_, SSL_CERT_) match any key
starting with that prefix; all others require exact match.

Add TestEnvKeyAllowed with table-driven cases covering Windows-style
casing and unknown var exclusion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
review_guidelines was wrapped in a [review_guidelines] section header,
making it a TOML table instead of a top-level string. The threat_model
key inside it was also wrong — the field is review_guidelines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (6e43d2c)

Review complete: 1 Medium severity issue identified regarding environment variable filtering; no security vulnerabilities detected.

Medium

Regression risk (agent auth via env vars may break)
*
Files: internal/insight/generate.go:69, [internal/insight/generate.go:101](/home/roborev/.rob
orev/clones/wesm/agentsview/internal/insight/generate.go:101)*

cleanEnv() now drops all non-allowlisted vars, including common credential vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY , etc.). That improves secret isolation, but it can break users who authenticate CLIs via env vars.
Suggested fix: Add per-agent credential passthrough (only for the selected agent), or make credential passthrough explicitly configurable.


Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

The previous approach of suppressing writes in handleContextError
could produce implicit 200s if a handler were ever called without
withTimeout. Restore the 504 write for DeadlineExceeded so handlers
always produce a concrete error response.

The middleware test race (503 vs 504) was caused by using an
already-expired context, making Go's select pick randomly between
the handler and middleware channels. Fix by accepting either 503 or
504 in middleware tests, since both are valid timeout responses.
The internal test (no middleware) deterministically expects 504.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (9ef6293)

Verdict: The changeset
is secure and generally clean, with one medium-severity regression risk identified.

Medium

  • internal/insight/generate.go#L69

    • Problem: The env allowlist omits common Windows profile/config vars like APPDATA and LOCALAPPDATA. Some CLIs resolve auth/config/cache paths via those vars, so insight generation can fail on Windows even though casing is handled.
    • Suggested fix: Add
      APPDATA and LOCALAPPDATA (optionally PROGRAMDATA) to allowedKeyPrefixes, and add tests for those keys in TestEnvKeyAllowed.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

wesm and others added 2 commits February 24, 2026 06:34
Windows CLIs resolve auth/config/cache paths via these vars. Without
them, insight generation can fail on Windows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add assertTimeoutRace helper that checks status code, Content-Type,
and error body based on which path won: 503 expects "request timed
out", 504 expects "gateway timeout" with application/json. Replaces
bare status-code-only checks in TestMiddleware_Timeout and
TestSearch_DeadlineExceeded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (c7aca52)

Synthesis unavailable. Showing raw review outputs.

Agent: codex | Type: security | Status: done

Reviewing the 11-commit combined diff with a security-first pass. I’ll map each potential issue to exact file/line references in the current tree and then report only security-relevant findings by severity.
I found the working tree is at a different code snapshot than the provided diff, so I’m validating directly against the commit range via git history to preserve accurate line references for this review.
Summary:

  • Reviewed the combined diff across all 11 commits for injection, authz/authn, credential exposure, path traversal, unsafe execution/deserialization patterns, dependency/CI risk, data leakage, concurrency-related security impact, and error-handling leakage.
  • No new exploitable security vulnerabilities were identified in the changes shown.

No issues found.

Agent: gemini | Type: security | Status: failed

Error: Review failed. Check CI logs for details.

Agent: codex | Type: default | Status: done

Reviewing this commit range as a code review pass focused on correctness, security, regressions, and test coverage. I’ll inspect the modified server and insight code paths first, then validate whether the new tests actually protect the intended behavior.
The repo layout differs from the paths in the diff, so I’m locating the actual file locations before continuing the review to avoid missing context.
I’ve narrowed in on the timeout/error-handling and injectable insight-generation changes now. Next I’m reading the timeout middleware and related handlers to confirm whether the new “503 or 504” test contract can mask real regressions.
I found a likely testing-coverage concern around the new timeout race assertion; I’m checking whether other middleware tests still enforce the 503 JSON contract so I can confirm if this is a real gap or already covered elsewhere.
I confirmed the referenced commits exist in history, so I’m now reading the exact range directly from git to verify there aren’t additional changes outside the pasted diff before finalizing findings.
I’ve gathered exact line references from the final commit state. I’m doing one last pass for concrete, non-speculative issues before I return the findings list.
Summary: This range hardens insight subprocess execution by switching to an env allowlist, adds test-time injection for insight generation to avoid invoking real CLIs, updates timeout-path behavior/tests for 503 vs 504 races, fixes .roborev.toml schema shape, and adds agent labels in the insights UI.

Issues found:

  1. Severity: medium
    File: internal/insight/generate.go:69
    The new env allowlist is likely incomplete for Windows subprocess behavior (PATHEXT and WINDIR are not passed). Some CLIs or child process launches on Windows rely on these, so this can cause agent CLI failures/regressions in Windows environments.
    Suggested fix: include essential Windows runtime vars (PATHEXT, WINDIR, and potentially HOMEDRIVE/HOMEPATH) in the allowlist, then add a Windows-specific test covering CLI invocation under sanitized env.

  2. Severity: low
    File: internal/server/server_test.go:365
    assertTimeoutRace validates JSON Content-Type only for 504, not for 503. That weakens contract coverage for timeout middleware responses and can miss regressions where 503 returns non-JSON.
    Suggested fix: in the 503 branch, also assert Content-Type == application/json and validate/parsing of {"error":"request timed out"}.

  3. Severity: low
    File: internal/insight/generate_test.go:172
    TestCleanEnv asserts exact-case keys (PATH, HOME, etc.) from os.Environ(). On Windows, env keys are case-insensitive and often returned with different casing (Path), so this test can be brittle or misleading cross-platform.
    Suggested fix: normalize env map keys to upper-case before assertions, and/or assert with case-insensitive lookup.

Agent: gemini | Type: default | Status: failed

Error: Review failed. Check CI logs for details.

…est keys

- Add PATHEXT, WINDIR, HOMEDRIVE, HOMEPATH to env allowlist so agent
  CLIs can resolve executables and config paths on Windows
- assertTimeoutRace now checks Content-Type for both 503 and 504
- TestCleanEnv normalizes env map keys to uppercase for
  cross-platform correctness (Windows returns Path not PATH)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (67e31a4)

Synthesis unavailable. Showing raw review outputs.

Agent: codex | Type: security | Status: done

Summary: I reviewed the full diff across the 12 commits with focus on injection, auth/access control, credential handling, path traversal, unsafe patterns, CI/CD risk, data exposure, concurrency, and error handling. The key security-relevant change (internal/insight/generate.go) moves from env denylisting to allowlisting and reduces secret leakage to agent subprocesses; I did not identify a new exploitable vulnerability in the introduced code.

No issues found.

Agent: gemini | Type: security | Status: failed

Error: Review failed. Check CI logs for details.

Agent: codex | Type: default | Status: done

Reviewing the provided commit-range diff directly for correctness, security, regressions, and test coverage. I’ll inspect the touched server/insight paths for behavior changes and then report findings with severity and file references.
I hit path mismatches in this checkout, so I’m locating the corresponding files before continuing the diff review to ensure line references are accurate.
I’m now checking for subtle regressions from the env allowlist and timeout-race handling, since those are the highest-risk behavior changes in this diff.
Brief summary: these commits harden agent subprocess environment handling (denylist -> allowlist, Windows casing/runtime vars), make insight generation injectable for tests, adjust timeout/deadline handling to tolerate 503/504 race behavior, surface agent name in the insights UI, and fix .roborev.toml schema shape.

  1. Medium - Potential auth regression from strict env allowlist
    File: internal/insight/generate.go:61, internal/insight/generate.go:109, internal/insight/generate.go:123
    cleanEnv() now strips all non-allowlisted vars before launching claude/codex/gemini. That also strips common env-based auth vars (for example OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY), which can break users who authenticate CLIs via environment instead of local credential stores.
    Suggested fix: allow explicit pass-through for agent-auth env vars (or an opt-in config), and document the behavior.

  2. Low - API error-message contract changed for search timeouts
    File: internal/server/search.go:64, internal/server/response.go:30
    handleSearch moved from endpoint-specific "search timed out" to shared "gateway timeout" via handleContextError. Status is still correct, but clients/tests keyed to message text may regress.
    Suggested fix: keep endpoint-specific timeout text in handleSearch, or codify that clients must rely on HTTP status (not error text).

  3. Low (testing gap) - Missing behavioral test around real generator env requirements
    File: internal/insight/generate_test.go:158
    Tests validate key filtering logic, but there’s no integration-style check that each generator path (generateClaude, generateCodex, generateGemini) still has all required runtime env on each OS after filtering.
    Suggested fix: add generator-level tests (with exec stubs) asserting expected env composition per command, plus at least one Windows CI validation path.

Agent: gemini | Type: default | Status: failed

Error: Review failed. Check CI logs for details.

Remove cmd.Env = cleanEnv() from all three generator functions.
The cleanEnv helper and its tests are retained for potential future
use but no longer applied to agent CLI subprocesses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (363de8d)

Summary Verdict: The codebase has one medium severity issue regarding unused environment variable filtering that exposes subprocesses to sensitive
credentials.

Medium

Env allowlist is implemented but not enforced in subprocess execution

  • Location: internal/insight/generate.go (lines 64, 103, 122, 164, 293)
  • Description: The
    cleanEnv function and its associated allowlist (allowedKeyPrefixes, envKeyAllowed) are defined to prevent sensitive credentials (e.g., ANTHROPIC_API_KEY) from being leaked to agent CLI subprocesses. However, these functions are never used to sanitize the environment. Because the exec. Cmd objects constructed to invoke the agent CLIs do not have their Env field set, they default to inheriting the entire unrestricted environment of the parent process. This defeats the intended security control, exposing all environment variables to third-party tools, and leaves dead code in the file.
  • Suggested Remediation:
    Apply cleanEnv() to the Env field of the exec.Cmd when constructing the commands for the agent CLIs (e.g., add cmd.Env = cleanEnv()). Alternatively, if full-environment inheritance is intentional, remove the unused functions and their associated tests.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

@wesm wesm merged commit d17bb69 into main Feb 24, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant