Skip to content

fix: sandbox insight agents, add Copilot support (#175)#176

Merged
wesm merged 8 commits intomainfrom
fix/issue-175
Mar 15, 2026
Merged

fix: sandbox insight agents, add Copilot support (#175)#176
wesm merged 8 commits intomainfrom
fix/issue-175

Conversation

@wesm
Copy link
Copy Markdown
Owner

@wesm wesm commented Mar 15, 2026

Summary

  • Sandbox all insight agent CLIs (Claude, Codex, Copilot, Gemini) by
    setting cmd.Dir to a temp directory, filtering env vars through an
    allowlist, and disabling tool access
  • Fix Codex failing outside git repos with --skip-git-repo-check
  • Add Copilot CLI as a new insights agent
  • Reframe custom prompt section from "Additional Context" to "User Query"
    with directive framing
  • Preserve provider auth env vars (API keys, tokens) in the allowlist so
    env-based authentication still works

Test plan

  • All insight unit tests pass (go test ./internal/insight/)
  • Server tests pass (go test ./internal/server/)
  • Verified claude, codex, copilot, and gemini CLIs all
    accept the new flags against real binaries
  • Auth env vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) preserved
    through cleanEnv
  • Manual: generate an insight with each agent in the UI

Closes #175

Generated with Claude Code

wesm and others added 4 commits March 15, 2026 16:04
- Claude: use cleanEnv() instead of os.Environ() to prevent leaking
  secrets to subprocess; add --tools "" to disable all tool access;
  add --no-session-persistence for throwaway sessions
- Codex: add --skip-git-repo-check and --ephemeral flags; set
  cleanEnv() to prevent secret leakage
- Gemini: add --sandbox flag to enable sandboxed execution; set
  cleanEnv() for consistency
- All agents: set cmd.Dir to os.TempDir() so they run from a known
  safe directory instead of wherever agentsview was launched
- Prompt: rename "Additional Context" to "User Query" with framing
  that directs the agent to prioritize the user's specific request

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add generateCopilot() using -p (prompt as arg), --silent,
  --no-custom-instructions, --no-ask-user, --available-tools
  (no tools) for sandboxed non-interactive execution
- Plain text output parsed from stdout (no JSON needed)
- Update ValidAgents, server validation, AgentName type, and
  agent dropdown to include copilot
- Add TestGenerateCopilot_CLIFlags and TestGenerateCopilot_EmptyResult

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY,
GOOGLE_API_KEY, GOOGLE_APPLICATION_CREDENTIALS, GITHUB_TOKEN,
GH_TOKEN, and COPILOT_ prefix to the env allowlist so users
who authenticate via environment variables (including desktop.env)
can generate insights without auth failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace --available-tools (which doesn't effectively restrict tools)
with --disable-builtin-mcps to prevent MCP server access. Built-in
tools remain available but are mitigated by running from a temp
directory with a filtered environment and --no-ask-user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 15, 2026

roborev: Combined Review (7467c1b)

The PR successfully adds Copilot support and tightens subprocess environments, but requires several medium-severity fixes for credential scoping, output formatting, argument length limits,
and a regression in Claude's configuration.

Medium

Output Formatting Regression in Copilot

  • Locations: generate.go:191,
    generate.go:495, [generate.go:518](/home/roborev/.roborev/clones/wes
    m/agentsview/internal/insight/generate.go#L518)
  • Description: collectStreamLines() drops blank lines (if trimmed != "") and rejoins the remaining lines with a single \n. generateCopilot() uses that helper as its result buffer, so multi
    -paragraph Copilot markdown loses paragraph breaks and formatting before it is returned.
  • Recommendation: Keep a raw stdout buffer for the returned content and use line-based collection only for log streaming. Add a regression test covering multi-line output.

Broad Credential Exposure in Subprocesses

  • Locations:
    generate.go:106, [generate.go:154](/home/roborev/.roborev/clones/wes
    m/agentsview/internal/insight/generate.go#L154)
  • Description: cleanEnv() forwards multiple API keys and credentials globally, and that same environment is reused for every insight agent subprocess. This violates least privilege by handing unrelated provider and GitHub credentials to the selected CLI, widening
    credential exposure if a binary behaves unexpectedly.
  • Recommendation: Build the child environment per agent instead of globally. Pass only the credentials required by the selected CLI, and replace the broad COPILOT_ prefix with an explicit allowlist.

**Missing CLAUDE_NO_SOUND=1 Environment Variable
**

  • Location: internal/insight/generate.go:234
  • Description: Regression: CLAUDE_NO_SOUND=1 was accidentally removed when cmd.Env was updated to use cleanEnv(), which may cause the Claude CLI to play audible
    chimes during generation.
  • Recommendation: Restore the environment variable, e.g., cmd.Env = append(cleanEnv(), "CLAUDE_NO_SOUND=1").

Command-Line Argument Length Limits and Prompt Exposure

  • Locations: internal/insight/generate. go:464 (and prompt.go:68)
  • Description: Passing a potentially large prompt via a command-line argument (-p prompt) risks hitting the OS argument length limits (E2BIG), especially on Windows which has a ~32KB limit. Additionally, command-line
    arguments are exposed to local process inspection tools, which could expose sensitive session context.
  • Recommendation: If the Copilot CLI supports it, pass the prompt via a temporary file or stdin to avoid length limits and local process inspection. Otherwise, explicitly document the limitation and handle the resulting execution errors gracefully.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

- Resolve path-valued env vars (GOOGLE_APPLICATION_CREDENTIALS,
  CURL_CA_BUNDLE) to absolute paths in cleanEnv() so they remain
  valid when subprocesses run from os.TempDir()
- Use --config-dir with a temporary empty directory for copilot to
  prevent user-configured MCP servers in ~/.copilot/ from loading
- Add TestCleanEnv_NormalizesRelativePaths regression test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 15, 2026

roborev: Combined Review (e3be905)

Summary Verdict: The PR successfully integrates Copilot and improves subprocess sandboxing, but introduces a High severity bug in environment variable allowlisting and a Medium severity regression in path normalization.

High

  • internal/insight/
    generate.go (around line 120)

    CURL_CA_BUNDLE is added to pathValuedKeys but was omitted from allowed KeyPrefixes. Because cleanEnv drops any key not in the allowlist before checking pathValuedKeys, CURL_CA_BUNDLE will be stripped entirely. This logic error also causes the new TestCleanEnv_NormalizesRelativePaths test to fail unconditionally. Add " CURL_CA_BUNDLE" to the allowedKeyPrefixes array.

Medium

  • internal/insight/generate.go
    and internal/insight/generate.go

    The temp-cwd change means any allowlisted env var that contains a relative filesystem path now resolves against
    /tmp, but pathValuedKeys only fixes GOOGLE_APPLICATION_CREDENTIALS and CURL_CA_BUNDLE. Common allowlisted path vars such as SSL_CERT_FILE, SSL_CERT_DIR, and XDG directory vars are still left relative, so TLS/
    config resolution can regress for users who launch agentsview with relative paths in those vars. Expand the normalization to all allowlisted path-valued vars, or avoid changing cmd.Dir for agents that depend on caller-relative config. This also needs a test that covers at least SSL_CERT_FILE .

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 15, 2026

roborev: Combined Review (0993623)

Verdict: The PR successfully adds Copilot as an insight agent and implements sandboxing, but there are issues with
environment variable filtering and output parsing that require attention.

High

  • internal/insight/generate.go (around line 119)
    CURL_CA_BUNDLE is included in pathValuedKeys and explicitly tested for, but it was not added to the
    allowedKeyPrefixes slice. Consequently, cleanEnv() will completely filter out CURL_CA_BUNDLE, causing tests to fail and breaking custom certificate paths.
    Fix: Add "CURL_CA_BUNDLE" to allowedKeyPrefixes.

Medium

  • internal/insight /generate.go (around lines 120-130, 494-502)
    Over-broad credential forwarding to child CLIs. The new shared cleanEnv() allowlist forwards all provider credentials (e.g., ANTHROPIC_API_KEY , OPENAI_API_KEY, GEMINI_API_KEY, GITHUB_TOKEN) and any COPILOT_* variable to every insight agent process. This exposes unrelated provider credentials to other binaries. Furthermore, the generic COPILOT_ prefix can leak non-auth config
    variables into the subprocess, undermining the intended isolation.
    Fix: Build per-agent environment allowlists and pass only the minimum, specifically-named auth variables required by that particular CLI.

  • internal/insight/generate.go (around lines 176-205, 5
    25-533)

    In generateCopilot, the subprocess output is collected using collectStreamLines(), which drops empty lines. Since Copilot stdout is plain text, dropping empty lines will collapse multi-paragraph markdown and code-block spacing in generated insights.
    Fix: Read
    Copilot stdout raw or update the collector to preserve blank lines for plain-text agents.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Remove the cleanEnv() allowlist, path normalization, and config-dir
isolation — all of which introduced more problems than they solved.

Agent CLIs need the full parent environment for provider auth (API
keys, tokens, config paths) that varies across providers, users,
and deployment methods. Sandboxing is handled by CLI flags (--tools,
--sandbox, --disable-builtin-mcps, etc.), not env filtering.

Also fix copilot stdout collection: use io.ReadAll instead of
collectStreamLines to preserve blank lines in plain text output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 15, 2026

roborev: Combined Review (0649a86)

Verdict: The PR successfully introduces new insight agents and sandboxing, but contains medium-severity regressions related to environment variable path resolution and incomplete Copilot sandboxing.

Medium

Environment
Variable Path Resolution Regression

  • Location: internal/insight/generate.go:100-112, :183-185, :274-276, :422-423
  • Description: The parent environment is passed through
    unchanged via agentEnv(), while forcing every agent CLI to run from os.TempDir(). This introduces a path-resolution regression for any environment variable whose value is a relative file path (such as GOOGLE_APPLICATION_CREDENTIALS or CURL_CA_BUNDLE). These paths now
    resolve relative to /tmp instead of the original process cwd, causing auth/TLS setup that works in the parent process to fail inside insight generation. Additionally, the earlier regression test that covered this case was dropped.
  • Suggested Fix: Keep the full environment if needed, but normalize known path-valued variables to absolute paths before
    assigning to cmd.Env, or stop changing cmd.Dir.

Incomplete Copilot Sandboxing

  • Location: internal/insight/generate.go:414-423
  • Description: Copilot sandboxing is incomplete. The --disable-builtin- mcps flag is present, but the final version no longer isolates Copilot from user configuration via --config-dir. Consequently, user-configured MCP servers in ~/.copilot can still be loaded during insight generation. This makes Copilot behavior non-deterministic and bypasses the intended "sandboxed insight agent" model
    .
  • Suggested Fix: Restore an empty temporary --config-dir alongside --disable-builtin-mcps, and add a test that asserts the flag is present.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Remove cmd.Dir = os.TempDir() from all agent invocations. The
original "rogue folder" problem is solved by CLI flags
(--skip-git-repo-check for codex, -p mode for claude). Changing
the working directory broke relative path env vars
(GOOGLE_APPLICATION_CREDENTIALS, CURL_CA_BUNDLE, SSL_CERT_FILE)
without adding real value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 15, 2026

roborev: Combined Review (0228e8d)

Summary Verdict
The PR adds Cop
ilot as an insight agent and updates related configurations, but introduces medium-severity risks regarding environment isolation and command-line length limits.

Findings

Medium

  • [internal/insight/generate.go:409-412](/home/roborev/.roborev/clones/
    wesm/agentsview/internal/insight/generate.go#L409-L412)
    : The generateCopilot() function launches Copilot with --disable-builtin-mcps but fails to adequately isolate the process. By inheriting the full parent environment and lacking an isolated --config -dir or safe cmd.Dir, insight generation can fall back to the user's normal Copilot configuration and current working directory. This makes the execution machine-dependent and potentially exposes broader tool/MCP access than intended.
    • Suggested fix: Restore hard isolation by providing an empty temporary directory for
      --config-dir, setting cmd.Dir to a safe temporary directory, and ensuring all tools/MCPs are strictly disabled.
  • [internal/insight/generate.go:412](/home/roborev/.roborev/clones/wesm/agentsview/internal/insight
    /generate.go#L412)
    : Passing the full prompt directly as a command-line argument ("-p", prompt) risks exceeding OS command-line length limits (e.g., 32KB on Windows, MAX_ARG_STRLEN on Linux) when dealing with large
    session contexts.
    • Suggested fix: Modify the invocation to read the prompt from a temporary file (e.g., -f <file>) if supported by the Copilot CLI, or truncate the prompt length to remain safely within OS limits.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@wesm wesm merged commit 3b510d7 into main Mar 15, 2026
9 checks passed
wesm added a commit that referenced this pull request Mar 16, 2026
## Summary

- Sandbox all insight agent CLIs (Claude, Codex, Copilot, Gemini) by
  setting `cmd.Dir` to a temp directory, filtering env vars through an
  allowlist, and disabling tool access
- Fix Codex failing outside git repos with `--skip-git-repo-check`
- Add Copilot CLI as a new insights agent
- Reframe custom prompt section from "Additional Context" to "User
Query"
  with directive framing
- Preserve provider auth env vars (API keys, tokens) in the allowlist so
  env-based authentication still works

## Test plan

- [x] All insight unit tests pass (`go test ./internal/insight/`)
- [x] Server tests pass (`go test ./internal/server/`)
- [x] Verified `claude`, `codex`, `copilot`, and `gemini` CLIs all
  accept the new flags against real binaries
- [x] Auth env vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) preserved
  through cleanEnv
- [ ] Manual: generate an insight with each agent in the UI

Closes #175

Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Insights tries to run agents in rogue folders

1 participant