Open
Description
When reviewing patches from untrusted contributors, the agent has filesystem read/write access without prompting (--permission-mode acceptEdits), access to Bash in orc.md and review.md, full environment inheritance (Anthropic key, GitHub tokens, SSH, git credentials, etc.), and outbound network access (Claude API, lore.kernel.org).
The untrusted input -- commit messages, code comments, lore email threads -- could instruct Claude to exfiltrate secrets via Bash and the network. Everything is also logged to review.json. This is mostly a concern for future automated CI pipelines that review patches from external contributors.
I would propose:
- Add a minimal image at kernel/scripts/Dockerfile. This would also provide greater reproducibility.
- kernel/scripts/run-container.sh wraps review_one.sh/agent_one.sh, passing only the Anthropic key and the Claude model.
- Firewall rules limit egress to Anthropic's API and lore.kernel.org.
- For extra security, the API key can be kept outside the container entirely, with a proxy reviewer forwarding requests to Anthropic's API.
- Optionally, a dedicated agent reviews patches for potential injection attempts before the main analysis runs.
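To illustrate the "passing only Anthropic key and Claude Model" idea, here is a minimal sketch of the environment allowlist that a wrapper like run-container.sh could apply. It is written in Python only for illustration. The variable names ANTHROPIC_API_KEY and ANTHROPIC_MODEL, the image tag kernel-review:local, and the helper names filter_env and docker_argv are assumptions, not part of the existing scripts.

```python
from typing import Dict, List, Mapping, Sequence

# Assumed variable names; the proposal only says "Anthropic key and Claude Model".
ALLOWED_ENV = ("ANTHROPIC_API_KEY", "ANTHROPIC_MODEL")


def filter_env(host_env: Mapping[str, str],
               allowed: Sequence[str] = ALLOWED_ENV) -> Dict[str, str]:
    """Keep only allowlisted variables that are set on the host.

    Everything else (GitHub tokens, SSH agent sockets, git credentials)
    is dropped rather than inherited.
    """
    return {name: host_env[name] for name in allowed if name in host_env}


def docker_argv(image: str, command: Sequence[str],
                env: Mapping[str, str]) -> List[str]:
    """Build a `docker run` command line that forwards variables by name.

    `--env NAME` without a value tells docker to copy NAME from the
    environment of the docker client process, so the secret value never
    appears in the argument list.
    """
    argv = ["docker", "run", "--rm"]
    for name in sorted(env):
        argv += ["--env", name]
    return argv + [image, *command]


if __name__ == "__main__":
    import os

    env = filter_env(os.environ)
    # Hypothetical image tag and entry point, shown without running anything.
    print(docker_argv("kernel-review:local", ["./review_one.sh"], env))
```

Because the variables are forwarded by name, the docker client itself has to be started with the filtered environment (plus whatever it needs to run, such as PATH). The egress restriction from the firewall bullet is a separate layer and is not covered by this sketch.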
I understand CI is not a current concern, but I am leaving this issue open for when the time comes.