research(security): OpenClaw security analysis — prompt injection, tool hijacking, and exfiltration vectors in reference agent (arXiv:2603.11619) #2500
Description
Summary
arXiv:2603.11619 — Taming OpenClaw: Security Analysis and Mitigation (March 2026)
Direct competitive-parity value: independent security analysis of OpenClaw, one of Zeph's primary reference agents. Documents concrete attack vectors and proposed mitigations.
Findings relevant to Zeph:
- Skill injection via channel messages: malicious instructions embedded in Telegram/Slack messages trigger unintended skill execution. OpenClaw lacks per-channel skill whitelisting.
  - Zeph status: `zeph-channels` has no per-channel skill allowlist; the Telegram handler passes messages to core without channel-level filtering
- Tool output exfiltration via markdown: tool results containing `[text](https://attacker.com/...)` links are rendered and auto-fetched in some clients
  - Zeph status: web scrape tool output is inserted into context without a URL allowlist check
- Credential leakage via debug output: verbose debug logging exposes vault-resolved API keys in plaintext
  - Zeph status: `zeph-core` has a `redact` module but it may not cover all debug paths (check `RUST_LOG=debug` output)
- SKILL.md header injection: attacker uploads a SKILL.md with injected system-level instructions in the `description` field
  - Zeph status: SKILL.md parser does not sanitize the `description` field before use in prompt context
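To make the markdown exfiltration finding concrete, here is a minimal sketch of a host allowlist applied to tool output before it enters context. It is not Zeph's code: `host_of`, `is_allowed`, `defang_links`, and the `[link removed]` marker are hypothetical names, and the hand-rolled parsing is a simplification that assumes only inline `[text](url)` and `![alt](url)` links need handling. A real implementation would more likely use a proper URL and markdown parser.

```rust
/// Extract the host of an http(s) URL using only std (hypothetical helper).
/// Anything that is not http:// or https:// yields None and is denied.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop userinfo ("user@host") so "allowed.com@evil.com" resolves to evil.com.
    let host = authority.rsplit('@').next()?;
    // Drop the port. Bracketed IPv6 hosts are mangled here and end up denied.
    let host = host.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// True only when the URL's host exactly matches an allowlisted host.
fn is_allowed(url: &str, allowed_hosts: &[&str]) -> bool {
    match host_of(url) {
        Some(host) => allowed_hosts.iter().any(|a| host.eq_ignore_ascii_case(a)),
        None => false,
    }
}

/// Keep allowlisted inline links intact; reduce every other link to its text.
pub fn defang_links(input: &str, allowed_hosts: &[&str]) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(open) = rest.find('[') {
        out.push_str(&rest[..open]);
        let after_open = &rest[open + 1..];
        // Try to read `text](url)` starting right after the '['.
        let link = after_open.find("](").and_then(|mid| {
            let text = &after_open[..mid];
            let tail = &after_open[mid + 2..];
            let close = tail.find(')')?;
            if text.contains('[') {
                return None; // nested bracket: treat the first '[' as literal
            }
            Some((text, &tail[..close], &tail[close + 1..]))
        });
        match link {
            Some((text, url, remainder)) => {
                if is_allowed(url.trim(), allowed_hosts) {
                    out.push('[');
                    out.push_str(text);
                    out.push_str("](");
                    out.push_str(url);
                    out.push(')');
                } else {
                    // Also drop the '!' of an image link, which clients auto-fetch.
                    if out.ends_with('!') {
                        out.pop();
                    }
                    out.push_str(text);
                    out.push_str(" [link removed]");
                }
                rest = remainder;
            }
            None => {
                out.push('[');
                rest = after_open;
            }
        }
    }
    out.push_str(rest);
    out
}
```

With `allowed_hosts = ["docs.rs"]`, a link to `https://docs.rs/x` passes through unchanged, while a link to `https://attacker.com/?q=secret` is reduced to its visible text. The check fails closed: non-HTTP schemes and hosts it cannot parse are treated as disallowed.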
Proposed Actions
- Per-channel skill allowlist config: `[telegram.skills] allowed = ["*"]` with optional restriction
- URL allowlist for web scrape tool output rendered into context
- Audit `RUST_LOG=debug` output for vault key leakage (include in next live test session)
- Sanitize SKILL.md `description` and `instructions` fields (strip prompt injection patterns) before context insertion
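The per-channel allowlist action could be enforced at dispatch time roughly as follows. This is a sketch under stated assumptions, not the `zeph-channels` API: `SkillPolicy`, `ChannelPolicies`, and the skill names are hypothetical, and config parsing is left out. It assumes `"*"` means "any skill" and that a channel with no entry falls back to that same permissive default, mirroring the proposed `allowed = ["*"]`.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory form of a channel's `allowed = [...]` list.
#[derive(Debug, Clone)]
pub enum SkillPolicy {
    /// Corresponds to `allowed = ["*"]`.
    Any,
    /// Only the named skills may be triggered from this channel.
    Only(Vec<String>),
}

impl SkillPolicy {
    /// Build a policy from the raw list; a "*" entry anywhere means Any.
    pub fn from_allowed(allowed: &[&str]) -> Self {
        if allowed.iter().any(|s| *s == "*") {
            SkillPolicy::Any
        } else {
            SkillPolicy::Only(allowed.iter().map(|s| s.to_string()).collect())
        }
    }

    pub fn permits(&self, skill: &str) -> bool {
        match self {
            SkillPolicy::Any => true,
            SkillPolicy::Only(names) => names.iter().any(|n| n == skill),
        }
    }
}

/// Hypothetical registry mapping channel name to its skill policy.
pub struct ChannelPolicies {
    by_channel: HashMap<String, SkillPolicy>,
}

impl ChannelPolicies {
    pub fn new() -> Self {
        ChannelPolicies { by_channel: HashMap::new() }
    }

    pub fn set(&mut self, channel: &str, policy: SkillPolicy) {
        self.by_channel.insert(channel.to_string(), policy);
    }

    /// Assumption: unconfigured channels behave like `allowed = ["*"]`.
    pub fn permits(&self, channel: &str, skill: &str) -> bool {
        self.by_channel
            .get(channel)
            .map_or(true, |policy| policy.permits(skill))
    }
}
```

A channel handler would call `permits(channel, skill)` before passing a matched skill to core and drop or log the request on `false`. Defaulting unconfigured channels to "any" keeps existing deployments working, at the cost of being fail-open; a stricter deployment could flip that default.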
References
- Paper: https://arxiv.org/abs/2603.11619
- Related:
  - #2496 (MCP/A2A threat modeling): research(security): MCP/A2A protocol threat modeling — shadowing attacks, privilege escalation, coarse-grained tokens (arXiv:2602.11327)
  - #2497 (SMCP hardening): research(security): SMCP — formal security hardening for MCP lifecycle (tool poisoning, stolen credentials, privilege escalation) (arXiv:2602.01129)
  - #2480 (prompt injection taxonomy): research(security): cross-tool prompt injection taxonomy — 7 MCP clients evaluated, static validation insufficient (arXiv:2603.21642)