research(security): OpenClaw security analysis — prompt injection, tool hijacking, and exfiltration vectors in reference agent (arXiv:2603.11619)

## Summary

arXiv:2603.11619 — *Taming OpenClaw: Security Analysis and Mitigation* (March 2026)

Why it matters for competitive parity: this is an independent security analysis of OpenClaw, one of Zeph's primary reference agents. It documents concrete attack vectors and proposes mitigations.

**Findings relevant to Zeph:**

1. **Skill injection via channel messages**: malicious instructions embedded in Telegram/Slack messages trigger unintended skill execution. OpenClaw lacks a per-channel skill allowlist.
   - *Zeph status*: `zeph-channels` has no per-channel skill allowlist; the Telegram handler passes messages to core without channel-level filtering.

2. **Tool output exfiltration via markdown**: tool results containing `[text](https://attacker.com/...)` links are rendered and auto-fetched in some clients.
   - *Zeph status*: web scrape tool output is inserted into context without a URL allowlist check.

3. **Credential leakage via debug output**: verbose debug logging exposes vault-resolved API keys in plaintext.
   - *Zeph status*: `zeph-core` has a `redact` module, but it may not cover all debug paths (check `RUST_LOG=debug` output).

4. **SKILL.md header injection**: an attacker uploads a SKILL.md with injected system-level instructions in the `description` field.
   - *Zeph status*: the SKILL.md parser does not sanitize the `description` field before it is used in prompt context.
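
To make finding 2 concrete, here is a minimal std-only sketch of a host allowlist check over inline markdown links in tool output. The function names (`markdown_link_targets`, `host_of`, `disallowed_links`) are hypothetical and are not existing Zeph APIs. The hand-rolled URL handling is an assumption made to keep the example dependency-free; a real implementation would likely use a proper markdown and URL parser.

```rust
/// Collects the destination of every inline markdown link `[text](url)`.
/// Simplified scanner: ignores reference-style links, autolinks, and
/// URLs that themselves contain parentheses.
fn markdown_link_targets(text: &str) -> Vec<&str> {
    let mut targets = Vec::new();
    let mut rest = text;
    while let Some(pos) = rest.find("](") {
        let after = &rest[pos + 2..];
        match after.find(')') {
            Some(end) => {
                targets.push(after[..end].trim());
                rest = &after[end + 1..];
            }
            None => break,
        }
    }
    targets
}

/// Returns the host of an http(s) URL, or `None` for anything else.
/// Assumption: lowercase scheme, no IPv6 literals.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop optional `user:pass@` userinfo, then the optional `:port`.
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Returns every link target whose host is not on the allowlist.
/// Targets that are not parseable http(s) URLs are also returned (fail closed).
fn disallowed_links<'a>(text: &'a str, allowed_hosts: &[&str]) -> Vec<&'a str> {
    markdown_link_targets(text)
        .into_iter()
        .filter(|url| match host_of(url) {
            Some(host) => !allowed_hosts.iter().any(|a| a.eq_ignore_ascii_case(host)),
            None => true,
        })
        .collect()
}
```

A caller could run `disallowed_links(tool_output, &["docs.rs"])` before inserting scrape results into context and strip or neutralize whatever it returns. Failing closed on unparseable targets is a design choice for this sketch, not something the paper prescribes.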

## Proposed Actions

1. Add a per-channel skill allowlist to the config: `[telegram.skills] allowed = ["*"]`, with optional restriction to named skills
2. Add a URL allowlist for web scrape tool output rendered into context
3. Audit `RUST_LOG=debug` output for vault key leakage (include in next live test session)
4. Sanitize SKILL.md `description` and `instructions` fields (strip prompt injection patterns) before context insertion
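
As a rough illustration of action 1, the sketch below models how an `allowed` list could be evaluated per channel. All type and method names (`SkillPolicy`, `ChannelSkillPolicies`, `set`, `permits`) are hypothetical and do not reflect the current `zeph-channels` API. It assumes that `"*"` means "allow every skill" and that a channel with no configured list stays unrestricted, which matches the `allowed = ["*"]` example above.

```rust
use std::collections::HashMap;

/// Skill policy for a single channel.
#[derive(Debug, Clone, PartialEq)]
enum SkillPolicy {
    /// Corresponds to `allowed = ["*"]`.
    AllowAll,
    /// Only the listed skill names may run.
    Only(Vec<String>),
}

impl SkillPolicy {
    /// Builds a policy from the raw `allowed` list of a channel section.
    fn from_allowed(allowed: &[&str]) -> Self {
        if allowed.iter().any(|s| *s == "*") {
            SkillPolicy::AllowAll
        } else {
            SkillPolicy::Only(allowed.iter().map(|s| s.to_string()).collect())
        }
    }

    fn permits(&self, skill: &str) -> bool {
        match self {
            SkillPolicy::AllowAll => true,
            SkillPolicy::Only(names) => names.iter().any(|n| n == skill),
        }
    }
}

/// Policies keyed by channel name, e.g. "telegram" or "slack".
#[derive(Debug, Default)]
struct ChannelSkillPolicies {
    per_channel: HashMap<String, SkillPolicy>,
}

impl ChannelSkillPolicies {
    fn set(&mut self, channel: &str, allowed: &[&str]) {
        self.per_channel
            .insert(channel.to_string(), SkillPolicy::from_allowed(allowed));
    }

    /// Assumption: channels without an explicit policy remain unrestricted,
    /// so existing configs keep working after the option is introduced.
    fn permits(&self, channel: &str, skill: &str) -> bool {
        match self.per_channel.get(channel) {
            Some(policy) => policy.permits(skill),
            None => true,
        }
    }
}
```

With `policies.set("telegram", &["weather", "notes"])`, a call to `policies.permits("telegram", "shell")` returns `false`, while an unconfigured channel still permits every skill. A stricter deployment might prefer deny-by-default for unconfigured channels; that trade-off is left open here.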

## References
- Paper: https://arxiv.org/abs/2603.11619
- Related: #2496 (MCP/A2A threat modeling), #2497 (SMCP hardening), #2480 (prompt injection taxonomy)
