fix(security): ContentSanitizer false positive on Cargo.toml (flags=9) #2515
Closed
Labels
P2 (High value, medium complexity), bug (Something isn't working)
Description
Summary
ContentSanitizer generates false positives when reading standard project files like Cargo.toml and README.md, causing the ExfiltrationGuard to block Qdrant memory writes for legitimate content.
Reproduction
Run the agent with testing.toml and ask it to read Cargo.toml multiple times:
What is in Cargo.toml? Read it again. Now read it one more time.
Observed behavior
WARN zeph_core::agent::tool_execution: injection patterns detected in tool output tool=read flags=9
WARN zeph_core::agent::persistence: exfiltration guard: skipping Qdrant embedding for flagged content event=MemoryWriteGuarded { reason: "content contained injection patterns flagged by ContentSanitizer" }
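The log reports flags=9 without naming the patterns that fired. If flags is a bitmask with one bit per pattern (an assumption; the issue does not state the encoding), then 9 = 0b1001 means two patterns matched, at bit positions 0 and 3. A minimal sketch for decoding such a value while triaging; the helper name `set_bits` is hypothetical and not part of zeph_core:

```rust
/// Return the indices of the set bits in `flags`, lowest first.
/// ASSUMPTION: `flags` is a bitmask with one bit per security pattern.
pub fn set_bits(flags: u32) -> Vec<u32> {
    (0..u32::BITS)
        .filter(|&i| flags & (1u32 << i) != 0)
        .collect()
}
```

Mapping those bit positions back to the entries in SecurityPatterns would show which regexes need tuning.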
Expected behavior
Cargo.toml and other standard project files should not trigger injection pattern detection. The SecurityPatterns regex set is over-broad and matches legitimate TOML/Markdown syntax (e.g., [workspace], [dependencies], URLs in README badges, shell commands in code blocks).
Impact
- Semantic memory writes for code-related tasks are silently dropped
- STEM skill learning is degraded because tool output from file reads is not embedded
- ExfiltrationGuard is causing data loss, not just security enforcement
Root cause hypothesis
The 17 LazyLock regex patterns in SecurityPatterns likely match:
- TOML section headers [...] as potential prompt injection delimiters
- Shell command examples in README code blocks
- URL patterns in badge links
Suggested fix
- Tune SecurityPatterns to exclude TOML/Markdown structural syntax
- Add a content-type hint from the tool executor (e.g., tool=read path=*.toml) so the sanitizer can apply path-based allowlists
- Add a metric for false-positive rate (injection flags on known-safe file types)
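The second and third suggestions could be sketched roughly as follows. This is a std-only illustration, not the actual zeph_core API: `ContentHint`, `hint_for_path`, `should_guard_write`, and `FALSE_POSITIVE_CANDIDATES` are all hypothetical names, and treating .toml and .md as known-safe is an assumption taken from the files named in this issue.

```rust
use std::path::Path;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical hint the tool executor would pass to the sanitizer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentHint {
    /// Standard project file (here: *.toml, *.md) treated as known-safe.
    KnownSafeProjectFile,
    /// Anything else: the full pattern set applies.
    Untrusted,
}

/// Classify tool output by the path the `read` tool was asked for.
pub fn hint_for_path(path: &Path) -> ContentHint {
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) if ext.eq_ignore_ascii_case("toml") || ext.eq_ignore_ascii_case("md") => {
            ContentHint::KnownSafeProjectFile
        }
        _ => ContentHint::Untrusted,
    }
}

/// Counts injection flags raised on known-safe file types
/// (a rough false-positive signal, not a true rate).
pub static FALSE_POSITIVE_CANDIDATES: AtomicU64 = AtomicU64::new(0);

/// Decide whether the memory write should be skipped.
/// `flags` is the sanitizer result; zero means nothing matched.
pub fn should_guard_write(flags: u32, hint: ContentHint) -> bool {
    if flags == 0 {
        return false;
    }
    match hint {
        ContentHint::KnownSafeProjectFile => {
            // Record the suspected false positive and let the write through.
            FALSE_POSITIVE_CANDIDATES.fetch_add(1, Ordering::Relaxed);
            false
        }
        ContentHint::Untrusted => true,
    }
}
```

Bypassing the guard entirely for allowlisted extensions is the simplest option but also the riskiest, since a README can carry real injection text. A more conservative variant would apply a reduced pattern set to known-safe files instead of skipping the check, while still incrementing the counter.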
Session
CI-351 background task b2ysiegti, 2026-03-31