Commit db450db
feat: harden Conversation Guardian with evasion resistance, thread safety, transcript audit
- Add normalize_text() for evasion resistance: homoglyphs, leetspeak, zero-width chars, fullwidth, combining diacritics
- Add thread safety with threading.Lock() on all shared state
- Add TranscriptEntry with SHA-256 content hashing and configurable max entries
- Expand escalation patterns from 20 to 34 (circumvent, break through, by any means, etc.)
- Expand offensive patterns from 23 to 42 (SQL injection, RCE, lateral movement, DNS tunneling, etc.)
- Dual-text matching: patterns run on both original and normalized text to prevent false negatives
- Add 31 new tests: evasion (8), thread safety (2), transcript audit (9), new patterns (12)
- All 99 tests passing (79 guardian + 20 existing)
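
The sketch below illustrates how the normalization, dual-text matching, and locked transcript described above could fit together. It is not the code from this commit: the lookup tables, the two sample patterns, the `matches()` helper, the `Transcript` class, and the `TranscriptEntry` field names are all assumptions made for illustration. Only the names `normalize_text` and `TranscriptEntry` come from the commit message.

```python
import hashlib
import re
import threading
import unicodedata
from collections import deque
from dataclasses import dataclass

# Hypothetical lookup tables; the commit's actual mappings are not shown.
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
_HOMOGLYPHS = str.maketrans(  # a few Cyrillic look-alikes -> Latin
    {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c"}
)
_LEET = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def normalize_text(text: str) -> str:
    # NFKD folds fullwidth forms to ASCII and splits accented letters
    # into a base letter plus combining marks, which are then dropped.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.translate(_ZERO_WIDTH)  # delete zero-width characters
    text = text.lower().translate(_HOMOGLYPHS)
    return text.translate(_LEET)


# Two sample patterns standing in for the real escalation/offensive lists.
PATTERNS = [
    re.compile(r"\bcircumvent\b", re.IGNORECASE),
    re.compile(r"\bby any means\b", re.IGNORECASE),
]


def matches(text: str) -> bool:
    # Dual-text matching: check the original and the normalized form, since
    # normalization can itself destroy a signal (e.g. digits become letters).
    candidates = (text, normalize_text(text))
    return any(p.search(c) for p in PATTERNS for c in candidates)


@dataclass(frozen=True)
class TranscriptEntry:
    content_hash: str  # SHA-256 hex digest of the message text
    flagged: bool


class Transcript:
    def __init__(self, max_entries: int = 1000) -> None:
        self._lock = threading.Lock()
        self._entries = deque(maxlen=max_entries)  # oldest entries are evicted

    def record(self, text: str) -> TranscriptEntry:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        entry = TranscriptEntry(digest, matches(text))
        with self._lock:  # guard shared state against concurrent writers
            self._entries.append(entry)
        return entry

    def __len__(self) -> int:
        with self._lock:
            return len(self._entries)
```

In this sketch, storing a digest rather than the raw text lets an auditor verify that a message was seen without the transcript retaining its content; whether the commit stores the text alongside the hash is not stated.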
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 5d98273
2 files changed (+572, -114 lines)
- packages/agent-os
- src/agent_os/integrations
- tests