Commit d8f305c
Add AdversarialPolicyGateExecutor to zeph-tools: before each tool call, an
independent LLM instance validates the call against user-defined plain-language
policies loaded from a file. Runs in a separate context with no access to the
main conversation history.
Key design decisions:
- Fail-closed by default (fail_open = false): LLM errors → deny
- Strict response parsing: only "ALLOW"/"DENY" accepted; anything else → deny
- Tool params wrapped in triple-backtick code fence to prevent prompt injection
- PolicyLlmClient trait with AdversarialPolicyLlmAdapter wired in runner.rs
- All ToolExecutor delegation methods implemented (set_skill_env, set_effective_trust, etc.)
- Executor chain order: PolicyGateExecutor → AdversarialPolicyGateExecutor → TrustGateExecutor
- Policy file loaded with canonicalize + boundary check (SEC-01) matching load_policy_file()
- Default timeout 3000ms (fast model budget), configurable via tools.policy.timeout_ms
- adversarial_policy_decision field added to AuditEntry (additive, optional)
- Gated behind policy-enforcer feature flag
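The fail-closed, strict-parsing, and fencing decisions above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Decision`, `parse_decision`, `decide`, and `fence_params` are hypothetical names, and both the whitespace trimming and the neutralising of embedded backtick runs are assumptions about how such a gate might behave.

```rust
/// Outcome of a policy check (hypothetical type, not taken from the commit).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny,
}

/// Strict parsing: only the exact token "ALLOW" yields Allow.
/// "DENY" and every other response (lowercase, extra words, empty) deny.
/// Assumption: surrounding whitespace is trimmed before comparison.
fn parse_decision(raw: &str) -> Decision {
    match raw.trim() {
        "ALLOW" => Decision::Allow,
        _ => Decision::Deny,
    }
}

/// Combine the LLM call result with the `fail_open` setting.
/// With `fail_open = false` (the default), any LLM error denies the call.
fn decide(llm_result: Result<String, String>, fail_open: bool) -> Decision {
    match llm_result {
        Ok(text) => parse_decision(&text),
        Err(_) if fail_open => Decision::Allow,
        Err(_) => Decision::Deny,
    }
}

/// Wrap tool params in a triple-backtick fence before placing them in the
/// policy prompt. Assumption: embedded triple-backtick runs are broken up
/// so the params cannot close the fence early.
fn fence_params(params: &str) -> String {
    let fence = "`".repeat(3);
    let sanitized = params.replace(fence.as_str(), "` ` `");
    format!("{}\n{}\n{}", fence, sanitized, fence)
}
```

A response such as "ALLOW, because the command is harmless" is denied under this scheme, which keeps a verbose or manipulated policy model from approving a call by accident.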
Config:
[tools.policy]
enabled = false
policy_provider = "fast"
policy_file = "policies.txt"
fail_open = false
timeout_ms = 3000
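For illustration, the defaults listed above could be mirrored in a plain Rust struct like the one below. The field names follow the config keys; the struct name `PolicyConfig` and the `timeout` helper are hypothetical, and how the real crate deserializes this section is not shown in the commit message.

```rust
use std::time::Duration;

/// Mirror of the [tools.policy] section (hypothetical struct name).
#[derive(Debug, Clone, PartialEq)]
struct PolicyConfig {
    enabled: bool,
    policy_provider: String,
    policy_file: String,
    fail_open: bool,
    timeout_ms: u64,
}

impl Default for PolicyConfig {
    /// Defaults copied from the config listing in the commit message.
    fn default() -> Self {
        Self {
            enabled: false,
            policy_provider: "fast".to_string(),
            policy_file: "policies.txt".to_string(),
            fail_open: false,
            timeout_ms: 3000,
        }
    }
}

impl PolicyConfig {
    /// Convert the millisecond setting into a Duration for use with a timer.
    fn timeout(&self) -> Duration {
        Duration::from_millis(self.timeout_ms)
    }
}
```

With these defaults the gate is disabled until `enabled` is set to true, and once enabled it denies on LLM errors unless `fail_open` is explicitly turned on.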
1 parent 4b34cca commit d8f305c
File tree
12 files changed (+1203, -5 lines)
- crates
- zeph-core/src/agent/tool_execution
- zeph-tools/src
- shell
- src