Conversation
…of tool calls (#2447) Add AdversarialPolicyGateExecutor to zeph-tools: before each tool call, an independent LLM instance validates the call against user-defined plain-language policies loaded from a file. Runs in a separate context with no access to the main conversation history. Key design decisions: - Fail-closed by default (fail_open = false): LLM errors → deny - Strict response parsing: only "ALLOW"/"DENY" accepted; anything else → deny - Tool params wrapped in triple-backtick code fence to prevent prompt injection - PolicyLlmClient trait with AdversarialPolicyLlmAdapter wired in runner.rs - All ToolExecutor delegation methods implemented (set_skill_env, set_effective_trust, etc.) - Executor chain order: PolicyGateExecutor → AdversarialPolicyGateExecutor → TrustGateExecutor - Policy file loaded with canonicalize + boundary check (SEC-01) matching load_policy_file() - Default timeout 3000ms (fast model budget), configurable via tools.policy.timeout_ms - adversarial_policy_decision field added to AuditEntry (additive, optional) - Gated behind policy-enforcer feature flag Config: [tools.policy] enabled = false policy_provider = "fast" policy_file = "policies.txt" fail_open = false timeout_ms = 3000
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements the adversarial policy agent (#2447): before each tool call is dispatched, an independent LLM instance validates the call against user-defined plain-language policies. The validator runs in a separate context with no access to main conversation history.
AdversarialPolicyGateExecutor<T>inzeph-tools, gated onpolicy-enforcerfeaturePolicyValidatorwithPolicyLlmClienttrait andAdversarialPolicyLlmAdapterbridge inrunner.rsfail_open = false): LLM errors → denyALLOW/DENYaccepted; anything else → denyToolExecutortrait delegation (CRIT-06)canonicalize+ config-directory boundary check (SEC-01)PolicyGateExecutor → AdversarialPolicyGateExecutor → TrustGateExecutoradversarial_policy_decisionfield added toAuditEntry(additive, optional)Config
Test plan
cargo +nightly fmt --check— cleancargo clippy --all-targets --features full --workspace -- -D warnings— cleancargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins— 7348/7348 passCloses #2447