feat(policy): add agentic approval loop #1528
Draft
zredlined wants to merge 5 commits into
Signed-off-by: Alexander Watson <zredlined@gmail.com>
…roval

Run the prover on every proposal regardless of analysis_mode. Auto-approve proposals whose merged-policy delta is empty (proposer-agnostic, with the global-policy gate respected). Calibrate prover findings to a single HIGH severity emitted on link-local hosts, L4+credential-in-scope, and bypass-L7-binary+credential-in-scope.

Add implicit supersede on (host, port, binary): newer submissions auto-reject older pending chunks, and incoming mechanistic chunks auto-reject when an approved agent_authored chunk already covers the same endpoint.

Audit auto-approvals via CONFIG:APPROVED OCSF events carrying auto=true, source=<mode>, prover_delta=empty as unmapped fields, with message text "auto-approved: no new prover findings". Build credential set from sandbox-attached providers (presence only; no scope modeling in v1).
Signed-off-by: Alexander Watson <zredlined@gmail.com>
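The implicit-supersede rule in the commit message above can be illustrated with a minimal sketch. Every type and function name here (`EndpointKey`, `Chunk`, `submit`) is hypothetical and not taken from the PR's code. The order in which the two rules are checked is also an assumption: here a mechanistic chunk that is rejected on arrival does not supersede older pending chunks.

```rust
// Hypothetical model of implicit supersede on (host, port, binary).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct EndpointKey {
    pub host: String,
    pub port: u16,
    pub binary: String,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    Mechanistic,
    AgentAuthored,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Pending,
    Approved,
    Rejected,
}

#[derive(Clone, Debug)]
pub struct Chunk {
    pub key: EndpointKey,
    pub source: Source,
    pub status: Status,
}

/// Stores `incoming` and returns the status it was given.
pub fn submit(existing: &mut Vec<Chunk>, mut incoming: Chunk) -> Status {
    // An approved agent-authored chunk already covers this endpoint?
    let covered = existing.iter().any(|c| {
        c.key == incoming.key
            && c.source == Source::AgentAuthored
            && c.status == Status::Approved
    });

    if incoming.source == Source::Mechanistic && covered {
        // Incoming mechanistic chunk is auto-rejected.
        incoming.status = Status::Rejected;
    } else {
        // Newer submission auto-rejects older pending chunks on the same key.
        for c in existing.iter_mut() {
            if c.key == incoming.key && c.status == Status::Pending {
                c.status = Status::Rejected;
            }
        }
        incoming.status = Status::Pending;
    }

    let status = incoming.status;
    existing.push(incoming);
    status
}
```

In this sketch the incoming chunk's own `status` field is overwritten on submit, so callers can pass any placeholder value.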
The prover now answers four formal questions about a proposed policy change and emits one finding per "yes" answer:

- link_local_reach
- l7_bypass_credentialed
- credential_reach_expansion
- capability_expansion

There is no severity grade. The category name is the signal; the per-path evidence carries the structured detail. The auto-approval gate is binary: empty delta or not. This removes the previous HIGH/MEDIUM/CRITICAL severity tiers and the narrowness classifier that was inconsistent across the access-shorthand / explicit-rules boundary.

Gateway-side finding_delta gains category suppression: capability_expansion paths whose (binary, host, port) appears in the credential_reach_expansion delta are suppressed, so a brand-new credentialed reach surfaces as one finding rather than one reach plus N method findings.

The github provider profile now defaults api.github.com to read-only (was: read-write). Writes flow through the agentic loop: the prover audits each capability change rather than treating broad write access as the default.

Demo, sandbox skill, and architecture docs updated to describe the four-category model. Prover gains a README.md documenting the formal queries, evidence shape, and how to add a new category.
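The category suppression described in this commit message might look roughly like the sketch below. The types (`Category`, `Path`, `Finding`) and the function `suppress_redundant` are hypothetical stand-ins, not the gateway's actual `finding_delta` code. The `detail` field is an assumed placeholder for the per-path evidence.

```rust
use std::collections::HashSet;

// Hypothetical mirror of the four finding categories.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Category {
    LinkLocalReach,
    L7BypassCredentialed,
    CredentialReachExpansion,
    CapabilityExpansion,
}

// The (binary, host, port) triple used as the suppression key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Path {
    pub binary: String,
    pub host: String,
    pub port: u16,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Finding {
    pub category: Category,
    pub path: Path,
    pub detail: String, // assumed stand-in for structured evidence
}

/// Drops capability_expansion findings whose path already appears
/// in a credential_reach_expansion finding in the same delta.
pub fn suppress_redundant(delta: Vec<Finding>) -> Vec<Finding> {
    let credentialed: HashSet<Path> = delta
        .iter()
        .filter(|f| f.category == Category::CredentialReachExpansion)
        .map(|f| f.path.clone())
        .collect();

    delta
        .into_iter()
        .filter(|f| {
            !(f.category == Category::CapabilityExpansion
                && credentialed.contains(&f.path))
        })
        .collect()
}
```

With this shape, a new credentialed reach plus several method-level findings on the same path collapses to the single reach finding, while capability findings on unrelated paths are kept.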
Summary
Ships the agentic policy approval loop end-to-end. When the sandbox denies a network request, an agent inside the sandbox can propose a narrow policy refinement; the gateway runs a formal prover against the merged-policy delta; safe proposals (no new findings) auto-approve in ~1s; risky ones land in `pending` with structured evidence the reviewer can act on. The agent waits on a socket, so zero LLM tokens burn during human review.

This is the loop the platform has been building toward: agents do the narrowing work, the prover catches changes the operator should know about, and the audit trail makes every approval reconstructable.
Closes #1097
Refs #1062
Refs #1532
What this PR ships
The loop. Sandbox denial → agent reads `/etc/openshell/skills/policy_advisor.md` → agent POSTs a narrow proposal to `policy.local` → gateway runs the prover → either auto-approve (empty delta) or `pending` (any finding) → on approval, sandbox hot-reloads → agent retries.

Prover wired in as the auto-approval referee. Every proposal (mechanistic and agent-authored alike) runs through `openshell-prover`. The prover answers four categorical questions about the proposed change (see What the prover decides). The gateway computes the delta vs the baseline policy, and the auto-approval gate fires only when the delta is empty.

Providers-v2 in the loop. The prover validates against the effective policy: provider profiles composed in via providers-v2 are part of the model the prover reasons over. Agent-authored chunks for endpoints a provider profile covers land as their own rules (Fix A in `merge.rs`) instead of getting silently absorbed into the provider rule, so the prover sees the agent's narrow contribution honestly.

Default-deny posture preserved. Auto-approval is opt-in through the standard settings model: gateway-scoped `proposal_approval_mode` wins, sandbox-scoped settings apply otherwise, and the default is manual review. `"auto"` enables empty-delta auto-approval. The CLI keeps `openshell sandbox create --approval-mode <manual|auto>` as shorthand by writing the sandbox setting after creation.

Demo that walks the full loop. `examples/agent-driven-policy-management/demo.sh` runs a Codex agent through a two-path flow against a local gateway: one un-credentialed action auto-approves silently; one credentialed action escalates with a categorical finding, `demo.sh` approves on the human reviewer's behalf, the agent retries, and the file lands in GitHub. End-to-end in ~50–110s with one human-visible escalation, exactly the kind the prover cannot decide unilaterally.

Reconstructable audit. Every auto-approval emits a `CONFIG:APPROVED` OCSF event with unmapped fields `auto=true`, `source=<mechanistic|agent_authored>`, `prover_delta=empty`, and `resolved_from=<gateway|sandbox|default>`. The chunk's persisted `validation_result` carries the categorical finding lines for human-reviewed approvals.

Provider profile tightening. `providers/github.yaml` defaults `api.github.com` from `read-write` to `read-only`. Writes (gh / git via REST) now flow through the agentic loop: the loop becomes the on-ramp to write access, and the prover audits each capability change.

What the prover decides
The prover answers four formal questions about each proposed change. Each "yes" is its own categorical finding — no severity grade. Any finding blocks auto-approval; empty delta means the change is provably safe under the model.
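As a rough illustration, the binary gate described here can be combined with the approval-mode resolution from "Default-deny posture preserved" in a few lines. All names below (`ApprovalMode`, `ResolvedFrom`, `resolve_mode`, `gate`) are hypothetical, not the gateway's real API. One assumption to flag: a gateway-scoped value that is present but unrecognized is treated as `manual` and still takes precedence over the sandbox setting.

```rust
// Hypothetical mirror of the four finding categories.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Category {
    LinkLocalReach,
    L7BypassCredentialed,
    CredentialReachExpansion,
    CapabilityExpansion,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApprovalMode {
    Manual,
    Auto,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ResolvedFrom {
    Gateway,
    Sandbox,
    Default,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Decision {
    AutoApproved,
    Pending,
}

// Only the literal "auto" opts in; anything else falls back to manual.
fn parse_mode(raw: &str) -> ApprovalMode {
    match raw {
        "auto" => ApprovalMode::Auto,
        _ => ApprovalMode::Manual,
    }
}

/// Gateway-scoped setting wins, then sandbox-scoped, then the manual default.
pub fn resolve_mode(
    gateway: Option<&str>,
    sandbox: Option<&str>,
) -> (ApprovalMode, ResolvedFrom) {
    match (gateway, sandbox) {
        (Some(g), _) => (parse_mode(g), ResolvedFrom::Gateway),
        (None, Some(s)) => (parse_mode(s), ResolvedFrom::Sandbox),
        (None, None) => (ApprovalMode::Manual, ResolvedFrom::Default),
    }
}

/// Auto-approves only when the mode is auto and the finding delta is empty.
pub fn gate(mode: ApprovalMode, delta: &[Category]) -> Decision {
    if mode == ApprovalMode::Auto && delta.is_empty() {
        Decision::AutoApproved
    } else {
        Decision::Pending
    }
}
```

Under this sketch, an empty delta in manual mode still lands in `pending`, and a single finding of any category blocks auto-approval regardless of mode.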
- `link_local_reach`: […] `169.254.0.0/16` or `fe80::/10` (cloud-metadata range, serves credentials).
- `l7_bypass_credentialed`: […] (`git-remote-https`, `ssh`, `nc`) reaches a host where a credential is in scope.
- `credential_reach_expansion`: […]
- `capability_expansion`: […]

Detail in `crates/openshell-prover/README.md`.

What the demo shows
Acceptance criteria (deterministic, in tests)
- […] `approved`).
- […] `pending` with `credential_reach_expansion` in `validation_result`.
- […] `pending` with `capability_expansion` citing the new method.
- […] `pending` unconditionally with `link_local_reach`.
- […] `pending` with `l7_bypass_credentialed`.
- […] `(host, port, binary)` overlap.
- […] `manual`: empty delta does NOT auto-approve when `proposal_approval_mode` is unset, `"manual"`, or any unknown future value.
- […] `--approval-mode auto` writes the sandbox-scoped setting after create.
- […] `auto=true`, `source=<mode>`, `prover_delta=empty`, and `resolved_from=<gateway|sandbox|default>` as unmapped OCSF fields.
- […] `_provider_` prefix are rejected at submit time.
- […] `validation_result`.

All covered by unit and integration tests in `crates/openshell-server/src/grpc/policy.rs::tests`.

Testing
- `cargo test --workspace --lib`: 534 gateway tests, all 16 crates green.
- `cargo clippy -p openshell-server -p openshell-cli -p openshell-core --all-targets -- -D warnings`: clean.
- `cargo fmt --check`: clean.
- `./examples/agent-driven-policy-management/demo.sh` runs end-to-end against the local Docker gateway and writes the demo file to GitHub.

Explicitly deferred (follow-up PRs)
- `CONFIG:AUTO_APPROVED` OCSF event class (today reuses `CONFIG:APPROVED` with `auto=true` unmapped).
- […] `docs/` for the agentic loop.

Checklist