| 1 | +# Governance & Risk Model |
| 2 | + |
| 3 | +This document explains how Provenance evaluates risk, determines allow/warn/block decisions, and which configuration knobs influence the outcome. |
| 4 | + |
| 5 | +## Evaluation Flow |
| 6 | + |
| 7 | +1. **Provenance Coverage** – Each changed line submitted with the analysis is examined for an agent attribution. Coverage metrics (total, attributed, unknown) inform enforcement when `PROVENANCE_BLOCK_ON_UNKNOWN=true`. |
| 8 | +2. **Finding Aggregation** – Detectors produce findings with categories and severity levels. Governance summarizes totals, per-category counts, and severity buckets. |
| 9 | +3. **Policy Thresholds** – Outcomes are derived by comparing summaries against configured thresholds and default heuristics. |
| 10 | +4. **Review & Commit Signals** – When GitHub enrichment is enabled, governance inspects review overrides and force-push activity to adjust the outcome and emit alerts. |
| 11 | +5. **Decision Bundling** – The final decision, risk summary, and inputs digest are wrapped in a DSSE envelope (optionally signed). |
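The bundling step can be sketched as follows. This is a minimal illustration of the public DSSE envelope layout and its pre-authentication encoding (PAE), not Provenance's actual implementation. The payload type string, the decision field names, and the `sign` callback are assumptions made for the example.

```python
import base64
import json

# Hypothetical payload type; the real value used by Provenance is not documented here.
PAYLOAD_TYPE = "application/vnd.provenance.decision+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes a signer signs."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def bundle_decision(decision: dict, sign=None, key_id: str = "decision-key") -> dict:
    """Wrap a decision in a DSSE-style envelope.

    `sign` is an assumed callable (bytes -> signature bytes); when it is None
    the envelope is left unsigned with an empty signature list.
    """
    payload = json.dumps(decision, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signatures = []
    if sign is not None:
        signature = sign(pae(PAYLOAD_TYPE, payload))
        signatures.append({"keyid": key_id, "sig": base64.b64encode(signature).decode("ascii")})
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": signatures,
    }
```

Serializing the payload with sorted keys keeps the signed bytes stable across runs, which makes later verification simpler.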
| 12 | + |
| 13 | +## Policy Outcomes |
| 14 | + |
| 15 | +The decision pipeline evaluates conditions in priority order: |
| 16 | + |
| 17 | +1. **Unknown Provenance** – If `PROVENANCE_BLOCK_ON_UNKNOWN=true` and any line lacks attribution, the outcome is `block` with rationale “Unknown agents detected…”. |
| 18 | +2. **Critical Findings** – Any `critical` severity finding forces `block`. |
| 19 | +3. **High Severity Threshold** – If the number of `high` findings reaches `risk_high_severity_threshold` (set via `PROVENANCE_RISK_HIGH_SEVERITY_THRESHOLD`, default: `1`), the outcome escalates to `warn`. |
| 20 | +4. **Category Thresholds** – `PROVENANCE_POLICY_BLOCK_THRESHOLDS` and `PROVENANCE_POLICY_WARN_THRESHOLDS` map finding categories to numeric limits (e.g., `{"secrets": 1}`). Limits are inclusive: when a category's finding count reaches its limit, the outcome escalates to `block` or `warn` respectively. |
| 21 | +5. **Review Overrides / Force Pushes** – When GitHub enrichment is enabled, review metadata can escalate the outcome to `warn` or `block` if bot reviews were bypassed or force-pushes landed after approval. |
| 22 | +6. **Default Allow** – If none of the above are triggered, the analysis is `allow`. |
| 23 | + |
| 24 | +The rationale captures the first trigger encountered to keep explanations concise. |
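The priority order above can be sketched as a first-match-wins function. This is an illustration under stated assumptions, not the project's code: the `Policy` class, the rationale strings, and the choice to map review signals to `warn` are hypothetical, and a real implementation may keep evaluating after a `warn` so that a later `block` is not masked.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical container mirroring the governance settings in this document."""
    block_on_unknown: bool = False
    high_severity_threshold: int = 1
    block_thresholds: dict = field(default_factory=dict)
    warn_thresholds: dict = field(default_factory=dict)


def _first_at_limit(counts: dict, limits: dict):
    """Return the first category whose count reaches its inclusive limit, else None."""
    for category, limit in limits.items():
        if counts.get(category, 0) >= limit:
            return category
    return None


def decide(summary: dict, policy: Policy) -> tuple:
    """Return (outcome, rationale) for the first rule that triggers."""
    coverage = summary.get("coverage", {})
    severities = summary.get("findings_by_severity", {})
    categories = summary.get("findings_by_category", {})

    # 1. Unknown provenance (only enforced when the flag is on).
    if policy.block_on_unknown and coverage.get("unknown_line_count", 0) > 0:
        return "block", "unknown provenance"
    # 2. Any critical finding blocks.
    if severities.get("critical", 0) > 0:
        return "block", "critical finding"
    # 3. High-severity count reaching the threshold warns.
    if severities.get("high", 0) >= policy.high_severity_threshold:
        return "warn", "high severity threshold"
    # 4. Category thresholds, block limits checked before warn limits.
    hit = _first_at_limit(categories, policy.block_thresholds)
    if hit is not None:
        return "block", f"category block threshold: {hit}"
    hit = _first_at_limit(categories, policy.warn_thresholds)
    if hit is not None:
        return "warn", f"category warn threshold: {hit}"
    # 5. Review signals; mapping them to "warn" is an assumption of this sketch.
    if summary.get("force_push_after_approval") or summary.get("bot_block_overrides", 0) > 0:
        return "warn", "review override or force push"
    # 6. Default.
    return "allow", "no policy triggers"
```

For example, `decide({"findings_by_severity": {"high": 1}}, Policy())` yields a `warn` under the default threshold of `1`.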
| 25 | + |
| 26 | +## Configuration Reference |
| 27 | + |
| 28 | +| Setting | Default | Description | |
| 29 | +| --- | --- | --- | |
| 30 | +| `PROVENANCE_BLOCK_ON_UNKNOWN` | `false` | Block analyses when any changed line lacks agent attribution. | |
| 31 | +| `PROVENANCE_RISK_HIGH_SEVERITY_THRESHOLD` | `1` | Number of `high` findings that trigger a `warn`. | |
| 32 | +| `PROVENANCE_POLICY_WARN_THRESHOLDS` | `{}` | JSON mapping of finding category → warn threshold (inclusive). | |
| 33 | +| `PROVENANCE_POLICY_BLOCK_THRESHOLDS` | `{}` | JSON mapping of finding category → block threshold (inclusive). | |
| 34 | +| `PROVENANCE_DECISION_SIGNING_KEY` | unset | Base64 Ed25519 private key. Enables signing of DSSE bundles. | |
| 35 | +| `PROVENANCE_DECISION_KEY_ID` | `"decision-key"` | Optional key identifier embedded in signature records. | |
| 36 | +| `PROVENANCE_DEFAULT_POLICY_VERSION` | `2024-06-01` | Version stamp included in decisions for audit tracking. | |
| 37 | + |
| 38 | +> The full environment variable list lives in [Configuration Reference](configuration.md). This table highlights the governance-specific controls. |
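As a rough illustration of how these settings might be read, the sketch below parses the variables from the table with the defaults listed there. The function names, the returned dictionary keys, and the accepted boolean spellings are assumptions; Provenance's own loader may differ.

```python
import json
import os


def _parse_bool(raw: str) -> bool:
    # Accepted spellings are an assumption of this sketch.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def _parse_thresholds(raw: str) -> dict:
    """Parse a JSON object of category -> integer limit."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("thresholds must be a JSON object, e.g. {\"secrets\": 1}")
    return {str(category): int(limit) for category, limit in data.items()}


def load_governance_settings(env=None) -> dict:
    """Read the governance-specific variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "block_on_unknown": _parse_bool(env.get("PROVENANCE_BLOCK_ON_UNKNOWN", "false")),
        "high_severity_threshold": int(env.get("PROVENANCE_RISK_HIGH_SEVERITY_THRESHOLD", "1")),
        "warn_thresholds": _parse_thresholds(env.get("PROVENANCE_POLICY_WARN_THRESHOLDS", "{}")),
        "block_thresholds": _parse_thresholds(env.get("PROVENANCE_POLICY_BLOCK_THRESHOLDS", "{}")),
        # Signing is enabled only when a key is present; the key itself is not returned.
        "signing_enabled": bool(env.get("PROVENANCE_DECISION_SIGNING_KEY")),
        "decision_key_id": env.get("PROVENANCE_DECISION_KEY_ID", "decision-key"),
        "policy_version": env.get("PROVENANCE_DEFAULT_POLICY_VERSION", "2024-06-01"),
    }
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without touching the process environment.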
| 39 | + |
| 40 | +## Risk Summary Schema |
| 41 | + |
| 42 | +Every decision exports a `risk_summary` block: |
| 43 | + |
| 44 | +```json |
| 45 | +{ |
| 46 | + "findings_total": 3, |
| 47 | + "findings_by_category": {"code_execution": 2, "secrets": 1}, |
| 48 | + "findings_by_severity": {"high": 2, "critical": 1}, |
| 49 | + "coverage": { |
| 50 | + "total_lines": 22, |
| 51 | + "attributed_lines": 18, |
| 52 | + "unknown_line_count": 4, |
| 53 | + "coverage_percent": 81.82 |
| 54 | + }, |
| 55 | + "bot_block_overrides": 1, |
| 56 | + "bot_block_resolved": 1, |
| 57 | + "force_push_after_approval": true |
| 58 | +} |
| 59 | +``` |
| 60 | + |
| 61 | +- `coverage` quantifies attribution confidence and feeds both alerting and DSSE payloads. |
| 62 | +- The GitHub metadata fields (`bot_block_overrides`, `bot_block_resolved`, `force_push_after_approval`) are optional and appear only when enrichment is enabled. |
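To show how the fields relate to one another, here is a minimal sketch that derives a summary of this shape from a list of findings and line counts. The finding structure (`category` and `severity` keys), the function name, and the handling of an empty change set are assumptions, not part of the documented schema.

```python
from collections import Counter


def summarize(findings: list, total_lines: int, attributed_lines: int) -> dict:
    """Build a risk_summary-shaped dict (GitHub enrichment fields omitted)."""
    by_category = Counter(finding["category"] for finding in findings)
    by_severity = Counter(finding["severity"] for finding in findings)
    # Assumption: an empty change set is reported as fully covered.
    percent = round(100.0 * attributed_lines / total_lines, 2) if total_lines else 100.0
    return {
        "findings_total": len(findings),
        "findings_by_category": dict(by_category),
        "findings_by_severity": dict(by_severity),
        "coverage": {
            "total_lines": total_lines,
            "attributed_lines": attributed_lines,
            "unknown_line_count": total_lines - attributed_lines,
            "coverage_percent": percent,
        },
    }
```

With 18 of 22 lines attributed, `coverage_percent` comes out as `81.82`, matching the example above.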
| 63 | + |
| 64 | +## Weighted Risk Score (Planned) |
| 65 | + |
| 66 | +The roadmap includes a composite risk score that blends severity, coverage, and review heuristics. Upcoming changes will add: |
| 67 | + |
| 68 | +- `risk_score` – Numeric index (0–100) aggregating weighted factors. |
| 69 | +- `score_breakdown` – Component contributions (e.g., `{"coverage": 20, "severity": 50, "review": 10}`). |
| 70 | +- Configurable weights via `PROVENANCE_RISK_WEIGHTS`. |
| 71 | + |
| 72 | +Once implemented, governance decisions will still rely on hard thresholds for blocking, but the score will enrich analytics views and downstream automation. |
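Because the score is not implemented yet, the following is only a speculative sketch of what a weighted blend could look like. The default weights, the per-component formulas, and the function name are all hypothetical; only the output field names `risk_score` and `score_breakdown` come from the roadmap items above.

```python
# Hypothetical default weights; the planned PROVENANCE_RISK_WEIGHTS format is not specified.
DEFAULT_WEIGHTS = {"coverage": 30, "severity": 50, "review": 20}


def risk_score(summary: dict, weights: dict = None) -> dict:
    """Blend three 0..1 risk fractions into a 0-100 score with a per-component breakdown."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    coverage = summary.get("coverage", {})
    total_lines = coverage.get("total_lines", 0)
    severities = summary.get("findings_by_severity", {})

    # Each fraction is an invented heuristic, clamped to the range 0..1.
    fractions = {
        "coverage": coverage.get("unknown_line_count", 0) / total_lines if total_lines else 0.0,
        "severity": 1.0 if severities.get("critical", 0) else min(1.0, 0.5 * severities.get("high", 0)),
        "review": 1.0 if summary.get("force_push_after_approval") else 0.0,
    }
    # Normalize weights so the maximum possible score is 100.
    breakdown = {
        name: round(100.0 * weights.get(name, 0) / total_weight * fraction, 2)
        for name, fraction in fractions.items()
    }
    return {"risk_score": round(sum(breakdown.values()), 2), "score_breakdown": breakdown}
```

Normalizing the weights keeps the score within 0 to 100 regardless of how the weights are scaled, which is one way to make a configurable weight map forgiving.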
| 73 | + |
| 74 | +## Tuning Guidance |
| 75 | + |
| 76 | +1. **Start Conservative** – Block on critical findings, warn on bursts of high-severity findings, and observe review overrides for a while before enforcing attribution coverage. |
| 77 | +2. **Iterate on Categories** – Align category thresholds with detector packs (e.g., treat “secrets” differently from “lint”). |
| 78 | +3. **Use DSSE Bundles** – Signatures provide a tamper-evident record of enforcement logic. Verify bundles in CI to ensure configuration drift doesn’t silently relax policies. |
| 79 | +4. **Monitor Analytics** – `/v1/analytics/summary` and `/v1/analytics/agents/behavior` reveal whether thresholds are too aggressive or lenient. |