Commit 3406ce1

docs: add deployment and api cookbook
1 parent 34314f9 commit 3406ce1

4 files changed, +352 -0 lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -222,9 +222,11 @@ The same process works against forks or sandboxes—helpful when validating new
 ## Additional Documentation

 - [CI Integration Guide](docs/ci-integration.md) – Configure GitHub Actions, upload SARIF, archive decision bundles, and adapt the workflow to other CI systems.
+- [API Cookbook](docs/api-cookbook.md) – Use curl/Python snippets to ingest analyses, poll decisions, and pull evidence or analytics.
 - [Governance & Risk Model](docs/governance-and-risk-model.md) – Understand decision flow, thresholds, and tuning guidance.
 - [Configuration Reference](docs/configuration.md) – Environment variables grouped by subsystem with defaults and usage tips.
 - [Detector Authoring Guide](docs/detector-authoring.md) – Build custom detectors, register modules, and manage rule packs.
+- [Deployment & Operations Guide](docs/deployment-guide.md) – Deploy with Docker/Kubernetes, scale detectors, and instrument observability.
 - [SARIF Reporting](docs/sarif-reporting.md) – Understand the SARIF 2.1.0 output, severity mapping, and customization hooks.
 - [DSSE Decision Bundles](docs/dsse-decision-bundles.md) – Inspect the envelope schema, verify signatures, and integrate with transparency logs.

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -3,8 +3,10 @@
 This directory contains task-focused guides that go deeper than the root `README.md`.

 - [CI Integration Guide](ci-integration.md) — Automate Provenance evaluations in GitHub Actions and other CI pipelines, upload SARIF findings, and archive DSSE bundles.
+- [API Cookbook](api-cookbook.md) — Practical curl and Python examples for submitting analyses, polling decisions, and retrieving evidence.
 - [Governance & Risk Model](governance-and-risk-model.md) — Learn how policy decisions are made and how to tune thresholds.
 - [Configuration Reference](configuration.md) — Environment variables grouped by subsystem with defaults and usage tips.
 - [Detector Authoring Guide](detector-authoring.md) — Extend Provenance with custom detectors and rule packs.
+- [Deployment & Operations Guide](deployment-guide.md) — Deploy the API with Docker or Kubernetes and operate it in production.
 - [SARIF Reporting](sarif-reporting.md) — Understand the SARIF 2.1.0 output and tailor it for downstream scanners.
 - [DSSE Decision Bundles](dsse-decision-bundles.md) — Inspect the DSSE envelope, verify signatures, and extend transparency workflows.

docs/api-cookbook.md

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
# API Cookbook

This cookbook provides practical examples for interacting with the Provenance API: submitting analyses, polling decisions, retrieving evidence, and querying analytics.

## Authentication

All examples assume bearer token authentication:

```bash
export PROVENANCE_API_URL="https://provenance.example.com"
export PROVENANCE_API_TOKEN="your-api-token"
```

Send the token in the `Authorization` header of every request:

```bash
-H "Authorization: Bearer ${PROVENANCE_API_TOKEN}"
```

## Submit an Analysis

### cURL

```bash
cat <<'JSON' > payload.json
{
  "repo": "acme/shop",
  "pr_number": "77",
  "base_sha": "abc123",
  "head_sha": "def456",
  "branch": "feature/harden",
  "provenance_data": {
    "metadata": {
      "attestation_url": "https://evidence.example.com/claims/123"
    },
    "changed_lines": [
      {
        "file_path": "services/orders.py",
        "line_number": 10,
        "change_type": "added",
        "language": "python",
        "content": "result = eval(user_input)",
        "attribution": {
          "agent_id": "github-copilot",
          "agent_session_id": "sess-1"
        }
      }
    ]
  }
}
JSON

curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @payload.json \
  "${PROVENANCE_API_URL}/v1/analysis"
```

Response:

```json
{
  "analysis_id": "an_b83b21d984f34c8f9abc0f2c82d4b79d",
  "status": "pending",
  "status_url": "https://provenance.example.com/v1/analysis/an_b83b21d984f34c8f9abc0f2c82d4b79d"
}
```
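
When the payload is assembled in code rather than written by hand, a small builder keeps the structure consistent. The sketch below only mirrors the field names from the example payload; the helpers `changed_line` and `build_payload` are hypothetical and are not part of the bundled client.

```python
import json


def changed_line(file_path, line_number, content, agent_id, session_id,
                 change_type="added", language="python"):
    """Build one entry for provenance_data.changed_lines (hypothetical helper)."""
    return {
        "file_path": file_path,
        "line_number": line_number,
        "change_type": change_type,
        "language": language,
        "content": content,
        "attribution": {"agent_id": agent_id, "agent_session_id": session_id},
    }


def build_payload(repo, pr_number, base_sha, head_sha, branch, lines,
                  attestation_url=None):
    """Assemble a request body using the field names from the example above."""
    metadata = {"attestation_url": attestation_url} if attestation_url else {}
    return {
        "repo": repo,
        "pr_number": str(pr_number),  # the example sends the PR number as a string
        "base_sha": base_sha,
        "head_sha": head_sha,
        "branch": branch,
        "provenance_data": {"metadata": metadata, "changed_lines": list(lines)},
    }


payload = build_payload(
    "acme/shop", 77, "abc123", "def456", "feature/harden",
    [changed_line("services/orders.py", 10, "result = eval(user_input)",
                  "github-copilot", "sess-1")],
    attestation_url="https://evidence.example.com/claims/123",
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be saved as `payload.json` and posted with the cURL command above.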

### Python (httpx)

```python
import json
import os

import httpx

# Reuse the environment variables and payload.json from the cURL example.
client = httpx.Client(
    base_url=os.environ["PROVENANCE_API_URL"],
    headers={"Authorization": f"Bearer {os.environ['PROVENANCE_API_TOKEN']}"},
)

with open("payload.json") as fh:
    payload = json.load(fh)  # same payload as above

resp = client.post("/v1/analysis", json=payload)
resp.raise_for_status()
analysis_id = resp.json()["analysis_id"]
```

## Poll Analysis Status

```bash
ANALYSIS_ID="an_b83b21d984f34c8f9abc0f2c82d4b79d"

curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  "${PROVENANCE_API_URL}/v1/analysis/${ANALYSIS_ID}"
```

Key fields:

- `status`: `pending`, `running`, `completed`, or `failed`.
- `findings_total`: Number of findings recorded.
- `risk_summary`: Aggregated findings and coverage.
- `decision`: Serialized policy decision (`outcome`, `rationale`, etc.).
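
A polling loop usually repeats this request until `status` reaches `completed` or `failed`. The following is a minimal sketch of such a loop. The function name `wait_for_terminal_status` and its parameters are illustrative, and the HTTP call is passed in as a callable so the example runs without a live API.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_terminal_status(fetch_status, timeout_s=300.0, interval_s=2.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a body with a terminal status.

    fetch_status is any callable returning the parsed JSON body of
    GET /v1/analysis/{id}; sleep and clock are injectable for testing.
    """
    deadline = clock() + timeout_s
    while True:
        body = fetch_status()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if clock() >= deadline:
            raise TimeoutError(
                f"analysis still {body.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Demo with canned responses instead of a live API.
responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "completed", "findings_total": 1},
])
result = wait_for_terminal_status(lambda: next(responses), sleep=lambda _s: None)
print(result["status"])
```

With the httpx client from the previous section, `fetch_status` could be something like `lambda: client.get(f"/v1/analysis/{analysis_id}").json()`. The bundled client's `wait_for_decision` appears to serve the same purpose.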

## Fetch Governance Decision Evidence

```bash
curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  "${PROVENANCE_API_URL}/v1/analysis/${ANALYSIS_ID}/decision"
```

Use this endpoint when you need only the decision payload, without the surrounding status data.

## Download DSSE Decision Bundle

```bash
curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  "${PROVENANCE_API_URL}/v1/analysis/${ANALYSIS_ID}/bundle" \
  -o decision-bundle.json
```

Verify the signature with the public key (see [DSSE Decision Bundles](dsse-decision-bundles.md)).
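
Before verifying, it helps to see which bytes a DSSE signature actually covers. The sketch below assumes the bundle is a standard DSSE envelope with `payload`, `payloadType`, and `signatures` fields; the linked guide is the authority on the real schema. It decodes the payload and builds the DSSE pre-authentication encoding (PAE). The Ed25519 check itself needs a cryptography library and is left out.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])


def open_envelope(envelope: dict):
    """Return (decoded payload, bytes to verify, signatures) for an envelope."""
    payload = base64.b64decode(envelope["payload"])
    to_verify = pae(envelope["payloadType"], payload)
    return payload, to_verify, envelope.get("signatures", [])


# Stand-in envelope; a real one would be loaded from decision-bundle.json.
statement = json.dumps({"outcome": "block"}).encode("utf-8")
envelope = {
    "payloadType": "application/vnd.example+json",  # illustrative type only
    "payload": base64.b64encode(statement).decode("ascii"),
    "signatures": [{"keyid": "example-key", "sig": ""}],
}
payload, to_verify, signatures = open_envelope(envelope)
print(json.loads(payload))
print(to_verify[:48])
```

A verifier passes `to_verify`, not the raw payload, to the signature check for each entry in `signatures`.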

## Retrieve SARIF Report

```bash
curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  "${PROVENANCE_API_URL}/v1/analysis/${ANALYSIS_ID}/sarif" \
  -o provenance.sarif
```

Upload the SARIF to GitHub or other scanners using the workflow in the [CI Integration Guide](ci-integration.md).

## Query Analytics Summary

```bash
curl -sSf \
  -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \
  "${PROVENANCE_API_URL}/v1/analytics/summary?metric=code_volume&time_window=7d"
```

Response snippet:

```json
{
  "result": {
    "metric": "code_volume",
    "data": [
      {"agent_id": "github-copilot", "value": 12.0},
      {"agent_id": "claude-3-opus", "value": 4.0}
    ]
  }
}
```
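
A common next step is to turn the per-agent values into shares of the total. This sketch works on the response snippet above; the helper name `share_by_agent` is illustrative.

```python
import json

response_text = """
{
  "result": {
    "metric": "code_volume",
    "data": [
      {"agent_id": "github-copilot", "value": 12.0},
      {"agent_id": "claude-3-opus", "value": 4.0}
    ]
  }
}
"""


def share_by_agent(body: dict) -> dict:
    """Map each agent_id to its fraction of the metric total."""
    rows = body["result"]["data"]
    total = sum(row["value"] for row in rows)
    if total == 0:
        return {row["agent_id"]: 0.0 for row in rows}
    return {row["agent_id"]: row["value"] / total for row in rows}


shares = share_by_agent(json.loads(response_text))
for agent, share in shares.items():
    print(f"{agent}: {share:.0%}")
# github-copilot: 75%
# claude-3-opus: 25%
```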

## Python Client Snippets

Use the bundled client (`clients/python`):

```python
import os

from clients.python import ProvenanceClient

TOKEN = os.environ["PROVENANCE_API_TOKEN"]

client = ProvenanceClient(
    base_url="https://provenance.example.com",
    api_token=TOKEN,
)

# `payload` is the analysis request shown in "Submit an Analysis".
analysis = client.submit_analysis(payload)
decision = client.wait_for_decision(analysis.analysis_id, timeout_s=300)
bundle = client.get_decision_bundle(analysis.analysis_id)
sarif = client.get_sarif(analysis.analysis_id)
```

## Error Handling Tips

- **401 Unauthorized** – Check token validity and that the header is passed correctly.
- **404 Not Found** – Analysis ID is unknown or still processing (bundle/SARIF endpoints return 404 until `completed`).
- **429 Too Many Requests** – Implement exponential backoff when polling status.
- **5xx Errors** – Inspect server logs; detectors may have crashed or external services (Redis, ClickHouse) might be unavailable.
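
The tips above can be folded into a small retry policy. The sketch below is one possible reading, not the behavior of the bundled client: it treats 401 as non-retryable, retries 404, 429, and 5xx, and spaces attempts with exponential backoff plus jitter. The function names and default delays are illustrative.

```python
import random

# Status codes from the tips above that may clear up without intervention.
RETRYABLE = {404, 429}


def should_retry(status_code: int) -> bool:
    """401 needs a credential fix; 404, 429, and 5xx may succeed on a later try."""
    return status_code in RETRYABLE or 500 <= status_code < 600


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0,
                  jitter=random.random) -> float:
    """Exponential backoff with full jitter, capped at cap_s seconds."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return ceiling * jitter()


# Show the upper bound of each delay by pinning the jitter factor to 1.0.
for attempt in range(5):
    print(attempt, backoff_delay(attempt, jitter=lambda: 1.0))
```

Retrying 404 only makes sense for the bundle and SARIF endpoints while the analysis is still processing, and any retry loop should cap the number of attempts so a persistent 5xx surfaces quickly.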

## Sample Script

`clients/github-action/run.py` contains a ready-made CLI for CI. You can run it locally:

```bash
uv run -- python clients/github-action/run.py \
  --api-url "${PROVENANCE_API_URL}" \
  --api-token "${PROVENANCE_API_TOKEN}" \
  --repo "acme/shop" \
  --pr "77" \
  --head-sha "$(git rev-parse HEAD)" \
  --base-sha "$(git merge-base HEAD origin/main)"
```

This script drives the same endpoints showcased above and serves as a reference for bespoke integrations.

docs/deployment-guide.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
# Deployment & Operations Guide

This guide outlines supported deployment patterns for Provenance, from local Docker runs to production-ready Kubernetes clusters, and highlights operational concerns such as scaling detectors, managing secrets, and monitoring.

## Architecture Overview

Core services:

- **API** – FastAPI application (ASGI) served by `uvicorn`.
- **Redis** – Primary datastore for analyses, findings, and decisions.
- **Optional analytics sinks** – ClickHouse, Snowflake, BigQuery, or file-based JSONL exports.
- **Optional observability** – Prometheus/OTLP exporters for metrics.

Background work (detector execution, governance, analytics) happens inline today; no external workers are required.

## Local & Docker Compose

Use Docker when validating changes locally or running against mocked dependencies:

```bash
docker compose up --build
```

The compose stack includes:

- API container (`provenance-api`) exposing `8000`.
- Redis (`redis:7-alpine`) with a persistent volume.
- Optional ClickHouse (if you enable `docker-compose.clickhouse.yml`).

Override environment variables in `.env` or `docker-compose.override.yml`. See the [Configuration Reference](configuration.md) for available settings.

## Container Image

Build a production image with:

```bash
# Replace <tag> with the image tag you want to publish.
docker build -t your-registry/provenance:<tag> .
```

Key build arguments:

- `UV_LOCKFILE=uv.lock` – Install pinned dependencies.
- `TARGET_ENV=production` – (Optional) adjust if you customize the Dockerfile stages.

Run the container:

```bash
docker run --rm \
  -p 8000:8000 \
  -e PROVENANCE_REDIS_URL=redis://host.docker.internal:6379/0 \
  your-registry/provenance:<tag>
```

## Kubernetes (Helm/Manifests)

There is no bundled Helm chart yet, but a basic Deployment manifest looks like this:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: provenance-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: provenance-api
  template:
    metadata:
      labels:
        app: provenance-api
    spec:
      containers:
        - name: api
          image: your-registry/provenance:<tag>
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8000
          envFrom:
            - configMapRef:
                name: provenance-config
            - secretRef:
                name: provenance-secrets
          readinessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 10
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 30
            periodSeconds: 30
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: 1
              memory: 1Gi
```

- Provide Redis as a managed service (e.g., Amazon ElastiCache) and set `PROVENANCE_REDIS_URL` accordingly.
- Mount ConfigMaps/Secrets for policy thresholds, API tokens, signing keys, and GitHub credentials.
- Use a HorizontalPodAutoscaler to scale API pods based on CPU or custom metrics.
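
To reason about how a HorizontalPodAutoscaler would react, the core scaling rule from the Kubernetes documentation can be worked through by hand: the desired replica count is the current count scaled by the ratio of observed to target metric, rounded up. The sketch below applies that rule with illustrative bounds. The real controller also applies a tolerance band and stabilization windows, which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=10):
    """ceil(current * observed / target), clamped to [min_replicas, max_replicas]."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Two pods averaging 90% CPU against a 60% target scale out to three.
print(desired_replicas(2, current_metric=90, target_metric=60))  # 3
```

The `min_replicas=2` default mirrors `replicas: 2` in the manifest above; `max_replicas=10` is an arbitrary example value.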

### Ingress & TLS

- Expose the API via an ingress controller (NGINX, Traefik, ALB).
- Terminate TLS at the ingress or use a service mesh (Linkerd, Istio). Ensure `PROVENANCE_SERVICE_BASE_URL` matches the external HTTPS endpoint.

## Scaling Considerations

- **Detector Throughput** – Detector execution happens synchronously per request. Increase pod count to parallelize analyses, or shard workflows by repo/team. Monitoring request latency via Prometheus helps identify bottlenecks.
- **Redis Capacity** – Tune persistence and memory policy. For large analyses, configure snapshotting and `maxmemory-policy` (e.g., `volatile-lru`, which evicts only keys that have a TTL) to avoid eviction of hot keys.
- **Background Tasks** – FastAPI `BackgroundTasks` are used for asynchronous operations (analytics writes). Ensure pods have enough CPU headroom to handle background work without delaying responses.
- **Analytics Warehouse** – When using ClickHouse/Snowflake/BigQuery, provision connectivity (service accounts, network policies) and monitor ingest failure logs.
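
Because detectors run synchronously, each in-flight analysis occupies a worker for its full duration, so a rough pod count follows from Little's law (in-flight work = arrival rate × mean latency). The sketch below is a back-of-the-envelope estimate only. The function name, the one-worker-per-pod default, and the 70% utilization target are assumptions to replace with measured values.

```python
import math


def pods_needed(analyses_per_minute, mean_latency_s,
                workers_per_pod=1, target_utilization=0.7):
    """Estimate pod count from arrival rate and mean analysis latency."""
    # Little's law: average number of analyses in flight at once.
    in_flight = (analyses_per_minute / 60.0) * mean_latency_s
    return max(1, math.ceil(in_flight / (workers_per_pod * target_utilization)))


# 30 analyses per minute at 8 s each keeps about 4 workers busy,
# which needs 6 single-worker pods at 70% utilization.
print(pods_needed(analyses_per_minute=30, mean_latency_s=8))  # 6
```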

## Observability

- Enable the Prometheus exporter by installing the `opentelemetry-exporter-prometheus` package and setting `PROVENANCE_OTEL_ENABLED=true` and `PROVENANCE_OTEL_EXPORTER=prometheus`.
- Scrape `/metrics` and create alerts on:
  - Request latency (P95 > SLO).
  - Detector capability mismatches.
  - Decision outcome imbalance (e.g., spike in `block`).
- For OTLP, configure `PROVENANCE_OTEL_ENDPOINT` and deploy a collector.

## Secrets Management

- Store API tokens, signing keys, and GitHub credentials in Kubernetes Secrets, HashiCorp Vault, AWS Secrets Manager, etc.
- Encode Ed25519 signing keys in base64 before storing (matches app expectations).
- Rotate secrets regularly and redeploy pods to refresh environment variables.
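
For the base64 step, a round-trip check catches truncated or re-encoded keys before they reach a pod. This sketch assumes the application expects a raw 32-byte Ed25519 seed encoded with standard base64. The guide does not spell out the exact key format, so confirm it against the Configuration Reference. The helper names are illustrative.

```python
import base64
import os

ED25519_SEED_LEN = 32  # Ed25519 private key seeds are 32 bytes


def encode_key(raw_key: bytes) -> str:
    """Base64-encode a raw seed for storage in a secret (assumed format)."""
    if len(raw_key) != ED25519_SEED_LEN:
        raise ValueError(f"expected {ED25519_SEED_LEN} bytes, got {len(raw_key)}")
    return base64.b64encode(raw_key).decode("ascii")


def decode_key(encoded: str) -> bytes:
    """Decode a stored value and confirm it is still a 32-byte seed."""
    raw = base64.b64decode(encoded, validate=True)
    if len(raw) != ED25519_SEED_LEN:
        raise ValueError(f"decoded key is {len(raw)} bytes, expected {ED25519_SEED_LEN}")
    return raw


raw = os.urandom(ED25519_SEED_LEN)  # stand-in bytes; use your real key material
encoded = encode_key(raw)
print(decode_key(encoded) == raw)
```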

## Disaster Recovery

- Redis is the system of record for analyses. Enable AOF/RDB snapshots and back up to durable storage.
- Export DSSE decision bundles to long-term storage (S3, GCS) via CI to preserve audit trails.
- For analytics warehouses, rely on built-in backups; events can be regenerated by replaying DSSE bundles and analysis inputs if needed.

## Deployment Checklist

1. Configure `PROVENANCE_*` variables (see [Configuration Reference](configuration.md)).
2. Provision Redis with sufficient memory and persistence.
3. Deploy API (Docker/K8s) with health checks and readiness probes.
4. Configure ingress/TLS and update `PROVENANCE_SERVICE_BASE_URL`.
5. Wire CI to submit analyses (see [CI Integration Guide](ci-integration.md)).
6. Enable observability exporters and set up dashboards/alerts.
7. Archive DSSE bundles and SARIF outputs for compliance/audits.
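
Parts of this checklist can be automated as a preflight step. The sketch below checks only variables named in this guide. Treating `PROVENANCE_REDIS_URL` and `PROVENANCE_SERVICE_BASE_URL` as required is an assumption, and the `preflight` helper is hypothetical rather than something shipped with Provenance.

```python
from urllib.parse import urlparse

# Assumed to be required; adjust after checking the Configuration Reference.
REQUIRED = ("PROVENANCE_REDIS_URL", "PROVENANCE_SERVICE_BASE_URL")


def preflight(env) -> list:
    """Return a list of human-readable problems found in the environment."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    base_url = env.get("PROVENANCE_SERVICE_BASE_URL", "")
    if base_url and urlparse(base_url).scheme != "https":
        problems.append("PROVENANCE_SERVICE_BASE_URL should be the external HTTPS endpoint")
    otel_on = env.get("PROVENANCE_OTEL_ENABLED", "").lower() == "true"
    if otel_on and not env.get("PROVENANCE_OTEL_EXPORTER"):
        problems.append("PROVENANCE_OTEL_ENABLED is true but PROVENANCE_OTEL_EXPORTER is not set")
    return problems


demo_env = {
    "PROVENANCE_REDIS_URL": "redis://redis:6379/0",
    "PROVENANCE_SERVICE_BASE_URL": "http://provenance.example.com",
    "PROVENANCE_OTEL_ENABLED": "true",
}
for problem in preflight(demo_env):
    print("-", problem)
```

In a real deployment, pass `os.environ` instead of `demo_env` and fail the rollout when the returned list is not empty.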
