| 1 | +# CI Integration Guide |
| 2 | + |
| 3 | +This guide walks through wiring Provenance into continuous integration systems so every pull request is analyzed, decisions are enforced automatically, and evidence is archived for future audits. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- Provenance API endpoint (e.g. `https://provenance.example.com`). |
| 8 | +- API token with permission to create analyses and read decisions. |
| 9 | +- Python 3.12 runtime available to the pipeline (the bundled GitHub Action installs [uv](https://docs.astral.sh/uv/latest/) and reuses this repository's dependency lockfile). |
| 10 | +- Optional: Ed25519 public key published via `PROVENANCE_DECISION_VERIFY_KEY` if your governance service signs DSSE bundles. |
| 11 | + |
| 12 | +## GitHub Actions |
| 13 | + |
| 14 | +We ship a composite action in `clients/github-action/` that: |
| 15 | + |
| 16 | +1. Collects the unified diff between the PR base and head commits. |
| 17 | +2. Submits the diff plus provenance metadata to `/v1/analysis`. |
| 18 | +3. Polls `/v1/analysis/{id}` until the analysis completes. |
| 19 | +4. Prints the structured decision payload for reviewer visibility. |
| 20 | +5. Exits with a non-zero status when the policy outcome is `block`. |
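The polling step in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `clients/github-action/run.py`: the function name `poll_until_complete`, its parameters, and the default timeout and interval are hypothetical, and `fetch` stands in for whatever HTTP call retrieves `/v1/analysis/{id}`. The `status` field and its `completed` value are taken from the "Other CI Systems" section of this guide.

```python
import time
from typing import Any, Callable, Dict


def poll_until_complete(
    fetch: Callable[[], Dict[str, Any]],
    timeout_s: float = 300.0,   # hypothetical default, not the action's
    interval_s: float = 2.0,    # hypothetical default, not the action's
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() until the analysis reports completion or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        analysis = fetch()
        # Assumption: the response carries a top-level "status" field.
        if analysis.get("status") == "completed":
            return analysis
        if clock() >= deadline:
            raise TimeoutError(
                f"analysis still {analysis.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing `fetch`, `sleep`, and `clock` in as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise without a live server.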
| 21 | + |
| 22 | +### Example Workflow |
| 23 | + |
| 24 | +Save the following as `.github/workflows/provenance.yml` and provide the API configuration through GitHub secrets: |
| 25 | + |
| 26 | +```yaml |
| 27 | +name: Provenance Governance |
| 28 | + |
| 29 | +on: |
| 30 | +  pull_request: |
| 31 | +    types: [opened, synchronize, reopened] |
| 32 | + |
| 33 | +jobs: |
| 34 | +  analyze: |
| 35 | +    runs-on: ubuntu-latest |
| 36 | +    steps: |
| 37 | +      - uses: actions/checkout@v4 |
| 38 | +        with: |
| 39 | +          # Ensure the full history is present for an accurate diff. |
| 40 | +          fetch-depth: 0 |
| 41 | + |
| 42 | +      - name: Run Provenance analysis |
| 43 | +        uses: ./clients/github-action |
| 44 | +        with: |
| 45 | +          api_url: ${{ secrets.PROVENANCE_API_URL }} |
| 46 | +          api_token: ${{ secrets.PROVENANCE_API_TOKEN }} |
| 47 | +``` |
| 48 | + |
| 49 | +When a decision is `block`, the job fails and the PR is marked red. `allow` and `warn` outcomes complete successfully; governance context is still attached to the run log for reviewer triage. |
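The outcome-to-exit-status rule described above could be expressed as in the sketch below. The `decision.outcome` field and the `allow`, `warn`, and `block` values come from this guide; the function name `exit_code_for`, the specific exit codes, and the choice to fail closed on a missing or unrecognised outcome are assumptions made for illustration and may not match what the bundled action does.

```python
from typing import Any, Dict

# Outcomes named in this guide.
PASSING_OUTCOMES = {"allow", "warn"}
BLOCKING_OUTCOMES = {"block"}


def exit_code_for(analysis: Dict[str, Any]) -> int:
    """Map a completed analysis payload to a process exit code."""
    outcome = (analysis.get("decision") or {}).get("outcome")
    if outcome in PASSING_OUTCOMES:
        return 0
    if outcome in BLOCKING_OUTCOMES:
        return 1
    # Assumption: treat anything unexpected as a failure rather than a pass.
    return 2
```

A CI script would typically finish with `sys.exit(exit_code_for(analysis))` after printing the decision payload.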
| 50 | + |
| 51 | +### Exposing SARIF Findings in GitHub |
| 52 | + |
| 53 | +The analysis API now exposes a SARIF 2.1.0 representation of each run. Add a follow-up step to fetch the SARIF payload and upload it to the GitHub code scanning UI: |
| 54 | + |
| 55 | +```yaml |
| 56 | +      - name: Download SARIF report |
| 57 | +        if: success() || failure() |
| 58 | +        run: | |
| 59 | +          set -euo pipefail |
| 60 | +          ANALYSIS_ID=$(jq -r '.analysis_id' provenance.json) |
| 61 | +          curl -sSf -H "Authorization: Bearer ${{ secrets.PROVENANCE_API_TOKEN }}" \ |
| 62 | +            "${{ secrets.PROVENANCE_API_URL }}/v1/analysis/${ANALYSIS_ID}/sarif" \ |
| 63 | +            -o provenance.sarif |
| 64 | + |
| 65 | +      - name: Upload SARIF to GitHub |
| 66 | +        if: success() || failure() |
| 67 | +        uses: github/codeql-action/upload-sarif@v3 |
| 68 | +        with: |
| 69 | +          sarif_file: provenance.sarif |
| 70 | +``` |
| 71 | + |
| 72 | +The steps above read the analysis identifier from `provenance.json`. To produce that file, update the action invocation to persist the API response JSON: |
| 73 | + |
| 74 | +```yaml |
| 75 | +      - name: Run Provenance analysis |
| 76 | +        uses: ./clients/github-action |
| 77 | +        with: |
| 78 | +          api_url: ${{ secrets.PROVENANCE_API_URL }} |
| 79 | +          api_token: ${{ secrets.PROVENANCE_API_TOKEN }} |
| 80 | +        env: |
| 81 | +          PROVENANCE_WRITE_RESPONSE_PATH: provenance.json |
| 82 | +``` |
| 83 | + |
| 84 | +The `clients/github-action/run.py` script respects `PROVENANCE_WRITE_RESPONSE_PATH` and mirrors the latest decision payload to disk so downstream steps can reference the analysis identifier without re-polling the API. |
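For pipelines that prefer Python over `jq`, a downstream step might read the persisted response along these lines. This is only a sketch: the helper names `load_analysis_id` and `artifact_urls` are hypothetical, and the code assumes the file holds a JSON object with a top-level `analysis_id` key, as the `jq` expressions in this guide imply. The `/sarif` and `/bundle` paths are the ones used elsewhere in this guide.

```python
import json
from pathlib import Path
from typing import Dict


def load_analysis_id(path: str) -> str:
    """Read the analysis identifier from the persisted API response."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    analysis_id = data.get("analysis_id")
    if not analysis_id:
        raise ValueError(f"{path} does not contain an 'analysis_id' field")
    return str(analysis_id)


def artifact_urls(api_url: str, analysis_id: str) -> Dict[str, str]:
    """Build the SARIF and bundle URLs for one analysis."""
    base = f"{api_url.rstrip('/')}/v1/analysis/{analysis_id}"
    return {"sarif": f"{base}/sarif", "bundle": f"{base}/bundle"}
```

Stripping a trailing slash from the base URL avoids a double slash when `PROVENANCE_API_URL` is configured as `https://provenance.example.com/`.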
| 85 | + |
| 86 | +### Archiving DSSE Decision Bundles |
| 87 | + |
| 88 | +Signed DSSE envelopes provide tamper-evident evidence for pipeline attestations. Attach the bundle to the workflow artifacts: |
| 89 | + |
| 90 | +```yaml |
| 91 | +      - name: Archive decision bundle |
| 92 | +        if: success() || failure() |
| 93 | +        run: | |
| 94 | +          set -euo pipefail |
| 95 | +          ANALYSIS_ID=$(jq -r '.analysis_id' provenance.json) |
| 96 | +          curl -sSf -H "Authorization: Bearer ${{ secrets.PROVENANCE_API_TOKEN }}" \ |
| 97 | +            "${{ secrets.PROVENANCE_API_URL }}/v1/analysis/${ANALYSIS_ID}/bundle" \ |
| 98 | +            -o decision-bundle.json |
| 99 | + |
| 100 | +      - uses: actions/upload-artifact@v4 |
| 101 | +        if: success() || failure() |
| 102 | +        with: |
| 103 | +          name: provenance-decision-bundle |
| 104 | +          path: decision-bundle.json |
| 105 | +          retention-days: 30 |
| 106 | +``` |
| 107 | + |
| 108 | +Auditors can later verify the payload hash and (if configured) Ed25519 signature against the published governance verification key. |
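As a rough illustration of what that verification involves, the sketch below computes the payload hash and the exact bytes a DSSE signature covers. It assumes the archived file is a standard DSSE envelope with `payload`, `payloadType`, and `signatures` fields, standard base64 encoding, and signatures over the DSSE pre-authentication encoding (PAE); the actual Provenance bundle layout may wrap or extend this, and the helper names are hypothetical.

```python
import base64
import hashlib
from typing import Any, Dict


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the byte string that gets signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload
    )


def inspect_envelope(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Decode an envelope and return what an auditor needs to check."""
    payload = base64.b64decode(envelope["payload"])
    return {
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "signed_bytes": pae(envelope["payloadType"], payload),
        # Each entry: (key id or None, raw signature bytes).
        "signatures": [
            (sig.get("keyid"), base64.b64decode(sig["sig"]))
            for sig in envelope.get("signatures", [])
        ],
    }
```

With the third-party `cryptography` package, the Ed25519 check itself would then be `Ed25519PublicKey.from_public_bytes(key).verify(signature, signed_bytes)`, which raises `InvalidSignature` on mismatch. How the raw key bytes are derived from `PROVENANCE_DECISION_VERIFY_KEY` depends on how your deployment encodes it.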
| 109 | + |
| 110 | +## Other CI Systems |
| 111 | + |
| 112 | +The workflow runner is a thin wrapper around four HTTP calls, so porting the integration to other CI providers is straightforward. |
| 113 | + |
| 114 | +1. Generate the diff for the change under review. For example, in Jenkins: |
| 115 | + |
| 116 | +   ```bash |
| 117 | +   git fetch origin "${CHANGE_TARGET}" |
| 118 | +   git diff --unified=0 "origin/${CHANGE_TARGET}...${GIT_COMMIT}" > diff.patch |
| 119 | +   ``` |
| 120 | + |
| 121 | +2. Convert the diff to the `changed_lines` payload expected by `/v1/analysis`. You can reuse `clients/github-action/run.py` directly (`python -m clients.github-action.run ...`) or craft JSON with a custom script. |
| 122 | + |
| 123 | +3. Submit the payload: |
| 124 | + |
| 125 | +   ```bash |
| 126 | +   curl -sSf -H "Authorization: Bearer ${PROVENANCE_API_TOKEN}" \ |
| 127 | +     -H "Content-Type: application/json" \ |
| 128 | +     -d "@payload.json" \ |
| 129 | +     "${PROVENANCE_API_URL}/v1/analysis" |
| 130 | +   ``` |
| 131 | + |
| 132 | +4. Poll `/v1/analysis/{id}` until `status` is `completed`, then fail the job when `decision.outcome` is `block`. |
| 133 | + |
| 134 | +5. Optionally fetch `/v1/analysis/{id}/sarif` and `/v1/analysis/{id}/bundle` to integrate with downstream scanners or evidence stores. |
| 135 | + |
| 136 | +### Containerized Stages |
| 137 | + |
| 138 | +If your CI stages run in disposable containers: |
| 139 | + |
| 140 | +- Install `uv` (or `pip`) to execute `clients/github-action/run.py`. |
| 141 | +- Mount the repository workspace so the diff generator can inspect tracked files. |
| 142 | +- Provide `PROVENANCE_API_URL` and `PROVENANCE_API_TOKEN` via environment variables or secrets injection. |
| 143 | +- Persist the JSON response to disk if later stages depend on the analysis identifier. |
| 144 | + |
| 145 | +## Debugging Tips |
| 146 | + |
| 147 | +- The composite action logs the raw decision payload; review the `risk_summary` and `decision.rationale` fields when a run blocks unexpectedly. |
| 148 | +- Use the `PROVENANCE_TRACE=1` environment variable to enable verbose HTTP logging inside the action script. |
| 149 | +- When testing locally, run `uv run clients/github-action/run.py --help` to see available arguments. |
| 150 | +- Double-check that `fetch-depth: 0` (or an equivalent full clone) is configured; shallow clones omit base commits, leading to empty diffs and analyses that no-op. |
| 151 | +- If polling times out, inspect the Provenance server logs for long-running detectors or governance evaluations; consider raising the value passed to the CLI's `--timeout-s` flag. |