[Audit] #26 coding-agent-completed

## Attestation Identity

| Field | Value |
|---|---|
| **Slot name** | `coding-agent-completed` |
| **Flow** | Build (e.g. `agentic-sdlc-demo-GH{issue}-Build`) |
| **Level** | Trail-level |
| **Kosli type** | `custom:coding-agent-result` |
| **Schema** | `kosli/attestation-types/schemas/coding-agent-result.json` |
| **jq evaluator rules** | `.success == true`, `.budget_exceeded == false`, `(.commit_sha \| length) >= 7`, `(.files_changed \| length) > 0` |
| **Instances per trail** | 1 |

## Control Objective

### Risk Mitigated

In a regulated financial services context, the absence of this control exposes the organization to:

1. **Unverified autonomous code entering the build pipeline.** Without attesting the coding agent's result, there is no gate confirming that the AI-generated implementation actually compiles, passes lint, and passes tests before downstream CI and review begin. A broken or budget-blown agent run could silently propagate into the build trail.
2. **Unbounded agent cost exposure.** An autonomous coding agent operating without a budget check could consume unlimited API credits. The `budget_exceeded == false` rule prevents a runaway agent from being treated as successful, enforcing fiscal discipline on AI compute.
3. **Phantom or empty changes.** The `(.files_changed | length) > 0` rule ensures the agent actually produced code modifications. Without it, a no-op agent run (e.g., due to prompt confusion or model refusal) could be silently marked as successful, creating a trail with no meaningful artifact.
4. **Unattributed code provenance.** The `(.commit_sha | length) >= 7` rule binds the attestation to a specific Git commit, establishing chain of custody from AI-generated code to the build artifact. Without it, the trail cannot prove which code was built.

### Auditor Question

> "Can you demonstrate, with structured evidence, that every AI-generated code change entering your build pipeline was verified to pass lint and tests, stayed within its cost budget, produced identifiable file modifications, and is bound to a specific Git commit — all before any downstream CI or review activities began?"

### Regulatory Mapping

| Framework | Control Reference | Relevance |
|---|---|---|
| **SOC 2** | CC6.1 — Logical and Physical Access Controls | Ensures only verified, successful agent output enters build pipeline |
| **SOC 2** | CC8.1 — Change Management | Structured evidence that AI-generated changes were validated before build |
| **SOC 2** | CC6.8 — Unauthorized Software Prevention | Budget check prevents runaway agent from injecting unlimited code iterations |
| **ISO 27001** | A.8.25 — Secure Development Lifecycle | Validates AI coding agent output before it enters the review/build lifecycle |
| **ISO 27001** | A.8.32 — Change Management | Commit SHA binding provides change traceability |
| **NIST CSF** | PR.DS-6 — Integrity checking mechanisms | Commit SHA and file list establish integrity of agent output |
| **NIST CSF** | PR.IP-3 — Configuration change control | Agent result attestation is a change control checkpoint |
| **NIST CSF** | DE.CM-4 — Malicious code detection | Budget and success checks detect anomalous agent behavior |
| **PCI DSS** | 6.5.6 — Secure coding practices | Lint and test validation of AI-generated code |

## Evidence Specification

### Attestation Payload Fields

| Field | Type | Source | Required | Validated by |
|---|---|---|---|---|
| `success` | boolean | `coding-result-GH{N}.json` → `.success` | Yes (schema + jq) | jq: `.success == true` |
| `issue_number` | integer | `coding-result-GH{N}.json` → `.issue_number` or `$ISSUE_NUMBER` env | Yes (schema) | Schema: `minimum: 1` |
| `issue_id` | string | `$ISSUE_ID` env var (e.g. `GH123`) | Yes (schema) | Schema: `minLength: 1` |
| `branch` | string | `coding-result-GH{N}.json` → `.branch` | Yes (schema) | Schema: `minLength: 1` |
| `commit_sha` | string | `coding-result-GH{N}.json` → `.commit_sha` | Yes (schema + jq) | jq: `(.commit_sha \| length) >= 7`; Schema: `minLength: 7` |
| `files_changed` | array[string] | `coding-result-GH{N}.json` → `.files_changed` | Yes (schema + jq) | jq: `(.files_changed \| length) > 0` |
| `lint_passed` | boolean | `coding-result-GH{N}.json` → `.lint_passed` | Yes (schema) | Schema only (not in jq rules) |
| `tests_passed` | boolean | `coding-result-GH{N}.json` → `.tests_passed` | Yes (schema) | Schema only (not in jq rules) |
| `claude_turns` | integer | `coding-result-GH{N}.json` → `.claude_turns` | No | Schema: `minimum: 0` |
| `claude_cost_usd` | number | `coding-result-GH{N}.json` → `.claude_cost` | No | Schema: `minimum: 0` |
| `max_cost_budget_usd` | number | `coding-result-GH{N}.json` → `.max_cost_budget` (default 15.0) | No | Schema: `minimum: 0` |
| `budget_exceeded` | boolean | `coding-result-GH{N}.json` → `.budget_exceeded` | Yes (schema + jq) | jq: `.budget_exceeded == false` |
| `duration_ms` | integer | `coding-result-GH{N}.json` → `.duration_ms` | No | Schema: `minimum: 0` |
| `retries` | integer | `coding-result-GH{N}.json` → `.retries` | No | Schema: `minimum: 0` |
| `error` | string | `coding-result-GH{N}.json` → error message if failed | No | Schema only |
| `timestamp` | string (ISO 8601) | `datetime.now(timezone.utc).isoformat()` at attestation time | Yes (schema) | Schema: `format: date-time` |
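
The source-to-payload mapping in the table above can be sketched as a small Python function. This is a minimal sketch, not the workflow's actual inline script: `build_payload` and `OPTIONAL_FIELDS` are hypothetical names, the sample values are illustrative, and every fallback other than those for `success`, `files_changed`, and `max_cost_budget` is an assumption.

```python
from datetime import datetime, timezone

# Optional source keys -> payload keys (hypothetical helper; renames follow the table).
OPTIONAL_FIELDS = {
    "claude_turns": "claude_turns",
    "claude_cost": "claude_cost_usd",
    "duration_ms": "duration_ms",
    "retries": "retries",
    "error": "error",
}


def build_payload(result: dict, issue_number: int, issue_id: str) -> dict:
    """Map a parsed coding-result dict onto the attestation payload fields."""
    payload = {
        "success": bool(result.get("success", False)),
        "issue_number": int(result.get("issue_number", issue_number)),
        "issue_id": issue_id,
        "branch": result.get("branch", ""),          # assumed fallback
        "commit_sha": result.get("commit_sha", ""),  # assumed fallback
        "files_changed": list(result.get("files_changed", [])),
        "lint_passed": bool(result.get("lint_passed", False)),    # assumed fallback
        "tests_passed": bool(result.get("tests_passed", False)),  # assumed fallback
        # Assumed fallback: a missing flag counts as exceeded, so the gap fails closed.
        "budget_exceeded": bool(result.get("budget_exceeded", True)),
        "max_cost_budget_usd": float(result.get("max_cost_budget", 15.0)),
        # Attestation time, not agent completion time.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Copy optional fields only when the agent actually reported them.
    for source_key, payload_key in OPTIONAL_FIELDS.items():
        if source_key in result:
            payload[payload_key] = result[source_key]
    return payload


example = build_payload(
    {"success": True, "branch": "feature/GH123", "commit_sha": "a1b2c3d",
     "files_changed": ["src/app.py"], "lint_passed": True, "tests_passed": True,
     "budget_exceeded": False, "claude_cost": 4.2},
    issue_number=123,
    issue_id="GH123",
)
```

Defaulting `budget_exceeded` to `True` is a design choice in this sketch: a result file that omits the flag then fails the jq rule instead of passing it by accident.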

### Producer Metadata

- **Evidence producer:** GitHub Actions runner (`ubuntu-latest`), `finalize` job in `.github/workflows/agentic-code.yml`, step "Attest coding-agent-completed" (lines 375-431)
- **Producer trust level:** Ephemeral CI runner. The payload is assembled from `coding-result-GH{N}.json`, an artifact uploaded by the `code` job and downloaded via `actions/download-artifact@v4`. An inline Python script reads the JSON and constructs the payload. The Kosli CLI is installed via `kosli-dev/setup-cli-action@v2` with a pinned version.
- **Tamper resistance:** The coding result JSON is produced by the `code` job on a separate runner, uploaded as a GitHub Actions artifact, and downloaded in the `finalize` job. Cross-job artifact transfer provides some isolation but relies on GitHub Actions artifact integrity. The attestation is immutable once written to Kosli.

## Compliance Logic

### Compliant when

ALL four jq rules return true (AND-gated):

1. `.success == true`: the coding agent reported success (lint and tests passed after the agent finished)
2. `.budget_exceeded == false`: the agent did not exceed its cost budget (`CODING_MAX_COST`, default $15.00)
3. `(.commit_sha | length) >= 7`: a commit SHA of at least 7 characters is present (7 is Git's default abbreviated SHA length); the rule checks length only, not that the value is hexadecimal or refers to a real commit
4. `(.files_changed | length) > 0`: the agent modified at least one file

Additionally, the JSON schema enforces:
- All required fields are present (`success`, `issue_number`, `issue_id`, `branch`, `commit_sha`, `files_changed`, `lint_passed`, `tests_passed`, `budget_exceeded`, `timestamp`)
- Type constraints (booleans are booleans, integers have minimums, strings have minimum lengths)
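
For readers less familiar with jq, the four rules and the required-field list above can be mirrored in plain Python. This is an illustrative sketch only: the real evaluation is performed by Kosli against the jq expressions, and `missing_required`, `evaluate_rules`, and `is_compliant` are hypothetical names. The sample payload values are made up.

```python
REQUIRED_FIELDS = (
    "success", "issue_number", "issue_id", "branch", "commit_sha",
    "files_changed", "lint_passed", "tests_passed", "budget_exceeded", "timestamp",
)


def missing_required(payload: dict) -> list[str]:
    """Return the required field names that are absent from the payload."""
    return [name for name in REQUIRED_FIELDS if name not in payload]


def evaluate_rules(payload: dict) -> dict[str, bool]:
    """Python equivalents of the four jq rules; all must be True (AND-gated)."""
    return {
        ".success == true": payload.get("success") is True,
        ".budget_exceeded == false": payload.get("budget_exceeded") is False,
        "(.commit_sha | length) >= 7": len(payload.get("commit_sha") or "") >= 7,
        "(.files_changed | length) > 0": len(payload.get("files_changed") or []) > 0,
    }


def is_compliant(payload: dict) -> bool:
    """Compliant only if no required field is missing and every rule holds."""
    return not missing_required(payload) and all(evaluate_rules(payload).values())


sample = {
    "success": True, "issue_number": 123, "issue_id": "GH123",
    "branch": "feature/GH123", "commit_sha": "a1b2c3d",
    "files_changed": ["src/app.py"], "lint_passed": True, "tests_passed": True,
    "budget_exceeded": False, "timestamp": "2025-01-01T00:00:00+00:00",
}
print(is_compliant(sample))                            # True
print(is_compliant({**sample, "lint_passed": False}))  # True: lint_passed is not gated
```

The second call shows why the schema-only fields matter: flipping `lint_passed` to `False` does not change the outcome, because none of the four rules reads it.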

### Non-compliant when

- Agent failed (success=false) — lint or tests did not pass after retries
- Agent exceeded budget (budget_exceeded=true) — cost overrun
- No commit SHA or SHA too short — agent did not produce a valid commit
- No files changed — agent ran but made no modifications
- Missing required fields — schema validation fails
- Wrong field types — schema validation fails

### Threshold source

- **jq rules:** Defined in `kosli/attestation-types/setup.sh` lines 271-278
- **Schema:** `kosli/attestation-types/schemas/coding-agent-result.json`
- **Budget limit:** `CODING_MAX_COST` env var in workflow (default `15.0`), checked by the coding agent itself
- **No Rego policy:** This is a direct custom type attestation, not a Rego-evaluated control

## Edge Cases & Failure Modes

### Check Cannot Run

**Coding result file missing:** If `coding-result-GH{N}.json` does not exist, the attestation step prints a warning and exits with code 0 (lines 384-387). The attestation slot remains unfilled, making the trail non-compliant. This is safe (the step fails open, but the trail fails closed), yet it leaves no diagnostic in the Kosli trail: the slot is simply absent.

**Artifact download fails:** `actions/download-artifact@v4` has `continue-on-error: true` (line 373). If the code job did not produce a result file (crash, timeout), the download silently fails, and the missing-file guard above triggers.

**Kosli API unavailable:** The `kosli attest custom` call (lines 423-431) has no `|| true` wrapper. If Kosli is down, the step fails hard, which is correct behavior — the trail cannot record the attestation without Kosli.

**Trail not begun:** If the "Create Kosli build flow and begin trail" step (lines 335-365) failed, `KOSLI_BUILD_FLOW` and `KOSLI_BUILD_TRAIL` env vars are empty. The `kosli attest custom` call will fail with an API error.

### Partial Evidence

If the coding agent crashes mid-run and writes a partial `coding-result-GH{N}.json` (e.g., missing `files_changed`), the Python inline script will default missing fields to empty values (`[]` for files_changed, `False` for success). The jq rules will then correctly mark it non-compliant because `(.files_changed | length) > 0` will fail.

If the agent succeeds at coding but the result JSON has `success: true` while `lint_passed: false` (inconsistent state), the jq rules will still pass because they only check `.success`, not `.lint_passed` directly. This is a gap — see Assessment.
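
The inconsistent state described above could be caught before attesting with a small cross-field check. A minimal sketch, assuming the payload is already a Python dict; `consistency_errors` is a hypothetical helper and is not part of the current workflow.

```python
def consistency_errors(payload: dict) -> list[str]:
    """Flag payloads whose `success` flag contradicts the lint/test flags."""
    errors = []
    if payload.get("success") is True:
        for flag in ("lint_passed", "tests_passed"):
            # A missing flag is treated the same as an explicit false.
            if payload.get(flag) is not True:
                errors.append(f"success is true but {flag} is not true")
    return errors


print(consistency_errors({"success": True, "lint_passed": False, "tests_passed": True}))
# ['success is true but lint_passed is not true']
```

A payload with `success: false` returns no errors here, since the jq rules already mark it non-compliant.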

### Bypass Vectors

| Vector | Severity | Details |
|---|---|---|
| **Fabricated result JSON** | HIGH | The coding agent writes `coding-result-GH{N}.json` itself. A compromised agent or modified `scripts.coding.agent` module could write `success: true` regardless of actual lint/test results. The jq rules trust the self-reported payload. |
| **Shared KOSLI_API_TOKEN** | HIGH | Anyone with repo write access can use the token to attest arbitrary payloads to any flow/trail. No flow-scoped RBAC prevents a malicious actor from directly calling `kosli attest custom` with a forged payload. |
| **Artifact upload/download substitution** | MEDIUM | GitHub Actions artifacts are scoped to the workflow run, but a compromised `code` job could upload a falsified result file. The `finalize` job has no way to verify the authenticity of the downloaded artifact beyond trusting the Actions runtime. |
| **`\|\| true` on agent invocation** | MEDIUM | Line 86: `python -m scripts.coding.agent ... \|\| true`. The agent is allowed to fail silently. If it crashes and writes a partial result with `success: true` before crashing, that partial result is uploaded and attested. |
| **Timestamp is attestation-time, not agent-time** | LOW | The `timestamp` field is set when the payload is assembled in `finalize`, not when the agent actually ran. There could be a significant time gap. |
| **`lint_passed` and `tests_passed` not in jq rules** | MEDIUM | These fields are in the schema but not enforced by jq. An agent could report `success: true, lint_passed: false, tests_passed: false` and the attestation would be compliant. |

### False Positive Risk

- **Self-reported success:** The agent marks itself successful. If the agent's internal lint/test verification is buggy or incomplete, the attestation may be compliant despite real failures. The downstream `lint-control` and `unit-test-control` attestations (on the artifact) partially mitigate this by re-running checks in CI.

### False Negative Risk

- **Transient test flakiness:** If the agent's local test run encounters a flaky test failure, `success` will be false and the attestation non-compliant, even though the code is correct. The retry mechanism (`retries` field) partially mitigates this.
- **Budget threshold too low:** If `CODING_MAX_COST` is set too aggressively for complex issues, legitimate work may be flagged as budget-exceeded.

### TOCTOU Gaps

- **Code changes after agent completes:** The `code` job pushes the feature branch (lines 100-105) after the coding agent runs. The `commit_sha` in the result file refers to the agent's last commit. If any subsequent step in the `code` job modified files (none currently do), the attested SHA would be stale.
- **Review job may push fixes:** The `review` job's resolver can push additional commits (lines 228-229). The `coding-agent-completed` attestation records the pre-review SHA. This is by design: the coding agent's result is attested separately from review-time fixes.

### Single Points of Failure

1. **`coding-result-GH{N}.json`** — The entire attestation depends on this single file. If it is corrupted, truncated, or absent, no attestation is recorded.
2. **Self-reported evidence** — The coding agent is both the executor and the evidence producer. No independent verification of lint/test results at attestation time.
3. **GitHub Actions artifact transfer** — Cross-job evidence depends on the Actions artifact system. No checksum or signature verification on download.
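
The missing checksum verification in item 3 could look roughly like the following. This is a sketch under stated assumptions: `sha256_of` and `verify_artifact` are hypothetical helpers, and the expected digest is assumed to reach the `finalize` job through a channel other than the artifact itself. A digest computed by a compromised `code` job would still match a falsified file, so this detects corruption or substitution in transfer, not a dishonest producer.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

In the `finalize` job, a `False` return would be a reason to stop before calling `kosli attest custom`.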

## Dependencies

### Upstream

| Dependency | Type | Provides |
|---|---|---|
| `code` job | CI job | `coding-result-GH{N}.json` artifact, `branch`, `success`, `issue_id` outputs |
| `review` job | CI job | `verdict` output (must be `ACCEPTED` for attestation to run) |
| `scripts.coding.agent` | Python module | Produces the coding result JSON |
| `actions/upload-artifact@v4` | GitHub Action | Transfers result file from `code` to `finalize` job |
| `actions/download-artifact@v4` | GitHub Action | Retrieves result file in `finalize` job |
| Kosli build flow + trail | Kosli | Must be created before attestation (step "Create Kosli build flow and begin trail") |
| `KOSLI_API_TOKEN` | Secret | Authentication for Kosli API |
| `ISSUE_NUMBER`, `ISSUE_ID` | Env vars | Issue identification for file naming and payload |

### Downstream

| Consumer | Impact of Failure |
|---|---|
| Kosli trail compliance | Unfilled slot → trail NON-COMPLIANT (entire build trail blocked) |
| CI workflow dispatch | Triggered after this attestation (lines 433-454); runs regardless of attestation outcome |
| Artifact-level attestations | Not directly dependent, but share the same trail |
| `code-review-control` | Indirectly depends on coding agent having produced a valid commit |

### Cross-Flow Bridge

1. **Code job** (same workflow) produces `coding-result-GH{N}.json` with agent outcome, cost, and commit SHA
2. **Finalize job** downloads the result, constructs a schema-compliant payload, and attests to the Build flow trail
3. **Build trail** requires this slot filled and compliant before the trail can be marked compliant
4. The `commit_sha` in this attestation is the same commit that the review flow reviewed and that CI will build — establishing provenance linkage across flows

## Assessment

### Implementation Match: PARTIAL

The attestation correctly captures the coding agent's outcome and enforces the four key invariants (success, budget, commit, files). The schema is well-designed with appropriate types and constraints. However, the implementation has a structural weakness: `lint_passed` and `tests_passed` are required schema fields but are NOT enforced by jq rules. The `success` field is a self-reported composite that is supposed to reflect lint+test outcomes, but there is no independent verification at attestation time.

### Evidence Sufficiency: ADEQUATE WITH GAPS

The evidence is sufficient for most audit scenarios — the payload captures what the agent did, how much it cost, and what it produced. Gaps include:
1. No independent verification of self-reported fields (success, lint_passed, tests_passed)
2. Timestamp is attestation-time, not agent-completion-time (temporal accuracy gap)
3. No model identity in the payload or jq rules (which Claude model version produced the code)
4. No diff size or complexity metrics (how much code was generated)
5. Cost fields (`claude_cost_usd`, `max_cost_budget_usd`) are optional in schema despite `budget_exceeded` being required — allows attesting budget compliance without disclosing the actual cost

### Gaps

1. **HIGH: Self-reported success is the sole compliance signal.** The jq rules check `.success == true` but this value is written by the coding agent itself. A buggy or adversarial agent can write `success: true` regardless of actual lint/test outcomes. Mitigation: downstream CI re-runs lint and tests, but by that point the attestation is already compliant.
2. **HIGH: `lint_passed` and `tests_passed` not enforced by jq rules.** These are required schema fields but have no compliance impact. An attestation with `success: true, lint_passed: false` would be compliant. Add jq rules: `.lint_passed == true` and `.tests_passed == true`.
3. **HIGH: Shared KOSLI_API_TOKEN allows forged attestations.** Any workflow or actor with the token can attest arbitrary payloads. No flow-scoped or trail-scoped access control.
4. **MEDIUM: Silent exit on missing result file.** The step exits 0 when the result file is missing (lines 384-387), meaning the workflow succeeds but the attestation is simply absent. While the trail remains non-compliant, there is no explicit failure signal in the CI run.
5. **MEDIUM: `|| true` on agent invocation masks crashes.** The coding agent can crash and still produce a partial result that gets attested. Remove `|| true` or validate result file completeness.
6. **LOW: Timestamp mismatch.** The attestation timestamp is set at payload construction time, not agent completion time. Add `agent_completed_at` to the payload.
7. **LOW: No model version in the payload.** The schema does not require `model` or `model_version` fields. The workflow hardcodes `CODING_MODEL: claude-sonnet-4-20250514`, but this is not recorded in the attestation.

### Recommendations

1. **Add jq rules for `lint_passed` and `tests_passed`:** `.lint_passed == true` and `.tests_passed == true` — these fields are already required by schema but have no compliance enforcement.
2. **Add result file validation:** Before constructing the payload, validate that the result JSON contains all required fields with non-default values. Fail the step (non-zero exit) if validation fails.
3. **Investigate flow-scoped KOSLI_API_TOKEN** or separate tokens for different flows to prevent cross-flow attestation forgery.
4. **Remove `|| true` from agent invocation** or add explicit post-run validation that the result file is complete and internally consistent.
5. **Add `model_version` to required schema fields and jq rules** to establish model provenance.
6. **Record actual cost values** as required fields (not optional) to support cost auditing.
7. **Add a checksum** to the uploaded artifact to detect tampering during cross-job transfer.
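
Recommendations 2 and 4 could be combined into a pre-attestation check that fails the step instead of exiting 0. The following is a minimal sketch, not the existing workflow code: `load_validated_result`, `main`, and `REQUIRED_SOURCE_KEYS` are hypothetical names, and the key list is an assumption derived from the schema-required fields that are sourced from the result file. `::error::` is the GitHub Actions workflow command for an error annotation.

```python
import json
import sys
from pathlib import Path

# Assumed: schema-required fields that the result file itself must supply.
REQUIRED_SOURCE_KEYS = (
    "success", "branch", "commit_sha", "files_changed",
    "lint_passed", "tests_passed", "budget_exceeded",
)


def load_validated_result(path: Path) -> dict:
    """Return the parsed result, or raise ValueError describing why it is unusable."""
    if not path.is_file():
        raise ValueError(f"result file not found: {path}")
    try:
        result = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise ValueError(f"result file is not valid JSON: {exc}") from exc
    if not isinstance(result, dict):
        raise ValueError("result file must contain a JSON object")
    missing = [key for key in REQUIRED_SOURCE_KEYS if key not in result]
    if missing:
        raise ValueError(f"result file is missing keys: {', '.join(missing)}")
    return result


def main(argv: list[str]) -> int:
    """Return 0 if the file is usable, 1 if it is not, 2 on incorrect usage."""
    if len(argv) != 2:
        print("usage: validate_result.py <result.json>", file=sys.stderr)
        return 2
    try:
        load_validated_result(Path(argv[1]))
    except ValueError as exc:
        print(f"::error::{exc}")  # surfaces as an error annotation in the run
        return 1
    return 0
```

A workflow step could end with `sys.exit(main(sys.argv))`, so that a missing or truncated result file produces a visible failure in the CI run and not only an unfilled slot.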

### Verdict: NEEDS IMPROVEMENT

The attestation covers the right conceptual ground — it gates on agent success, budget compliance, commit provenance, and file modification. The schema design is solid. However, the self-reported nature of the evidence, the gap between schema requirements and jq enforcement (lint_passed/tests_passed), and the shared API token create meaningful compliance and security risks. The downstream CI checks partially compensate, but an auditor would flag the self-attestation pattern and the unenforced schema fields as control weaknesses requiring remediation.
