[Audit] #26 coding-agent-completed #127

@AlexKantor87

Attestation Identity

| Field | Value |
| --- | --- |
| Slot name | coding-agent-completed |
| Flow | Build (e.g. agentic-sdlc-demo-GH{issue}-Build) |
| Level | Trail-level |
| Kosli type | custom:coding-agent-result |
| Schema | kosli/attestation-types/schemas/coding-agent-result.json |
| jq evaluator rules | .success == true, .budget_exceeded == false, (.commit_sha \| length) >= 7, (.files_changed \| length) > 0 |
| Instances per trail | 1 |

Control Objective

Risk Mitigated

In a regulated financial services context, the absence of this control exposes the organization to:

  1. Unverified autonomous code entering the build pipeline. Without attesting the coding agent's result, there is no gate confirming that the AI-generated implementation actually compiles, passes lint, and passes tests before downstream CI and review begin. A broken or budget-blown agent run could silently propagate into the build trail.
  2. Unbounded agent cost exposure. An autonomous coding agent operating without a budget check could consume unlimited API credits. The budget_exceeded == false rule prevents a runaway agent from being treated as successful, enforcing fiscal discipline on AI compute.
  3. Phantom or empty changes. The (.files_changed | length) > 0 rule ensures the agent actually produced code modifications. Without it, a no-op agent run (e.g., due to prompt confusion or model refusal) could be silently marked as successful, creating a trail with no meaningful artifact.
  4. Unattributed code provenance. The (.commit_sha | length) >= 7 rule requires the attestation to carry a commit SHA, which ties it to a specific Git commit and establishes chain of custody from AI-generated code to the build artifact. Without it, the trail cannot prove which code was built.

Auditor Question

"Can you demonstrate, with structured evidence, that every AI-generated code change entering your build pipeline was verified to pass lint and tests, stayed within its cost budget, produced identifiable file modifications, and is bound to a specific Git commit — all before any downstream CI or review activities began?"

Regulatory Mapping

| Framework | Control Reference | Relevance |
| --- | --- | --- |
| SOC 2 | CC6.1: Logical and Physical Access Controls | Ensures only verified, successful agent output enters build pipeline |
| SOC 2 | CC8.1: Change Management | Structured evidence that AI-generated changes were validated before build |
| SOC 2 | CC6.8: Unauthorized Software Prevention | Budget check prevents runaway agent from injecting unlimited code iterations |
| ISO 27001 | A.8.25: Secure Development Lifecycle | Validates AI coding agent output before it enters the review/build lifecycle |
| ISO 27001 | A.8.32: Change Management | Commit SHA binding provides change traceability |
| NIST CSF | PR.DS-6: Integrity checking mechanisms | Commit SHA and file list establish integrity of agent output |
| NIST CSF | PR.IP-3: Configuration change control | Agent result attestation is a change control checkpoint |
| NIST CSF | DE.CM-4: Malicious code detection | Budget and success checks detect anomalous agent behavior |
| PCI DSS | 6.5.6: Secure coding practices | Lint and test validation of AI-generated code |

Evidence Specification

Attestation Payload Fields

| Field | Type | Source | Required | Validated by |
| --- | --- | --- | --- | --- |
| success | boolean | coding-result-GH{N}.json.success | Yes (schema + jq) | jq: .success == true |
| issue_number | integer | coding-result-GH{N}.json.issue_number or $ISSUE_NUMBER env | Yes (schema) | Schema: minimum: 1 |
| issue_id | string | $ISSUE_ID env var (e.g. GH123) | Yes (schema) | Schema: minLength: 1 |
| branch | string | coding-result-GH{N}.json.branch | Yes (schema) | Schema: minLength: 1 |
| commit_sha | string | coding-result-GH{N}.json.commit_sha | Yes (schema + jq) | jq: (.commit_sha \| length) >= 7; Schema: minLength: 7 |
| files_changed | array[string] | coding-result-GH{N}.json.files_changed | Yes (schema + jq) | jq: (.files_changed \| length) > 0 |
| lint_passed | boolean | coding-result-GH{N}.json.lint_passed | Yes (schema) | Schema only (not in jq rules) |
| tests_passed | boolean | coding-result-GH{N}.json.tests_passed | Yes (schema) | Schema only (not in jq rules) |
| claude_turns | integer | coding-result-GH{N}.json.claude_turns | No | Schema: minimum: 0 |
| claude_cost_usd | number | coding-result-GH{N}.json.claude_cost | No | Schema: minimum: 0 |
| max_cost_budget_usd | number | coding-result-GH{N}.json.max_cost_budget (default 15.0) | No | Schema: minimum: 0 |
| budget_exceeded | boolean | coding-result-GH{N}.json.budget_exceeded | Yes (schema + jq) | jq: .budget_exceeded == false |
| duration_ms | integer | coding-result-GH{N}.json.duration_ms | No | Schema: minimum: 0 |
| retries | integer | coding-result-GH{N}.json.retries | No | Schema: minimum: 0 |
| error | string | coding-result-GH{N}.json → error message if failed | No | Schema only |
| timestamp | string (ISO 8601) | datetime.now(timezone.utc).isoformat() at attestation time | Yes (schema) | Schema: format: date-time |
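
The constraints listed for the payload fields can be approximated without a JSON Schema library. The sketch below is a hand-rolled check, not the project's actual schema validation: the lookup tables are transcribed from the field list above, and `schema_errors` is a hypothetical helper name. It does not check the `date-time` format of `timestamp` or the element type of `files_changed`.

```python
# Hand-rolled approximation of the schema constraints listed above.
# Field names and limits come from the field list; the helper is hypothetical.
REQUIRED = {
    "success": bool, "issue_number": int, "issue_id": str, "branch": str,
    "commit_sha": str, "files_changed": list, "lint_passed": bool,
    "tests_passed": bool, "budget_exceeded": bool, "timestamp": str,
}
MIN_LENGTH = {"issue_id": 1, "branch": 1, "commit_sha": 7}
MINIMUM = {"issue_number": 1, "claude_turns": 0, "claude_cost_usd": 0,
           "max_cost_budget_usd": 0, "duration_ms": 0, "retries": 0}


def schema_errors(payload: dict) -> list[str]:
    """Return a list of human-readable violations; an empty list means none found."""
    errors = []
    for field, expected in REQUIRED.items():
        if field not in payload:
            errors.append(f"missing required field: {field}")
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it for integer fields.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            errors.append(f"wrong type for {field}")
    for field, limit in MIN_LENGTH.items():
        value = payload.get(field)
        if isinstance(value, str) and len(value) < limit:
            errors.append(f"{field} shorter than {limit}")
    for field, limit in MINIMUM.items():
        value = payload.get(field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value < limit:
            errors.append(f"{field} below minimum {limit}")
    return errors
```

A payload that passes this check can still fail the jq rules, since the schema accepts `success: false` and an empty `files_changed`.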

Producer Metadata

  • Evidence producer: GitHub Actions runner (ubuntu-latest), finalize job in .github/workflows/agentic-code.yml, step "Attest coding-agent-completed" (lines 375-431)
  • Producer trust level: Ephemeral CI runner. The payload is assembled from coding-result-GH{N}.json which is an artifact uploaded by the code job and downloaded via actions/download-artifact@v4. The Python inline script reads the JSON and constructs the payload. Kosli CLI installed via kosli-dev/setup-cli-action@v2 with pinned version.
  • Tamper resistance: The coding result JSON is produced by the code job on a separate runner, uploaded as a GitHub Actions artifact, and downloaded in the finalize job. Cross-job artifact transfer provides some isolation but relies on GitHub Actions artifact integrity. The attestation is immutable once written to Kosli.

Compliance Logic

Compliant when

ALL four jq rules return true (AND-gated):

  1. .success == true: The coding agent completed successfully (lint + tests passed after agent finished)
  2. .budget_exceeded == false: The agent did not exceed its cost budget (CODING_MAX_COST, default $15.00)
  3. (.commit_sha | length) >= 7: A commit SHA of at least 7 characters is present (7 is Git's conventional short-SHA length). The rule checks length only, not that the value is hexadecimal or resolves to a real commit
  4. (.files_changed | length) > 0: The agent actually modified at least one file

Additionally, the JSON schema enforces:

  • All required fields are present (success, issue_number, issue_id, branch, commit_sha, files_changed, lint_passed, tests_passed, budget_exceeded, timestamp)
  • Type constraints (booleans are booleans, integers have minimums, strings have minimum lengths)
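
To make the AND-gating concrete, the four jq rules can be mirrored in Python. This is an illustrative stand-in, not how Kosli evaluates the rules; `jq_rules` and `is_compliant` are hypothetical names. Missing fields are treated the way jq treats `null` (`null == true` is false, and `null | length` is 0).

```python
def jq_rules(payload: dict) -> dict[str, bool]:
    """Evaluate Python equivalents of the four jq rules, keyed by the jq expression."""
    return {
        ".success == true": payload.get("success") is True,
        ".budget_exceeded == false": payload.get("budget_exceeded") is False,
        # `or ""` / `or []` mirror jq, where `null | length` evaluates to 0.
        "(.commit_sha | length) >= 7": len(payload.get("commit_sha") or "") >= 7,
        "(.files_changed | length) > 0": len(payload.get("files_changed") or []) > 0,
    }


def is_compliant(payload: dict) -> bool:
    """All four rules must hold (AND-gated)."""
    return all(jq_rules(payload).values())


example = {
    "success": True,
    "budget_exceeded": False,
    "commit_sha": "0123abc",
    "files_changed": ["src/app.py"],
}
print(is_compliant(example))                               # True
print(is_compliant({**example, "budget_exceeded": True}))  # False
```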

Non-compliant when

  • Agent failed (success=false) — lint or tests did not pass after retries
  • Agent exceeded budget (budget_exceeded=true) — cost overrun
  • No commit SHA or SHA too short — agent did not produce a valid commit
  • No files changed — agent ran but made no modifications
  • Missing required fields — schema validation fails
  • Wrong field types — schema validation fails

Threshold source

  • jq rules: Defined in kosli/attestation-types/setup.sh lines 271-278
  • Schema: kosli/attestation-types/schemas/coding-agent-result.json
  • Budget limit: CODING_MAX_COST env var in workflow (default 15.0), checked by the coding agent itself
  • No Rego policy: This is a direct custom type attestation, not a Rego-evaluated control

Edge Cases & Failure Modes

Check Cannot Run

Coding result file missing: If coding-result-GH{N}.json does not exist, the attestation step prints a warning and exits with code 0 (lines 384-387). The attestation slot remains unfilled, which makes the trail non-compliant. This is safe (the step fails open, the trail fails closed), but it leaves no diagnostic error in the Kosli trail: the slot is simply absent.

Artifact download fails: actions/download-artifact@v4 has continue-on-error: true (line 373). If the code job did not produce a result file (crash, timeout), the download silently fails, and the missing-file guard above triggers.

Kosli API unavailable: The kosli attest custom call (lines 423-431) has no || true wrapper. If Kosli is down, the step fails hard, which is correct behavior — the trail cannot record the attestation without Kosli.

Trail not begun: If the "Create Kosli build flow and begin trail" step (lines 335-365) failed, KOSLI_BUILD_FLOW and KOSLI_BUILD_TRAIL env vars are empty. The kosli attest custom call will fail with an API error.

Partial Evidence

If the coding agent crashes mid-run and writes a partial coding-result-GH{N}.json (e.g., missing files_changed), the Python inline script will default missing fields to empty values ([] for files_changed, False for success). The jq rules will then correctly mark it non-compliant because (.files_changed | length) > 0 will fail.

If the agent succeeds at coding but the result JSON has success: true while lint_passed: false (inconsistent state), the jq rules will still pass because they only check .success, not .lint_passed directly. This is a gap — see Assessment.
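
Both cases can be reproduced in a few lines. The sketch below assumes a simplified payload builder: only the two defaults named above (`False` for success, `[]` for files_changed) are taken from the workflow, while `build_payload` and `passes_jq_rules` are hypothetical names and the real inline script may differ.

```python
from datetime import datetime, timezone


def build_payload(result: dict, issue_id: str) -> dict:
    """Copy the agent's result and fill the two defaults described above."""
    payload = dict(result)
    payload.setdefault("success", False)     # default named in the text
    payload.setdefault("files_changed", [])  # default named in the text
    payload["issue_id"] = issue_id
    payload["timestamp"] = datetime.now(timezone.utc).isoformat()
    return payload


def passes_jq_rules(payload: dict) -> bool:
    """Python stand-in for the four jq rules; lint_passed/tests_passed are not consulted."""
    return (
        payload.get("success") is True
        and payload.get("budget_exceeded") is False
        and len(payload.get("commit_sha") or "") >= 7
        and len(payload.get("files_changed") or []) > 0
    )


# Crash before files_changed was written: the default [] fails the fourth rule.
partial = build_payload(
    {"success": True, "commit_sha": "0123abc", "budget_exceeded": False}, "GH123"
)
print(passes_jq_rules(partial))  # False

# Inconsistent state: success is true although lint failed, and the rules still pass.
inconsistent = build_payload(
    {"success": True, "lint_passed": False, "tests_passed": True,
     "commit_sha": "0123abc", "budget_exceeded": False,
     "files_changed": ["src/app.py"]},
    "GH123",
)
print(passes_jq_rules(inconsistent))  # True
```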

Bypass Vectors

| Vector | Severity | Details |
| --- | --- | --- |
| Fabricated result JSON | HIGH | The coding agent writes coding-result-GH{N}.json itself. A compromised agent or modified scripts.coding.agent module could write success: true regardless of actual lint/test results. The jq rules trust the self-reported payload. |
| Shared KOSLI_API_TOKEN | HIGH | Anyone with repo write access can use the token to attest arbitrary payloads to any flow/trail. No flow-scoped RBAC prevents a malicious actor from directly calling kosli attest custom with a forged payload. |
| Artifact upload/download substitution | MEDIUM | GitHub Actions artifacts are scoped to the workflow run, but a compromised code job could upload a falsified result file. The finalize job has no way to verify the authenticity of the downloaded artifact beyond trusting the Actions runtime. |
| \|\| true on agent invocation | MEDIUM | The coding agent can crash and still produce a partial result that gets attested. |
| Timestamp is attestation-time, not agent-time | LOW | The timestamp field is set when the payload is assembled in finalize, not when the agent actually ran. There could be a significant time gap. |
| lint_passed and tests_passed not in jq rules | MEDIUM | These fields are in the schema but not enforced by jq. An agent could report success: true, lint_passed: false, tests_passed: false and the attestation would be compliant. |

False Positive Risk

  • Self-reported success: The agent marks itself successful. If the agent's internal lint/test verification is buggy or incomplete, the attestation may be compliant despite real failures. The downstream lint-control and unit-test-control attestations (on the artifact) partially mitigate this by re-running checks in CI.

False Negative Risk

  • Transient test flakiness: If the agent's local test run encounters a flaky test failure, success will be false and the attestation non-compliant, even though the code is correct. The retry mechanism (retries field) partially mitigates this.
  • Budget threshold too low: If CODING_MAX_COST is set too aggressively for complex issues, legitimate work may be flagged as budget-exceeded.

TOCTOU Gaps

  • Code changes after agent completes: The code job pushes the feature branch (lines 100-105) after the coding agent runs. The commit_sha in the result file refers to the agent's last commit. If any subsequent step in the code job modifies files (none currently do), the attested SHA would be stale.
  • Review job may push fixes: The review job's resolver can push additional commits (lines 228-229). The coding-agent-completed attestation records the pre-review SHA. This is by design: the coding agent's result is attested separately from review-time fixes.
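
A staleness check for the first gap could compare the attested SHA with the branch head at attestation time. The sketch below is hypothetical and not part of the workflow; it assumes the caller obtains `head_sha` separately, for example from `git rev-parse HEAD` on the feature branch, and it accepts abbreviated SHAs down to the seven characters the jq rule allows.

```python
def attested_sha_is_current(attested_sha: str, head_sha: str) -> bool:
    """True when the attested (possibly abbreviated) SHA still identifies the head commit."""
    attested = attested_sha.strip().lower()
    head = head_sha.strip().lower()
    # Seven characters matches the minimum length required by the jq rule.
    return len(attested) >= 7 and head.startswith(attested)


head = "0123abc4567890def0123abc4567890def012345"
print(attested_sha_is_current("0123abc", head))  # True
print(attested_sha_is_current("fedcba9", head))  # False
```

After the review job pushes fixes, a mismatch is expected rather than an error, so a check like this would only be meaningful inside the code job.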

Single Points of Failure

  1. coding-result-GH{N}.json — The entire attestation depends on this single file. If it is corrupted, truncated, or absent, no attestation is recorded.
  2. Self-reported evidence — The coding agent is both the executor and the evidence producer. No independent verification of lint/test results at attestation time.
  3. GitHub Actions artifact transfer — Cross-job evidence depends on the Actions artifact system. No checksum or signature verification on download.
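
The missing checksum in item 3 could be addressed along the following lines. This is a minimal sketch using only the standard library; `sha256_of` and `artifact_matches` are hypothetical helpers, and how the expected digest would travel from the code job to the finalize job is an assumption left open here.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A digest stored next to the artifact only detects accidental corruption. To detect substitution, it would need to reach the finalize job through a separate channel, and it still would not protect against a compromised code job, which can recompute the digest for a falsified file.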

Dependencies

Upstream

| Dependency | Type | Provides |
| --- | --- | --- |
| code job | CI job | coding-result-GH{N}.json artifact, branch, success, issue_id outputs |
| review job | CI job | verdict output (must be ACCEPTED for attestation to run) |
| scripts.coding.agent | Python module | Produces the coding result JSON |
| actions/upload-artifact@v4 | GitHub Action | Transfers result file from code to finalize job |
| actions/download-artifact@v4 | GitHub Action | Retrieves result file in finalize job |
| Kosli build flow + trail | Kosli | Must be created before attestation (step "Create Kosli build flow and begin trail") |
| KOSLI_API_TOKEN | Secret | Authentication for Kosli API |
| ISSUE_NUMBER, ISSUE_ID | Env vars | Issue identification for file naming and payload |

Downstream

| Consumer | Impact of Failure |
| --- | --- |
| Kosli trail compliance | Unfilled slot → trail NON-COMPLIANT (entire build trail blocked) |
| CI workflow dispatch | Triggered after this attestation (lines 433-454); runs regardless of attestation outcome |
| Artifact-level attestations | Not directly dependent, but share the same trail |
| code-review-control | Indirectly depends on coding agent having produced a valid commit |

Cross-Flow Bridge

  1. Code job (same workflow) produces coding-result-GH{N}.json with agent outcome, cost, and commit SHA
  2. Finalize job downloads the result, constructs a schema-compliant payload, and attests to the Build flow trail
  3. Build trail requires this slot filled and compliant before the trail can be marked compliant
  4. The commit_sha in this attestation is the same commit that the review flow reviewed and that CI will build — establishing provenance linkage across flows

Assessment

Implementation Match: PARTIAL

The attestation correctly captures the coding agent's outcome and enforces the four key invariants (success, budget, commit, files). The schema is well-designed with appropriate types and constraints. However, the implementation has a structural weakness: lint_passed and tests_passed are required schema fields but are NOT enforced by jq rules. The success field is a self-reported composite that is supposed to reflect lint+test outcomes, but there is no independent verification at attestation time.

Evidence Sufficiency: ADEQUATE WITH GAPS

The evidence is sufficient for most audit scenarios — the payload captures what the agent did, how much it cost, and what it produced. Gaps include:

  1. No independent verification of self-reported fields (success, lint_passed, tests_passed)
  2. Timestamp is attestation-time, not agent-completion-time (temporal accuracy gap)
  3. No model identity in the payload or jq rules (which Claude model version produced the code)
  4. No diff size or complexity metrics (how much code was generated)
  5. Cost fields (claude_cost_usd, max_cost_budget_usd) are optional in schema despite budget_exceeded being required — allows attesting budget compliance without disclosing the actual cost
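
The fifth gap can be illustrated with a consistency check that refuses a budget claim unless both cost figures are disclosed and agree with it. This is a sketch under one stated assumption: "exceeded" is taken to mean cost strictly greater than budget, which the source does not specify. `budget_claim_is_supported` is a hypothetical name.

```python
def budget_claim_is_supported(payload: dict) -> bool:
    """True only if both cost fields are present and agree with budget_exceeded."""
    claimed = payload.get("budget_exceeded")
    cost = payload.get("claude_cost_usd")
    budget = payload.get("max_cost_budget_usd")
    if not isinstance(claimed, bool):
        return False
    for value in (cost, budget):
        # Reject missing values and booleans masquerading as numbers.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
    # Assumption: the budget is exceeded when cost is strictly greater than it.
    return claimed == (cost > budget)


print(budget_claim_is_supported(
    {"budget_exceeded": False, "claude_cost_usd": 3.2, "max_cost_budget_usd": 15.0}
))  # True
print(budget_claim_is_supported({"budget_exceeded": False}))  # False
```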

Gaps

  1. HIGH: Self-reported success is the sole compliance signal. The jq rules check .success == true but this value is written by the coding agent itself. A buggy or adversarial agent can write success: true regardless of actual lint/test outcomes. Mitigation: downstream CI re-runs lint and tests, but by that point the attestation is already compliant.
  2. HIGH: lint_passed and tests_passed not enforced by jq rules. These are required schema fields but have no compliance impact. An attestation with success: true, lint_passed: false would be compliant. Add jq rules: .lint_passed == true and .tests_passed == true.
  3. HIGH: Shared KOSLI_API_TOKEN allows forged attestations. Any workflow or actor with the token can attest arbitrary payloads. No flow-scoped or trail-scoped access control.
  4. MEDIUM: Silent exit on missing result file. The step exits 0 when the result file is missing (lines 384-387), so the workflow succeeds while the attestation is simply absent. The trail remains non-compliant, but the CI run gives no explicit failure signal.
  5. MEDIUM: || true on agent invocation masks crashes. The coding agent can crash and still produce a partial result that gets attested. Remove || true or validate result file completeness.
  6. LOW: Timestamp mismatch. The attestation timestamp is set at payload construction time, not agent completion time. Add agent_completed_at to the payload.
  7. LOW: No model version in the attestation. The schema does not require model or model_version fields, and no jq rule checks them. The workflow hardcodes CODING_MODEL: claude-sonnet-4-20250514 but this is not recorded in the attestation.
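
The effect of gap 2 and of its proposed fix can be shown by evaluating the same payload against the current rule set and against one extended with the two suggested rules. The Python predicates below are stand-ins for the jq expressions, and `failed_rules` is a hypothetical helper.

```python
from typing import Callable

Rule = Callable[[dict], bool]

CURRENT_RULES: dict[str, Rule] = {
    ".success == true": lambda p: p.get("success") is True,
    ".budget_exceeded == false": lambda p: p.get("budget_exceeded") is False,
    "(.commit_sha | length) >= 7": lambda p: len(p.get("commit_sha") or "") >= 7,
    "(.files_changed | length) > 0": lambda p: len(p.get("files_changed") or []) > 0,
}

# The two additional rules suggested in gap 2.
PROPOSED_RULES: dict[str, Rule] = {
    **CURRENT_RULES,
    ".lint_passed == true": lambda p: p.get("lint_passed") is True,
    ".tests_passed == true": lambda p: p.get("tests_passed") is True,
}


def failed_rules(payload: dict, rules: dict[str, Rule]) -> list[str]:
    """Return the jq expressions whose Python equivalent is false for this payload."""
    return [name for name, check in rules.items() if not check(payload)]


payload = {
    "success": True, "lint_passed": False, "tests_passed": True,
    "budget_exceeded": False, "commit_sha": "0123abc",
    "files_changed": ["src/app.py"],
}
print(failed_rules(payload, CURRENT_RULES))   # []
print(failed_rules(payload, PROPOSED_RULES))  # ['.lint_passed == true']
```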

Recommendations

  1. Add jq rules for lint_passed and tests_passed: .lint_passed == true and .tests_passed == true — these fields are already required by schema but have no compliance enforcement.
  2. Add result file validation: Before constructing the payload, validate that the result JSON contains all required fields with non-default values. Fail the step (non-zero exit) if validation fails.
  3. Investigate flow-scoped KOSLI_API_TOKEN or separate tokens for different flows to prevent cross-flow attestation forgery.
  4. Remove || true from agent invocation or add explicit post-run validation that the result file is complete and internally consistent.
  5. Add model_version to required schema fields and jq rules to establish model provenance.
  6. Record actual cost values as required fields (not optional) to support cost auditing.
  7. Add a checksum to the uploaded artifact to detect tampering during cross-job transfer.
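
Recommendations 2 and 4 could be combined into a single gate that runs before the payload is built. The sketch below is one possible shape, not the workflow's current behavior: `check_result_file` and `REQUIRED_KEYS` are hypothetical, and the key list is limited to the required fields the payload table sources from the result file.

```python
import json
import sys
from pathlib import Path

# Required payload fields that are sourced from the result file.
REQUIRED_KEYS = ("success", "branch", "commit_sha", "files_changed",
                 "lint_passed", "tests_passed", "budget_exceeded")


def check_result_file(path: Path) -> int:
    """Return a process exit code: 0 if the result file is usable, 1 otherwise."""
    if not path.is_file():
        print(f"error: {path} is missing", file=sys.stderr)
        return 1
    try:
        result = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        print(f"error: {path} is not valid JSON: {exc}", file=sys.stderr)
        return 1
    if not isinstance(result, dict):
        print(f"error: {path} does not contain a JSON object", file=sys.stderr)
        return 1
    missing = [key for key in REQUIRED_KEYS if key not in result]
    if missing:
        print(f"error: missing fields: {', '.join(missing)}", file=sys.stderr)
        return 1
    # Internal consistency: success should imply that lint and tests passed.
    if result["success"] and not (result["lint_passed"] and result["tests_passed"]):
        print("error: success is true but lint or tests did not pass", file=sys.stderr)
        return 1
    return 0
```

Calling `sys.exit(check_result_file(Path("coding-result-GH123.json")))` at the start of the attestation step would turn a missing, truncated, or inconsistent result file into an explicit CI failure instead of a silent exit 0.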

Verdict: NEEDS IMPROVEMENT

The attestation covers the right conceptual ground — it gates on agent success, budget compliance, commit provenance, and file modification. The schema design is solid. However, the self-reported nature of the evidence, the gap between schema requirements and jq enforcement (lint_passed/tests_passed), and the shared API token create meaningful compliance and security risks. The downstream CI checks partially compensate, but an auditor would flag the self-attestation pattern and the unenforced schema fields as control weaknesses requiring remediation.

Labels: attestation-audit (Attestation evidence audit review), build-flow (Build flow attestation), priority:high (Compliance-critical control)
