complete-visibility audit chain: every SDK action is HMAC-chained

scotty595 · claude · scotty595 · commit ba467f49d4fc · 2026-04-15T12:59:46.000+01:00
Until now, createIntegrityAudit() was a separate opt-in wrapper that only
covered explicit integrity.log() calls. That meant "tamper-evident audit"
quietly excluded register(), enforce(), audit.log(), kill-switch — the
actual surface of the SDK. Fix that.

New: `createGovernance({ integrityAudit: { signingKey, onFailure } })`
- One config flag turns on HMAC-SHA256 hash chaining for EVERY audit
  write the SDK makes:
    agent_registered, policy_evaluation, policy_evaluation_preprocess/
    postprocess, audit.log() (any eventType), kill-switch events, and
    the new action_outcome type.
- onFailure: "allow" (default) reports a failed chain write via
  onAuditError and continues; "block" rejects enforce() when the chain
  write fails, so the chain has no gaps.
- Sequence is race-free — chainLock serialises concurrent writes.
- Chain state is only committed after storage confirms the write, so a
  rejected storage call doesn't advance the chain.
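
The chaining rules above (HMAC over sequence + previous hash + canonical
body, serialised writes, state committed only after storage confirms) can be
sketched roughly as below. This is an illustration under assumptions, not the
SDK's implementation: `createChain`, `StoreFn`, `ChainedEvent`, the `|`-joined
hash input, and the all-zero genesis hash are hypothetical names and choices.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical shapes for illustration; the SDK's real event format may differ.
interface ChainedEvent {
  sequence: number;
  previousHash: string;
  hash: string;
  body: Record<string, unknown>;
}
type StoreFn = (event: ChainedEvent) => Promise<void>;

const GENESIS_HASH = "0".repeat(64); // assumed placeholder for "no previous event"

// Deterministic JSON: object keys are sorted at every nesting level.
function canonicalise(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonicalise(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const fields = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalise(obj[k])}`);
    return `{${fields.join(",")}}`;
  }
  return JSON.stringify(value ?? null);
}

function createChain(signingKey: string, store: StoreFn) {
  let latestSequence = 0;
  let latestHash = GENESIS_HASH;
  let chainLock: Promise<void> = Promise.resolve();

  function append(body: Record<string, unknown>): Promise<ChainedEvent> {
    // Queue behind any in-flight write so sequence numbers never race.
    const write = chainLock.then(async () => {
      const sequence = latestSequence + 1;
      const hash = createHmac("sha256", signingKey)
        .update(`${sequence}|${latestHash}|${canonicalise(body)}`)
        .digest("hex");
      const event: ChainedEvent = { sequence, previousHash: latestHash, hash, body };
      await store(event); // a rejection here skips the two assignments below
      latestSequence = sequence;
      latestHash = hash;
      return event;
    });
    // Keep the lock usable after a failed write; the caller still sees the rejection.
    chainLock = write.then(() => undefined, () => undefined);
    return write;
  }

  const stats = () => ({ latestSequence, latestHash, algorithm: "HMAC-SHA256" });
  return { append, stats };
}
```

In this sketch, two concurrent `append()` calls receive consecutive sequence
numbers, and a rejected `store()` leaves `latestSequence` and `latestHash`
untouched, so the next successful write links to the last stored event.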

New: `gov.recordOutcome(outcome)` + `runWithOutcome(gov, opts, action)`
- recordOutcome writes an action_outcome event that captures what
  happened AFTER enforce() approved the action: success/failure,
  duration, tokens, output summary (or error message).
- runWithOutcome wraps an async function and records success/failure
  automatically, with optional summarize() for output redaction before
  the outcome hits the audit log.
- Both participate in the integrity chain when integrityAudit is on.
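
runWithOutcome records the outcome fire-and-forget, and the comment in
action-recorder.ts points callers who care about write failures toward
awaiting recordOutcome directly. A minimal sketch of such an awaited variant
follows. `OutcomeRecorder`, `Outcome`, and `runWithAwaitedOutcome` are
hypothetical stand-ins defined here so the example runs on its own; the real
`GovernanceInstance` and `ActionOutcome` types carry more fields.

```typescript
// Hypothetical minimal shapes, not the SDK's exported types.
interface Outcome {
  agentId: string;
  tool?: string;
  success: boolean;
  durationMs: number;
  output?: unknown;
  error?: string;
}
interface OutcomeRecorder {
  recordOutcome(outcome: Outcome): Promise<void>;
}
interface AwaitedOptions<T> {
  agentId: string;
  tool?: string;
  summarize?: (result: T) => unknown;
}

async function runWithAwaitedOutcome<T>(
  recorder: OutcomeRecorder,
  options: AwaitedOptions<T>,
  action: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  let result: T;
  try {
    result = await action();
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    try {
      await recorder.recordOutcome({
        agentId: options.agentId,
        tool: options.tool,
        success: false,
        durationMs: Date.now() - startedAt,
        error,
      });
    } catch {
      // Recording failed too; the caller still gets the original action error.
    }
    throw err;
  }
  // Outside the try block: a throwing summarize() or a rejected write is
  // surfaced as itself and never recorded as an action failure.
  const output = options.summarize ? options.summarize(result) : result;
  await recorder.recordOutcome({
    agentId: options.agentId,
    tool: options.tool,
    success: true,
    durationMs: Date.now() - startedAt,
    output,
  });
  return result;
}
```

The trade-off in this sketch is that a rejected outcome write on the success
path rejects the whole call even though the action itself succeeded, which is
what a fail-closed caller would want and what the fire-and-forget helper
deliberately avoids.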

New: `gov.integrityChain` (populated only when integrityAudit is set)
- .export(filters?) returns IntegrityAuditEvent[] joined against the
  in-memory integrity index, ready for verifyAuditIntegrity.
- .stats() returns latestSequence / latestHash / algorithm.

Tests: 10 new tests (+6 end-to-end chain coverage, +4 action-recorder).
- E2E: chains register + enforce + recordOutcome + audit.log + kill into
  one chain, exports it, and verifies with the standalone verifier.
- Tamper detection: mutating an outcome event's detail breaks verify at
  the right position.
- Fail-open and fail-closed behaviours both covered.
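
What "breaks verify at the right position" means can be illustrated with a
small standalone verifier. This is a sketch under assumptions, not the code
behind `verifyAuditIntegrity`: the event shape, the `|`-joined hash input, the
genesis hash, and the names `buildChain` and `verifyChain` are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical event shape and hash layout, for illustration only.
interface ChainedEvent {
  sequence: number;
  previousHash: string;
  hash: string;
  body: Record<string, unknown>;
}

const GENESIS_HASH = "0".repeat(64);

// Deterministic JSON: object keys are sorted at every nesting level.
function canonicalise(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonicalise(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const fields = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalise(obj[k])}`);
    return `{${fields.join(",")}}`;
  }
  return JSON.stringify(value ?? null);
}

function linkHash(key: string, sequence: number, previousHash: string, body: Record<string, unknown>): string {
  return createHmac("sha256", key)
    .update(`${sequence}|${previousHash}|${canonicalise(body)}`)
    .digest("hex");
}

// Builds a small chain to verify against; stands in for an exported chain.
function buildChain(key: string, bodies: Record<string, unknown>[]): ChainedEvent[] {
  const chain: ChainedEvent[] = [];
  let previousHash = GENESIS_HASH;
  for (const body of bodies) {
    const sequence = chain.length + 1;
    const hash = linkHash(key, sequence, previousHash, body);
    chain.push({ sequence, previousHash, hash, body });
    previousHash = hash;
  }
  return chain;
}

function constantTimeEqual(a: string, b: string): boolean {
  const left = Buffer.from(a);
  const right = Buffer.from(b);
  // timingSafeEqual throws on a length mismatch, so check the length first.
  return left.length === right.length && timingSafeEqual(left, right);
}

function verifyChain(chain: ChainedEvent[], key: string): { valid: boolean; brokenAt?: number } {
  let previousHash = GENESIS_HASH;
  let position = 0;
  for (const event of chain) {
    position += 1;
    const expected = linkHash(key, position, previousHash, event.body);
    const intact =
      event.sequence === position &&
      event.previousHash === previousHash &&
      constantTimeEqual(event.hash, expected);
    if (!intact) return { valid: false, brokenAt: position };
    previousHash = event.hash;
  }
  return { valid: true };
}
```

With a three-event chain, editing the body of the third event makes
`verifyChain` report `brokenAt: 3`, and removing the second event reports
`brokenAt: 2` because the sequence numbers no longer line up. Dropping only
the last event still verifies, which is the truncation limitation the README
caveats describe.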

Total: 1348 → 1358, zero regressions, lint + build clean.

README updated with an honest "what gets chained / what does NOT" table
so readers don't assume 'every agent action' is covered when what we
mean is 'every action routed through the SDK.'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/README.md b/README.md
@@ -254,23 +254,54 @@ sequence number + canonicalised event body**, so any edit, deletion, or
 reorder-via-sequence-renumbering breaks verification. Constant-time hash
 comparison throughout — no timing oracle.
 
-**Opt-in**, not on by default. The core `gov.enforce()` path writes audit
-events directly via your storage adapter. Wrap it with `createIntegrityAudit()`
-to start a hash-chained log, and use the standalone `verifyAuditIntegrity()`
-to re-verify an exported chain offline (e.g. on a separate auditor machine).
+**Opt-in via a single config flag.** Pass `integrityAudit: { signingKey }` to
+`createGovernance()` and every audit write the SDK makes is chained
+automatically — no separate wrapper, no ceremony:
 
 ```typescript
-import { createIntegrityAudit, verifyAuditIntegrity } from 'governance-sdk/audit-integrity';
+import { createGovernance, runWithOutcome } from 'governance-sdk';
+import { verifyAuditIntegrity } from 'governance-sdk/audit-integrity-verify';
 
-const integrity = createIntegrityAudit(gov, { signingKey: process.env.AUDIT_SECRET! });
-await integrity.log({ agentId: 'bot', eventType: 'tool_call', outcome: 'allow', severity: 'info' });
+const gov = createGovernance({
+  rules: [/* ... */],
+  integrityAudit: {
+    signingKey: process.env.AUDIT_SECRET!,
+    onFailure: 'allow',   // or 'block' to fail-closed on chain errors
+  },
+});
+
+// Every one of these is HMAC-chained:
+await gov.register({ name: 'sales-bot', framework: 'mastra', owner: 'team' });
+await gov.enforce({ agentId, action: 'tool_call', tool: 'search' });
+
+// Close the decision → outcome loop with runWithOutcome():
+const result = await runWithOutcome(gov, { agentId, tool: 'search' }, async () => {
+  return await searchApi.query(q);
+});
+// ↑ success (or failure, with error + duration) auto-recorded in the chain
 
-// Anywhere with the chain snapshot + the shared secret:
-const snapshot = await integrity.export();
-const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(snapshot, process.env.AUDIT_SECRET!);
-// => { valid: false, brokenAt: 42, breakDetail: 'Hash mismatch at sequence 42: event <id> content has been modified' }
+// Verify the chain offline, anywhere, with just the secret:
+const chain = await gov.integrityChain!.export();
+const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(chain, process.env.AUDIT_SECRET!);
 ```
 
+**What gets chained (when `integrityAudit` is set):**
+
+| Event type | Written by | What it captures |
+|---|---|---|
+| `agent_registered` | `gov.register()` | name, framework, owner, initial score |
+| `policy_evaluation` | `gov.enforce()` | agent, action, tool, rule matched, outcome, reason |
+| `policy_evaluation_preprocess` / `_postprocess` | `gov.enforcePreprocess()` / `Postprocess()` | stage-scoped enforcement result |
+| `action_outcome` | `gov.recordOutcome()` or `runWithOutcome()` | success / failure, duration, tokens, output summary, error |
+| `agent_killed` | `killSwitch.kill()` | agent, reason, killedBy |
+| *(caller-supplied)* | `gov.audit.log()` | anything you pass — custom LLM calls, approvals, etc. |
+
+**What is NOT automatically chained:** anything you log directly via
+`storage.createAuditEvent()` (bypasses the chain), anything your host app
+does outside governance (raw `fetch()`, filesystem I/O without going through
+a governed tool), and any agent activity between `enforce()` calls that is
+never reported through `enforce()` or `recordOutcome()`.
+
 **Honest caveats:**
 
 - Plain HMAC chains are only tamper-evident to holders of the signing secret.
@@ -282,6 +313,10 @@ const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(snapshot, pr
   anchor — a chain of N events truncated to N-1 events still verifies as a
   consistent chain of N-1 events. The adversarial test suite documents this
   limitation explicitly.
+- `integrityAudit.onFailure: 'allow'` (default) means a storage failure
+  creates a chain gap that `verifyAuditIntegrity` will detect; set
+  `'block'` to reject the enforce() call instead when you can't tolerate
+  gaps.
 
 ### Kill Switch
 
@@ -667,7 +702,9 @@ governance-sdk/injection-classifier        pluggable ML classifier interface
 governance-sdk/injection-benchmark         LIB — 6.9K-sample benchmark runner
 
 # Audit + identity
-governance-sdk/audit-integrity             HMAC hash-chain verification
+governance-sdk/audit-integrity             HMAC hash-chain primitives (createIntegrityAudit, verifyAuditIntegrity)
+governance-sdk/audit-integrity-verify      standalone chain verifier (for offline audit)
+governance-sdk/action-recorder             runWithOutcome() — record action success/failure into the chain
 governance-sdk/agent-identity              agent identity tokens
 governance-sdk/agent-identity-ed25519      Ed25519 signing + verification
 governance-sdk/kill-switch                 priority-999 emergency halt
diff --git a/packages/governance/package.json b/packages/governance/package.json
@@ -161,6 +161,10 @@
       "types": "./dist/audit-integrity-verify.d.ts",
       "import": "./dist/audit-integrity-verify.js"
     },
+    "./action-recorder": {
+      "types": "./dist/action-recorder.d.ts",
+      "import": "./dist/action-recorder.js"
+    },
     "./plugins/mcp-trust": {
       "types": "./dist/plugins/mcp-trust.d.ts",
       "import": "./dist/plugins/mcp-trust.js"
diff --git a/packages/governance/src/action-recorder.test.ts b/packages/governance/src/action-recorder.test.ts
@@ -0,0 +1,90 @@
+import { describe, it } from "node:test";
+import assert from "node:assert/strict";
+import { createGovernance, runWithOutcome } from "./index";
+import { verifyAuditIntegrity } from "./audit-integrity-verify";
+
+describe("runWithOutcome", () => {
+  it("records a success outcome and returns the action's result", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "x", framework: "mastra", owner: "t" });
+    const result = await runWithOutcome(
+      gov,
+      { agentId: agent.id, tool: "search" },
+      async () => ({ hits: 3 }),
+    );
+    assert.deepEqual(result, { hits: 3 });
+
+    // Give the fire-and-forget outcome write a tick to land.
+    await new Promise((r) => setImmediate(r));
+
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    assert.ok(events.length >= 1);
+    const latest = events[events.length - 1];
+    assert.equal(latest.outcome, "success");
+    assert.equal((latest.detail as { tool?: string }).tool, "search");
+  });
+
+  it("records a failure outcome and re-throws the error", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "y", framework: "mastra", owner: "t" });
+    const err = new Error("timeout");
+    await assert.rejects(
+      runWithOutcome(
+        gov,
+        { agentId: agent.id, tool: "slow_api" },
+        async () => {
+          throw err;
+        },
+      ),
+      /timeout/,
+    );
+
+    await new Promise((r) => setImmediate(r));
+
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    const failure = events.find((e) => e.outcome === "failure");
+    assert.ok(failure, "failure outcome not recorded");
+    assert.equal((failure!.detail as { error?: string }).error, "timeout");
+  });
+
+  it("applies summarize() to redact output before logging", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "z", framework: "mastra", owner: "t" });
+
+    await runWithOutcome(
+      gov,
+      {
+        agentId: agent.id,
+        tool: "sensitive_api",
+        summarize: (r) => ({ redacted: true, size: JSON.stringify(r).length }),
+      },
+      async () => ({ ssn: "123-45-6789", name: "Alice" }),
+    );
+    await new Promise((r) => setImmediate(r));
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    const latest = events[events.length - 1];
+    const output = (latest.detail as { output?: { redacted?: boolean; size?: number } }).output;
+    assert.equal(output?.redacted, true);
+    assert.ok(typeof output?.size === "number");
+  });
+
+  it("outcome event joins the integrity chain end-to-end", async () => {
+    const KEY = "wrap-test-key";
+    const gov = createGovernance({ integrityAudit: { signingKey: KEY } });
+    const agent = await gov.register({ name: "w", framework: "mastra", owner: "t" });
+
+    await runWithOutcome(
+      gov,
+      { agentId: agent.id, tool: "search" },
+      async () => ({ hits: 5 }),
+    );
+    // Await chain serialisation
+    await new Promise((r) => setTimeout(r, 10));
+
+    const chain = await gov.integrityChain!.export();
+    const hasOutcome = chain.some((e) => e.eventType === "action_outcome");
+    assert.ok(hasOutcome, "outcome event should be in the integrity chain");
+    const verified = await verifyAuditIntegrity(chain, KEY);
+    assert.equal(verified.valid, true);
+  });
+});
diff --git a/packages/governance/src/action-recorder.ts b/packages/governance/src/action-recorder.ts
@@ -0,0 +1,92 @@
+/**
+ * governance-sdk — Action outcome recorder
+ *
+ * Thin helper that wraps an action function so its success/failure/
+ * duration/output is automatically recorded via `gov.recordOutcome()`.
+ * Use in framework adapters or user code to close the loop between
+ * "enforce() approved this" and "here's what actually happened."
+ *
+ * When `integrityAudit` is configured, the recorded outcome is
+ * HMAC-chained alongside every other SDK audit write — so the chain
+ * covers the full decision → outcome cycle, not just decisions.
+ *
+ * @example
+ * ```ts
+ * import { runWithOutcome } from 'governance-sdk/action-recorder';
+ *
+ * const result = await runWithOutcome(gov, { agentId, tool: 'search' }, async () => {
+ *   return searchApi.query(q);
+ * });
+ * // outcome event automatically recorded (success OR failure)
+ * ```
+ */
+
+import type { GovernanceInstance, ActionOutcome } from "./index.js";
+
+export interface RunWithOutcomeOptions {
+  agentId: string;
+  tool?: string;
+  action?: string;
+  policyRuleId?: string;
+  /**
+   * Optional transform for the successful result — lets you redact output
+   * before it hits the audit log. Return `undefined` to skip output capture.
+   */
+  summarize?: (result: unknown) => unknown;
+  /** Optional extra fields to include in the outcome detail. */
+  detail?: Record<string, unknown>;
+  /** Token count if this was an LLM call. */
+  tokensUsed?: number;
+}
+
+/**
+ * Run an async action and record its outcome to the governance audit chain.
+ * Re-throws the underlying error after recording, so failures propagate
+ * exactly as they would without this wrapper.
+ */
+export async function runWithOutcome<T>(
+  governance: GovernanceInstance,
+  options: RunWithOutcomeOptions,
+  action: () => Promise<T>,
+): Promise<T> {
+  const startedAt = Date.now();
+  try {
+    const result = await action();
+    const durationMs = Date.now() - startedAt;
+    const summary = options.summarize ? options.summarize(result) : result;
+    const outcome: ActionOutcome = {
+      agentId: options.agentId,
+      tool: options.tool,
+      action: options.action,
+      success: true,
+      durationMs,
+      output: summary,
+      policyRuleId: options.policyRuleId,
+      tokensUsed: options.tokensUsed,
+      detail: options.detail,
+    };
+    // Fire-and-forget: we don't want outcome logging failures to corrupt
+    // the user's successful action. Even when integrityAudit.onFailure is
+    // "block", a rejection from recordOutcome is swallowed by the catch
+    // below; callers who need to observe it can await recordOutcome
+    // directly instead of using this helper.
+    governance.recordOutcome(outcome).catch(() => { /* swallowed */ });
+    return result;
+  } catch (err) {
+    const durationMs = Date.now() - startedAt;
+    const message = err instanceof Error ? err.message : String(err);
+    const outcome: ActionOutcome = {
+      agentId: options.agentId,
+      tool: options.tool,
+      action: options.action,
+      success: false,
+      durationMs,
+      error: message,
+      policyRuleId: options.policyRuleId,
+      tokensUsed: options.tokensUsed,
+      detail: options.detail,
+    };
+    governance.recordOutcome(outcome).catch(() => { /* swallowed */ });
+    throw err;
+  }
+}
diff --git a/packages/governance/src/audit-chain-e2e.test.ts b/packages/governance/src/audit-chain-e2e.test.ts
diff --git a/packages/governance/src/index.ts b/packages/governance/src/index.ts