Commit ba467f4

scotty595 and claude committed
complete-visibility audit chain: every SDK action is HMAC-chained
Until now, createIntegrityAudit() was a separate opt-in wrapper that only
covered explicit integrity.log() calls. That meant "tamper-evident audit"
quietly excluded register(), enforce(), audit.log(), kill-switch — the actual
surface of the SDK. Fix that.

New: `createGovernance({ integrityAudit: { signingKey, onFailure } })`
- One config flag turns on HMAC-SHA256 hash chaining for EVERY audit write
  the SDK makes: agent_registered, policy_evaluation,
  policy_evaluation_preprocess/postprocess, audit.log() (any eventType),
  kill-switch events, and the new action_outcome type.
- onFailure: "allow" (default) logs via onAuditError and continues; "block"
  rejects enforce() when the chain write fails so no gaps.
- Sequence is race-free — chainLock serialises concurrent writes.
- Chain state is only committed after storage confirms the write, so a
  rejected storage call doesn't advance the chain.

New: `gov.recordOutcome(outcome)` + `runWithOutcome(gov, opts, action)`
- recordOutcome writes an action_outcome event that captures what happened
  AFTER enforce() approved the action: success/failure, duration, tokens,
  output summary (or error message).
- runWithOutcome wraps an async function and records success/failure
  automatically, with optional summarize() for output redaction before the
  outcome hits the audit log.
- Both participate in the integrity chain when integrityAudit is on.

New: `gov.integrityChain` (populated only when integrityAudit is set)
- .export(filters?) returns IntegrityAuditEvent[] joined against the
  in-memory integrity index, ready for verifyAuditIntegrity.
- .stats() returns latestSequence / latestHash / algorithm.

Tests: 10 new tests (+6 end-to-end chain coverage, +4 action-recorder).
- E2E: chains register + enforce + recordOutcome + audit.log + kill into one
  chain, exports it, and verifies with the standalone verifier.
- Tamper detection: mutating an outcome event's detail breaks verify at the
  right position.
- Fail-open and fail-closed behaviours both covered.

Total: 1348 → 1358, zero regressions, lint + build clean.

README updated with an honest "what gets chained / what does NOT" table so
readers don't assume 'every agent action' is covered when what we mean is
'every action routed through the SDK.'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
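
Reviewer sketch: the chaining rules described in the commit message (each hash covers the sequence number and event body, and chain state is committed only after storage confirms the write) can be illustrated with Node's built-in `crypto`. Everything below is an assumption made for illustration and not the SDK's implementation: the `ChainedEvent` and `ChainState` shapes, the `append` and `verify` helpers, the all-zero genesis hash, the MAC input layout, and the top-level-only key sorting.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical shapes for illustration; the SDK's real types may differ.
interface ChainedEvent {
  sequence: number;
  previousHash: string;
  body: Record<string, unknown>;
  hash: string;
}

interface ChainState {
  sequence: number;
  hash: string;
}

// Assumed starting value for the first event's previousHash.
const GENESIS = "0".repeat(64);

// Sort top-level keys so property order cannot change the MAC.
// (Nested objects are not re-sorted in this sketch.)
function canonicalise(body: Record<string, unknown>): string {
  const sorted = Object.keys(body).sort().map((k) => [k, body[k]]);
  return JSON.stringify(sorted);
}

function mac(key: string, sequence: number, previousHash: string, body: Record<string, unknown>): string {
  return createHmac("sha256", key)
    .update(`${sequence}\n${previousHash}\n${canonicalise(body)}`)
    .digest("hex");
}

// Append one event. `store` stands in for the storage adapter; the returned
// state is only produced if `store` completes without throwing.
function append(
  key: string,
  state: ChainState,
  body: Record<string, unknown>,
  store: (event: ChainedEvent) => void,
): ChainState {
  const sequence = state.sequence + 1;
  const event: ChainedEvent = {
    sequence,
    previousHash: state.hash,
    body,
    hash: mac(key, sequence, state.hash, body),
  };
  store(event); // a throw here leaves the caller's state untouched
  return { sequence, hash: event.hash };
}

// Recompute every MAC in order and compare in constant time.
function verify(key: string, chain: ChainedEvent[]): { valid: boolean; brokenAt?: number } {
  let previousHash = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    const expected = Buffer.from(mac(key, i + 1, previousHash, e.body), "hex");
    const actual = Buffer.from(e.hash, "hex");
    const ok =
      e.sequence === i + 1 &&
      e.previousHash === previousHash &&
      actual.length === expected.length &&
      timingSafeEqual(actual, expected);
    if (!ok) return { valid: false, brokenAt: i + 1 };
    previousHash = e.hash;
  }
  return { valid: true };
}
```

In this sketch a caller starts from `{ sequence: 0, hash: GENESIS }` and replaces its state with the value `append` returns. Because the new state is returned only after `store` succeeds, a rejected write leaves the next event to reuse the same sequence number rather than skipping one.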
1 parent cd59a24 commit ba467f4

6 files changed

Lines changed: 676 additions & 29 deletions

README.md

Lines changed: 49 additions & 12 deletions
@@ -254,23 +254,54 @@ sequence number + canonicalised event body**, so any edit, deletion, or
 reorder-via-sequence-renumbering breaks verification. Constant-time hash
 comparison throughout — no timing oracle.
 
-**Opt-in**, not on by default. The core `gov.enforce()` path writes audit
-events directly via your storage adapter. Wrap it with `createIntegrityAudit()`
-to start a hash-chained log, and use the standalone `verifyAuditIntegrity()`
-to re-verify an exported chain offline (e.g. on a separate auditor machine).
+**Opt-in via a single config flag.** Pass `integrityAudit: { signingKey }` to
+`createGovernance()` and every audit write the SDK makes is chained
+automatically — no separate wrapper, no ceremony:
 
 ```typescript
-import { createIntegrityAudit, verifyAuditIntegrity } from 'governance-sdk/audit-integrity';
+import { createGovernance, runWithOutcome } from 'governance-sdk';
+import { verifyAuditIntegrity } from 'governance-sdk/audit-integrity-verify';
 
-const integrity = createIntegrityAudit(gov, { signingKey: process.env.AUDIT_SECRET! });
-await integrity.log({ agentId: 'bot', eventType: 'tool_call', outcome: 'allow', severity: 'info' });
+const gov = createGovernance({
+  rules: [/* ... */],
+  integrityAudit: {
+    signingKey: process.env.AUDIT_SECRET!,
+    onFailure: 'allow', // or 'block' to fail-closed on chain errors
+  },
+});
+
+// Every one of these is HMAC-chained:
+await gov.register({ name: 'sales-bot', framework: 'mastra', owner: 'team' });
+await gov.enforce({ agentId, action: 'tool_call', tool: 'search' });
+
+// Close the decision → outcome loop with runWithOutcome():
+const result = await runWithOutcome(gov, { agentId, tool: 'search' }, async () => {
+  return await searchApi.query(q);
+});
+// ↑ success (or failure, with error + duration) auto-recorded in the chain
 
-// Anywhere with the chain snapshot + the shared secret:
-const snapshot = await integrity.export();
-const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(snapshot, process.env.AUDIT_SECRET!);
-// => { valid: false, brokenAt: 42, breakDetail: 'Hash mismatch at sequence 42: event <id> content has been modified' }
+// Verify the chain offline, anywhere, with just the secret:
+const chain = await gov.integrityChain!.export();
+const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(chain, process.env.AUDIT_SECRET!);
 ```
 
+**What gets chained (when `integrityAudit` is set):**
+
+| Event type | Written by | What it captures |
+|---|---|---|
+| `agent_registered` | `gov.register()` | name, framework, owner, initial score |
+| `policy_evaluation` | `gov.enforce()` | agent, action, tool, rule matched, outcome, reason |
+| `policy_evaluation_preprocess` / `_postprocess` | `gov.enforcePreprocess()` / `Postprocess()` | stage-scoped enforcement result |
+| `action_outcome` | `gov.recordOutcome()` or `runWithOutcome()` | success / failure, duration, tokens, output summary, error |
+| `agent_killed` | `killSwitch.kill()` | agent, reason, killedBy |
+| *(caller-supplied)* | `gov.audit.log()` | anything you pass — custom LLM calls, approvals, etc. |
+
+**What is NOT automatically chained:** anything you log directly via
+`storage.createAuditEvent()` (bypasses the chain), anything your host app
+does outside governance (raw `fetch()`, filesystem I/O without going through
+a governed tool), and anything the agent did between `enforce()` calls that
+didn't invoke `enforce()` or `recordOutcome()` itself.
+
 **Honest caveats:**
 
 - Plain HMAC chains are only tamper-evident to holders of the signing secret.
@@ -282,6 +313,10 @@ const { valid, brokenAt, breakDetail } = await verifyAuditIntegrity(snapshot, pr
   anchor — a chain of N events truncated to N-1 events still verifies as a
   consistent chain of N-1 events. The adversarial test suite documents this
   limitation explicitly.
+- `integrityAudit.onFailure: 'allow'` (default) means a storage failure
+  creates a chain gap that `verifyAuditIntegrity` will detect; set
+  `'block'` to reject the enforce() call instead when you can't tolerate
+  gaps.
 
 ### Kill Switch
 
@@ -667,7 +702,9 @@ governance-sdk/injection-classifier pluggable ML classifier interface
 governance-sdk/injection-benchmark LIB — 6.9K-sample benchmark runner
 
 # Audit + identity
-governance-sdk/audit-integrity HMAC hash-chain verification
+governance-sdk/audit-integrity HMAC hash-chain primitives (createIntegrityAudit, verifyAuditIntegrity)
+governance-sdk/audit-integrity-verify standalone chain verifier (for offline audit)
+governance-sdk/action-recorder runWithOutcome() — record action success/failure into the chain
 governance-sdk/agent-identity agent identity tokens
 governance-sdk/agent-identity-ed25519 Ed25519 signing + verification
 governance-sdk/kill-switch priority-999 emergency halt
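
Reviewer sketch: the truncation caveat in the README hunk above (a chain of N events cut to N-1 still verifies as a consistent chain) is easy to reproduce with a toy chain. The `Link` shape and the `build`, `consistent`, and `matchesAnchor` helpers are hypothetical names for illustration. Recording the chain head out of band is one possible mitigation, and nothing in this commit claims to implement it; the `latestSequence` / `latestHash` values that the commit message attributes to `gov.integrityChain.stats()` might serve as such an anchor.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical minimal link; payload is a pre-canonicalised event body.
interface Link {
  sequence: number;
  previousHash: string;
  payload: string;
  hash: string;
}

function link(key: string, sequence: number, previousHash: string, payload: string): Link {
  const hash = createHmac("sha256", key)
    .update(`${sequence}|${previousHash}|${payload}`)
    .digest("hex");
  return { sequence, previousHash, payload, hash };
}

function build(key: string, payloads: string[]): Link[] {
  const chain: Link[] = [];
  let prev = "";
  payloads.forEach((p, i) => {
    const next = link(key, i + 1, prev, p);
    chain.push(next);
    prev = next.hash;
  });
  return chain;
}

// Internal consistency only. Plain === is used for brevity; the README says
// the SDK compares hashes in constant time.
function consistent(key: string, chain: Link[]): boolean {
  let prev = "";
  return chain.every((l, i) => {
    const ok =
      l.sequence === i + 1 &&
      l.previousHash === prev &&
      l.hash === link(key, l.sequence, l.previousHash, l.payload).hash;
    prev = l.hash;
    return ok;
  });
}

// External anchor check: the exported chain must end at the recorded head.
function matchesAnchor(chain: Link[], anchor: { sequence: number; hash: string }): boolean {
  const head = chain[chain.length - 1];
  return head !== undefined && head.sequence === anchor.sequence && head.hash === anchor.hash;
}
```

Dropping the last link leaves `consistent` returning true, which is the limitation the README describes, while `matchesAnchor` rejects the shortened chain because its head no longer matches the recorded sequence and hash. Deleting a link from the middle is already caught by the consistency check alone.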

packages/governance/package.json

Lines changed: 4 additions & 0 deletions
@@ -161,6 +161,10 @@
       "types": "./dist/audit-integrity-verify.d.ts",
       "import": "./dist/audit-integrity-verify.js"
     },
+    "./action-recorder": {
+      "types": "./dist/action-recorder.d.ts",
+      "import": "./dist/action-recorder.js"
+    },
     "./plugins/mcp-trust": {
       "types": "./dist/plugins/mcp-trust.d.ts",
       "import": "./dist/plugins/mcp-trust.js"
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+import { describe, it } from "node:test";
+import assert from "node:assert/strict";
+import { createGovernance, runWithOutcome } from "./index";
+import { verifyAuditIntegrity } from "./audit-integrity-verify";
+
+describe("runWithOutcome", () => {
+  it("records a success outcome and returns the action's result", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "x", framework: "mastra", owner: "t" });
+    const result = await runWithOutcome(
+      gov,
+      { agentId: agent.id, tool: "search" },
+      async () => ({ hits: 3 }),
+    );
+    assert.deepEqual(result, { hits: 3 });
+
+    // Give the fire-and-forget outcome write a tick to land.
+    await new Promise((r) => setImmediate(r));
+
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    assert.ok(events.length >= 1);
+    const latest = events[events.length - 1];
+    assert.equal(latest.outcome, "success");
+    assert.equal((latest.detail as { tool?: string }).tool, "search");
+  });
+
+  it("records a failure outcome and re-throws the error", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "y", framework: "mastra", owner: "t" });
+    const err = new Error("timeout");
+    await assert.rejects(
+      runWithOutcome(
+        gov,
+        { agentId: agent.id, tool: "slow_api" },
+        async () => {
+          throw err;
+        },
+      ),
+      /timeout/,
+    );
+
+    await new Promise((r) => setImmediate(r));
+
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    const failure = events.find((e) => e.outcome === "failure");
+    assert.ok(failure, "failure outcome not recorded");
+    assert.equal((failure!.detail as { error?: string }).error, "timeout");
+  });
+
+  it("applies summarize() to redact output before logging", async () => {
+    const gov = createGovernance({});
+    const agent = await gov.register({ name: "z", framework: "mastra", owner: "t" });
+
+    await runWithOutcome(
+      gov,
+      {
+        agentId: agent.id,
+        tool: "sensitive_api",
+        summarize: (r) => ({ redacted: true, size: JSON.stringify(r).length }),
+      },
+      async () => ({ ssn: "123-45-6789", name: "Alice" }),
+    );
+    await new Promise((r) => setImmediate(r));
+    const events = await gov.audit.query({ eventType: "action_outcome" });
+    const latest = events[events.length - 1];
+    const output = (latest.detail as { output?: { redacted?: boolean; size?: number } }).output;
+    assert.equal(output?.redacted, true);
+    assert.ok(typeof output?.size === "number");
+  });
+
+  it("outcome event joins the integrity chain end-to-end", async () => {
+    const KEY = "wrap-test-key";
+    const gov = createGovernance({ integrityAudit: { signingKey: KEY } });
+    const agent = await gov.register({ name: "w", framework: "mastra", owner: "t" });
+
+    await runWithOutcome(
+      gov,
+      { agentId: agent.id, tool: "search" },
+      async () => ({ hits: 5 }),
+    );
+    // Await chain serialisation
+    await new Promise((r) => setTimeout(r, 10));
+
+    const chain = await gov.integrityChain!.export();
+    const hasOutcome = chain.some((e) => e.eventType === "action_outcome");
+    assert.ok(hasOutcome, "outcome event should be in the integrity chain");
+    const verified = await verifyAuditIntegrity(chain, KEY);
+    assert.equal(verified.valid, true);
+  });
+});
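
Reviewer sketch: the `summarize()` test above redacts by replacing the whole result with a size marker. An allowlist is another way to keep sensitive fields out of the audit log. The `allowFields` helper below is hypothetical and not part of the SDK; it only relies on the `(result: unknown) => unknown` signature and on the documented convention that returning `undefined` skips output capture.

```typescript
// Hypothetical helper: build a summarize() callback that copies only the
// allowlisted top-level fields of an object result.
function allowFields(...keep: string[]): (result: unknown) => unknown {
  return (result) => {
    // Primitives, null, and arrays are not summarised field by field;
    // returning undefined asks the recorder to skip output capture.
    if (typeof result !== "object" || result === null || Array.isArray(result)) {
      return undefined;
    }
    const source = result as Record<string, unknown>;
    const summary: Record<string, unknown> = {};
    for (const field of keep) {
      if (field in source) summary[field] = source[field];
    }
    return summary;
  };
}
```

Under these assumptions it would be passed as `summarize: allowFields("name")` in the options object, so a result such as `{ ssn, name }` reaches the log with only `name`. The copy is shallow, so nested objects under an allowlisted key are logged as they are.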
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+/**
+ * governance-sdk — Action outcome recorder
+ *
+ * Thin helper that wraps an action function so its success/failure/
+ * duration/output is automatically recorded via `gov.recordOutcome()`.
+ * Use in framework adapters or user code to close the loop between
+ * "enforce() approved this" and "here's what actually happened."
+ *
+ * When `gov.integrityChain` is configured, the recorded outcome is
+ * HMAC-chained alongside every other SDK audit write — so the chain
+ * covers the full decision → outcome cycle, not just decisions.
+ *
+ * @example
+ * ```ts
+ * import { runWithOutcome } from 'governance-sdk/action-recorder';
+ *
+ * const result = await runWithOutcome(gov, { agentId, tool: 'search' }, async () => {
+ *   return searchApi.query(q);
+ * });
+ * // outcome event automatically recorded (success OR failure)
+ * ```
+ */
+
+import type { GovernanceInstance, ActionOutcome } from "./index.js";
+
+export interface RunWithOutcomeOptions {
+  agentId: string;
+  tool?: string;
+  action?: string;
+  policyRuleId?: string;
+  /**
+   * Optional transform for the successful result — lets you redact output
+   * before it hits the audit log. Return `undefined` to skip output capture.
+   */
+  summarize?: (result: unknown) => unknown;
+  /** Optional extra fields to include in the outcome detail. */
+  detail?: Record<string, unknown>;
+  /** Token count if this was an LLM call. */
+  tokensUsed?: number;
+}
+
+/**
+ * Run an async action and record its outcome to the governance audit chain.
+ * Re-throws the underlying error after recording, so failures propagate
+ * exactly as they would without this wrapper.
+ */
+export async function runWithOutcome<T>(
+  governance: GovernanceInstance,
+  options: RunWithOutcomeOptions,
+  action: () => Promise<T>,
+): Promise<T> {
+  const startedAt = Date.now();
+  try {
+    const result = await action();
+    const durationMs = Date.now() - startedAt;
+    const summary = options.summarize ? options.summarize(result) : result;
+    const outcome: ActionOutcome = {
+      agentId: options.agentId,
+      tool: options.tool,
+      action: options.action,
+      success: true,
+      durationMs,
+      output: summary,
+      policyRuleId: options.policyRuleId,
+      tokensUsed: options.tokensUsed,
+      detail: options.detail,
+    };
+    // Fire-and-forget: we don't want outcome logging failures to corrupt
+    // the user's successful action. If integrityAudit.onFailure is set to
+    // "block" the call will still throw inside recordOutcome — callers
+    // concerned about that can await recordOutcome directly instead of
+    // using this helper.
+    governance.recordOutcome(outcome).catch(() => { /* swallowed */ });
+    return result;
+  } catch (err) {
+    const durationMs = Date.now() - startedAt;
+    const message = err instanceof Error ? err.message : String(err);
+    const outcome: ActionOutcome = {
+      agentId: options.agentId,
+      tool: options.tool,
+      action: options.action,
+      success: false,
+      durationMs,
+      error: message,
+      policyRuleId: options.policyRuleId,
+      tokensUsed: options.tokensUsed,
+      detail: options.detail,
+    };
+    governance.recordOutcome(outcome).catch(() => { /* swallowed */ });
+    throw err;
+  }
+}
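
Reviewer sketch: the fire-and-forget comment in the helper above points callers who need to see chain-write failures toward awaiting `recordOutcome` directly. One way that could look is below. `runWithOutcomeStrict`, `OutcomeLike`, and `OutcomeRecorder` are hypothetical stand-ins so the example compiles on its own; the SDK's `GovernanceInstance` and `ActionOutcome` carry more fields, and the only behaviour assumed of `recordOutcome` is that it returns a promise.

```typescript
// Hypothetical minimal shapes, not the SDK's exported types.
interface OutcomeLike {
  agentId: string;
  tool?: string;
  success: boolean;
  durationMs: number;
  output?: unknown;
  error?: string;
}

interface OutcomeRecorder {
  recordOutcome(outcome: OutcomeLike): Promise<unknown>;
}

// Strict variant: awaits the audit write, so a rejected recordOutcome()
// on the success path reaches the caller instead of being swallowed.
async function runWithOutcomeStrict<T>(
  recorder: OutcomeRecorder,
  options: { agentId: string; tool?: string },
  action: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  let result: T;
  try {
    result = await action();
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    // On the failure path the action's own error takes priority, so a
    // failed audit write is ignored here rather than masking it.
    await recorder
      .recordOutcome({ ...options, success: false, durationMs: Date.now() - startedAt, error })
      .catch(() => undefined);
    throw err;
  }
  // Outside the try block: a recording failure is not mistaken for an
  // action failure and is not recorded a second time.
  await recorder.recordOutcome({
    ...options,
    success: true,
    durationMs: Date.now() - startedAt,
    output: result,
  });
  return result;
}
```

The trade-off is the one the original comment names: with this variant a successful action can still reject if the audit write fails, which is the point when gaps are unacceptable but would be surprising as the default behaviour.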
