Skip to content
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# Issue #444 — Review Integrity + Merge Guard (Implementation Plan)

This document is the implementation companion for GitHub issue #444:
"SPEC: Built-in Review Integrity + Merge Guard (Single-Shot Delivery)".

## Objective

Ship default-on review integrity and hard merge guard behavior for AO-managed PRs, with an auditable resolution workflow and explicit merge blockers.

## Scope (Single-Shot)

- Add review integrity domain contracts and evaluation engine in `@composio/ao-core`.
- Extend GitHub SCM plugin to fetch review thread snapshots and publish required check-runs.
- Add web APIs for propose -> verify -> apply resolution workflow.
- Enforce merge guard in `POST /api/prs/[id]/merge` with machine-readable blockers.
- Surface integrity/guard status in dashboard PR/session views.

## Domain Model

### ReviewThreadSnapshot

- `prNumber: number`
- `threadId: string`
- `source: "human" | "bugbot" | "other"`
- `path?: string`
- `bodyHash: string`
- `severity: "high" | "medium" | "low" | "unknown"`
- `status: "open" | "resolved"`
- `capturedAt: Date`

### ResolutionRecord

- `prNumber: number`
- `threadId: string`
- `resolutionType: "fixed" | "already_fixed" | "not_actionable" | "duplicate"`
- `actorType: "agent" | "human"`
- `actorId: string`
- `fixCommitSha?: string`
- `evidence: { changedFiles: string[]; testCommands: string[]; testResults: string[] }`
- `rationale?: string`
- `verificationStatus: "pending" | "pass" | "fail"`
- `createdAt: Date`

## Policy Rules

- `fixed` requires reachable `fixCommitSha` and non-empty verification evidence.
- `already_fixed` requires referenced commit that predates resolution action.
- `not_actionable` and `duplicate` require non-empty rationale.
- Any unresolved thread or unverified resolved thread yields review integrity failure.

## Merge Guard Contract

`allowMerge = true` only if all are true:

- Review integrity status is pass.
- Unresolved thread count is zero.
- Every resolved thread has verified resolution evidence/rationale.
- Required CI checks are passing.

Otherwise:

- `ao/merge-guard` must report failure.
- Merge API must return `422` with structured blockers.

## API Additions

- `GET /api/prs/[id]/review-threads`
- `POST /api/prs/[id]/review-resolutions`
- `POST /api/prs/[id]/review-resolutions/verify`
- `POST /api/prs/[id]/review-resolutions/apply`
- `POST /api/prs/[id]/merge` (extended guard enforcement)

## Persistence and Auditability

Use append-only key-value records in AO project data directory (same style as feedback reports) to store immutable resolution actions and verification transitions.

## Rollout Defaults

- `reviewIntegrity.enabled = true`
- `reviewIntegrity.mode = "enforce"`
- `reviewIntegrity.requireEvidenceForBotThreads = true`
- `mergeGuard.enabled = true`
- `mergeGuard.mode = "enforce"`
- `mergeGuard.requiredChecks = ["review-integrity", "ao/merge-guard"]`
- `mergeGuard.reverifyOnNewCommits = true`
- `mergeGuard.allowBypass = false`

## Test Matrix

### Unit

- Rule matrix for review integrity evaluator.
- Decision matrix for merge guard evaluator.

### Integration

- Propose/verify/apply API flow validation.
- Merge API blocker behavior (`422`) with machine-readable errors.
- Check-run publication behavior for `review-integrity` and `ao/merge-guard`.

### E2E

- Resolve without evidence is blocked.
- Valid fix with green checks is mergeable.
- New commit or new review activity invalidates previous verification.

## Delivery Notes

- No warn-only or observe-only mode in this delivery.
- No bypass path for AO-managed merges.
- This plan exists to anchor implementation sequencing and review discussion in PR.
130 changes: 130 additions & 0 deletions packages/core/src/__tests__/review-integrity.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
import { describe, expect, it } from "vitest";
import {
createResolutionRecord,
evaluateMergeGuard,
evaluateReviewIntegrity,
type ReviewThreadSnapshot,
type ResolutionRecord,
} from "../review-integrity.js";

function thread(overrides: Partial<ReviewThreadSnapshot>): ReviewThreadSnapshot {
return {
prNumber: 42,
threadId: "THR_1",
source: "human",
bodyHash: "abc123",
severity: "medium",
status: "resolved",
capturedAt: new Date("2026-03-12T00:00:00Z"),
...overrides,
};
}

function verifiedRecord(overrides: Partial<ResolutionRecord> = {}): ResolutionRecord {
return {
id: "resolution_1",
createdAt: new Date("2026-03-12T00:00:00Z"),
...createResolutionRecord({
prNumber: 42,
threadId: "THR_1",
resolutionType: "fixed",
actorType: "agent",
actorId: "ao",
fixCommitSha: "abc",
evidence: {
changedFiles: ["src/a.ts"],
testCommands: ["pnpm test"],
testResults: ["pass"],
},
}),
verificationStatus: "pass",
...overrides,
};
}

describe("review integrity evaluator", () => {
it("fails when unresolved threads exist", () => {
const result = evaluateReviewIntegrity([thread({ status: "open" })], new Map());
expect(result.status).toBe("fail");
expect(result.unresolvedThreadCount).toBe(1);
expect(result.blockers[0]?.code).toBe("THREAD_UNRESOLVED");
});

it("fails when resolved thread has no resolution record", () => {
const result = evaluateReviewIntegrity([thread({ status: "resolved" })], new Map());
expect(result.status).toBe("fail");
expect(result.unverifiedResolvedThreadCount).toBe(1);
expect(result.blockers[0]?.code).toBe("MISSING_RESOLUTION");
});

it("fails when fixed record has no test evidence", () => {
const record = verifiedRecord({
evidence: {
changedFiles: ["src/a.ts"],
testCommands: ["pnpm test"],
testResults: [],
},
});
const records = new Map([["THR_1", record]]);
const result = evaluateReviewIntegrity([thread({ status: "resolved" })], records);
expect(result.status).toBe("fail");
expect(result.blockers.some((b) => b.code === "INVALID_RESOLUTION")).toBe(true);
});

it("passes when all resolved threads are verified and valid", () => {
const records = new Map([["THR_1", verifiedRecord()]]);
const result = evaluateReviewIntegrity([thread({ status: "resolved" })], records);
expect(result.status).toBe("pass");
expect(result.blockers).toHaveLength(0);
});

it("fails on verification drift when head sha changed", () => {
const records = new Map([
[
"THR_1",
verifiedRecord({
verifiedHeadSha: "oldsha",
}),
],
]);
const result = evaluateReviewIntegrity([thread({ status: "resolved" })], records, {
currentHeadSha: "newsha",
});
expect(result.status).toBe("fail");
expect(result.blockers.some((b) => b.code === "VERIFICATION_DRIFT")).toBe(true);
});
});

describe("merge guard evaluator", () => {
it("fails when required checks are missing or failing", () => {
const integrity = evaluateReviewIntegrity([thread({ status: "resolved" })], new Map());
const result = evaluateMergeGuard({
integrity,
requiredChecks: ["review-integrity", "ao/merge-guard", "test"],
checkConclusions: new Map([
["review-integrity", "failed"],
["ao/merge-guard", "passed"],
]),
});
expect(result.allowMerge).toBe(false);
expect(result.blockers.some((b) => b.code === "REQUIRED_CHECK_NOT_PASSING")).toBe(true);
expect(result.blockers.some((b) => b.code === "REQUIRED_CHECK_MISSING")).toBe(true);
});

it("passes when integrity passes and required checks pass", () => {
const integrity = evaluateReviewIntegrity(
[thread({ status: "resolved" })],
new Map([["THR_1", verifiedRecord()]]),
);
const result = evaluateMergeGuard({
integrity,
requiredChecks: ["review-integrity", "ao/merge-guard"],
checkConclusions: new Map([
["review-integrity", "passed"],
["ao/merge-guard", "passed"],
]),
});
expect(result.allowMerge).toBe(true);
expect(result.blockers).toHaveLength(0);
});
});
25 changes: 25 additions & 0 deletions packages/core/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,7 @@ export {
getSessionsDir,
getWorktreesDir,
getFeedbackReportsDir,
getReviewIntegrityDir,
getObservabilityBaseDir,
getArchiveDir,
getOriginFilePath,
Expand All @@ -156,6 +157,30 @@ export {
validateAndStoreOrigin,
} from "./paths.js";

export {
ReviewResolutionStore,
createResolutionRecord,
validateResolutionRecord,
evaluateReviewIntegrity,
evaluateMergeGuard,
} from "./review-integrity.js";
export type {
ReviewThreadSource,
ReviewThreadSeverity,
ReviewThreadStatus,
ReviewThreadSnapshot,
ResolutionType,
ResolutionActorType,
VerificationStatus,
ResolutionEvidence,
ResolutionRecord,
ReviewIntegrityEvaluation,
MergeGuardEvaluation,
IntegrityBlocker,
MergeGuardInput,
ResolutionValidationOptions,
} from "./review-integrity.js";

// Config generator — auto-generate config from repo URL
export {
isRepoUrl,
Expand Down
4 changes: 4 additions & 0 deletions packages/core/src/paths.ts
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,10 @@ export function getFeedbackReportsDir(configPath: string, projectPath: string):
return join(getProjectBaseDir(configPath, projectPath), "feedback-reports");
}

export function getReviewIntegrityDir(configPath: string, projectPath: string): string {
return join(getProjectBaseDir(configPath, projectPath), "review-integrity");
}

/**
* Get the archive directory for a project.
* Format: ~/.agent-orchestrator/{hash}-{projectId}/archive
Expand Down
Loading
Loading