Skip to content

Commit bcc5b7a

Browse files
feat: load review methodology from factory-mono builtin skill
Load the review skill's shared methodology from factory-mono's builtin-skills/review/SKILL.md instead of keeping it inline in the CI prompt templates. The skill is loaded at runtime via local plugin cache or GitHub fallback. The shared methodology (bug patterns, reporting gate, confidence calibration, deduplication, analysis discipline) is extracted via BEGIN_SHARED_METHODOLOGY / END_SHARED_METHODOLOGY markers and injected into both candidate and validator prompts. Suggestion block rules remain controlled by the include_suggestions toggle at the CI template level. Depends on: Factory-AI/factory-mono#11498 Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
1 parent db4bdca commit bcc5b7a

File tree

11 files changed

+304
-26
lines changed

11 files changed

+304
-26
lines changed

src/create-prompt/index.ts

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -305,6 +305,7 @@ export type PromptCreationOptions = {
305305
reviewArtifacts?: ReviewArtifacts;
306306
outputFilePath?: string;
307307
includeSuggestions?: boolean;
308+
reviewSkillContent?: string;
308309
};
309310

310311
export async function createPrompt({
@@ -320,6 +321,7 @@ export async function createPrompt({
320321
reviewArtifacts,
321322
outputFilePath,
322323
includeSuggestions,
324+
reviewSkillContent,
323325
}: PromptCreationOptions) {
324326
try {
325327
const droidCommentId = commentId?.toString();
@@ -340,6 +342,10 @@ export async function createPrompt({
340342
preparedContext.includeSuggestions = includeSuggestions;
341343
}
342344

345+
if (reviewSkillContent) {
346+
preparedContext.reviewSkillContent = reviewSkillContent;
347+
}
348+
343349
await mkdir(`${process.env.RUNNER_TEMP || "/tmp"}/droid-prompts`, {
344350
recursive: true,
345351
});

src/create-prompt/templates/review-candidates-prompt.ts

Lines changed: 2 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
import { formatSkillSection } from "../../utils/load-skill";
12
import type { PreparedContext } from "../types";
23

34
export function generateReviewCandidatesPrompt(
@@ -52,7 +53,7 @@ export function generateReviewCandidatesPrompt(
5253
return `You are a senior staff software engineer and expert code reviewer.
5354
5455
Your task: Review PR #${prNumber} in ${repoFullName} and generate a JSON file with **high-confidence, actionable** review comments that pinpoint genuine issues.
55-
56+
${formatSkillSection(context.reviewSkillContent)}
5657
<context>
5758
Repo: ${repoFullName}
5859
PR Number: ${prNumber}
@@ -76,19 +77,7 @@ Precomputed data files:
7677
<review_guidelines>
7778
- You are currently checked out to the PR branch.
7879
- Review ALL modified files in the PR branch.
79-
- Focus on: functional correctness, syntax errors, logic bugs, broken dependencies/contracts/tests, security issues, and performance problems.
80-
- High-signal bug patterns to actively check for (only comment when evidenced in the diff):
81-
- Null/undefined/Optional dereferences; missing-key errors on untrusted/external dict/JSON payloads
82-
- Resource leaks (unclosed files/streams/connections; missing cleanup on error paths)
83-
- Injection vulnerabilities (SQL injection, XSS, command/template injection) and auth/security invariant violations
84-
- OAuth/CSRF invariants: state must be per-flow unpredictable and validated; avoid deterministic/predictable state or missing state checks
85-
- Concurrency/race/atomicity hazards (TOCTOU, lost updates, unsafe shared state, process/thread lifecycle bugs)
86-
- Missing error handling for critical operations (network, persistence, auth, migrations, external APIs)
87-
- Wrong-variable/shadowing mistakes; contract mismatches (serializer/validated_data, interfaces/abstract methods)
88-
- Type-assumption bugs (e.g., numeric ops on datetime/strings, ordering key type mismatches)
89-
- Offset/cursor/pagination semantic mismatches (off-by-one, prev/next behavior, commit semantics)
9080
- Do NOT duplicate comments already in \`${commentsPath}\`.
91-
- Only flag issues you are confident about—avoid speculative or stylistic nitpicks.
9281
</review_guidelines>
9382
9483
<triage_phase>

src/create-prompt/templates/review-validator-prompt.ts

Lines changed: 5 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
import { formatSkillSection } from "../../utils/load-skill";
12
import type { PreparedContext } from "../types";
23

34
export function generateReviewValidatorPrompt(
@@ -43,7 +44,7 @@ export function generateReviewValidatorPrompt(
4344
return `You are validating candidate review comments for PR #${prNumber} in ${repoFullName}.
4445
4546
IMPORTANT: This is Phase 2 (validator) of a two-pass review pipeline.
46-
47+
${formatSkillSection(context.reviewSkillContent)}
4748
### Context
4849
4950
* Repo: ${repoFullName}
@@ -101,20 +102,11 @@ Read:
101102
102103
## Phase 2: Validate candidates
103104
104-
Apply the same Reporting Gate as review:
105-
106-
### Approve ONLY if at least one is true
107-
* Definite runtime failure
108-
* Incorrect logic with a concrete trigger path and wrong outcome
109-
* Security vulnerability with realistic exploit
110-
* Data corruption/loss
111-
* Breaking contract change (discoverable in code/tests)
105+
Apply the Reporting Gate, confidence calibration, and deduplication rules from the review methodology above.
112106
113-
Reject if:
114-
* It's speculative / "might" without a concrete trigger
115-
* It's stylistic / naming / formatting
107+
Additionally reject if:
116108
* It's not anchored to a valid changed line
117-
* It's already reported (dedupe against existing comments)
109+
* It's already reported (dedupe against existing comments in \`${commentsPath}\`)
118110
119111
### Deduplication (STRICT)
120112

src/create-prompt/types.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -119,4 +119,5 @@ export type PreparedContext = CommonFields & {
119119
reviewArtifacts?: ReviewArtifacts;
120120
outputFilePath?: string;
121121
includeSuggestions?: boolean;
122+
reviewSkillContent?: string;
122123
};

src/entrypoints/generate-review-prompt.ts

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@ import { prepareMcpTools } from "../mcp/install-mcp-server";
1515
import { generateReviewCandidatesPrompt } from "../create-prompt/templates/review-candidates-prompt";
1616
import { generateSecurityReviewPrompt } from "../create-prompt/templates/security-review-prompt";
1717
import { normalizeDroidArgs, parseAllowedTools } from "../utils/parse-tools";
18+
import { loadSkill } from "../utils/load-skill";
1819

1920
async function run() {
2021
try {
@@ -98,6 +99,9 @@ async function run() {
9899
const outputFilePath = process.env.DROID_OUTPUT_FILE || undefined;
99100
const includeSuggestions = process.env.INCLUDE_SUGGESTIONS !== "false";
100101

102+
const reviewSkillContent =
103+
reviewType === "code" ? await loadSkill("review") : undefined;
104+
101105
await createPrompt({
102106
githubContext: context,
103107
commentId,
@@ -110,6 +114,7 @@ async function run() {
110114
reviewArtifacts,
111115
outputFilePath,
112116
includeSuggestions,
117+
reviewSkillContent,
113118
});
114119

115120
// Set run type

src/tag/commands/review-validator.ts

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ import { prepareMcpTools } from "../../mcp/install-mcp-server";
99
import { normalizeDroidArgs, parseAllowedTools } from "../../utils/parse-tools";
1010
import type { PrepareResult } from "../../prepare/types";
1111
import { generateReviewValidatorPrompt } from "../../create-prompt/templates/review-validator-prompt";
12+
import { loadSkill } from "../../utils/load-skill";
1213

1314
export async function prepareReviewValidatorMode({
1415
context,
@@ -46,6 +47,7 @@ export async function prepareReviewValidatorMode({
4647
};
4748

4849
const includeSuggestions = process.env.INCLUDE_SUGGESTIONS !== "false";
50+
const reviewSkillContent = await loadSkill("review");
4951

5052
await createPrompt({
5153
githubContext: context,
@@ -59,6 +61,7 @@ export async function prepareReviewValidatorMode({
5961
generatePrompt: generateReviewValidatorPrompt,
6062
reviewArtifacts,
6163
includeSuggestions,
64+
reviewSkillContent,
6265
});
6366

6467
core.exportVariable("DROID_EXEC_RUN_TYPE", "droid-review");

src/tag/commands/review.ts

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ import { createInitialComment } from "../../github/operations/comments/create-in
99
import { normalizeDroidArgs, parseAllowedTools } from "../../utils/parse-tools";
1010
import { isEntityContext } from "../../github/context";
1111
import { generateReviewCandidatesPrompt } from "../../create-prompt/templates/review-candidates-prompt";
12+
import { loadSkill } from "../../utils/load-skill";
1213
import type { Octokits } from "../../github/api/client";
1314
import type { PrepareResult } from "../../prepare/types";
1415

@@ -85,6 +86,7 @@ export async function prepareReviewMode({
8586
});
8687

8788
const includeSuggestions = process.env.INCLUDE_SUGGESTIONS !== "false";
89+
const reviewSkillContent = await loadSkill("review");
8890

8991
await createPrompt({
9092
githubContext: context,
@@ -98,6 +100,7 @@ export async function prepareReviewMode({
98100
generatePrompt: generateReviewCandidatesPrompt,
99101
reviewArtifacts,
100102
includeSuggestions,
103+
reviewSkillContent,
101104
});
102105
core.exportVariable("DROID_EXEC_RUN_TYPE", "droid-review");
103106

src/utils/load-skill.ts

Lines changed: 114 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,114 @@
1+
import { readFile, readdir } from "fs/promises";
2+
import { homedir } from "os";
3+
import { resolve, join } from "path";
4+
5+
const MONO_REPO = "Factory-AI/factory-mono";
6+
const MONO_BRANCH = "feat/review-builtin-skill";
7+
8+
const SHARED_BEGIN = "<!-- BEGIN_SHARED_METHODOLOGY -->";
9+
const SHARED_END = "<!-- END_SHARED_METHODOLOGY -->";
10+
11+
/**
12+
* Format skill content for inclusion in a CI prompt.
13+
* Extracts only the shared methodology (between markers) so CI-specific
14+
* instructions in the template remain authoritative for execution behavior.
15+
*/
16+
export function formatSkillSection(skillContent: string | undefined): string {
17+
if (!skillContent) return "";
18+
const methodology = extractSharedMethodology(skillContent);
19+
return `
20+
<code_review_methodology>
21+
${methodology}
22+
</code_review_methodology>
23+
`;
24+
}
25+
26+
/**
27+
* Extract the shared methodology section from a skill's content.
28+
* Looks for BEGIN_SHARED_METHODOLOGY / END_SHARED_METHODOLOGY markers.
29+
* Returns the full content if markers are not found.
30+
*/
31+
export function extractSharedMethodology(content: string): string {
32+
const beginIdx = content.indexOf(SHARED_BEGIN);
33+
const endIdx = content.indexOf(SHARED_END);
34+
if (beginIdx === -1 || endIdx === -1 || endIdx <= beginIdx) {
35+
return content;
36+
}
37+
return content.slice(beginIdx + SHARED_BEGIN.length, endIdx).trim();
38+
}
39+
40+
/**
41+
* Load a skill from the local core plugin cache.
42+
* The Droid CLI installs the core plugin to:
43+
* ~/.factory/plugins/cache/factory-plugins/core/<hash>/skills/<name>/SKILL.md
44+
*/
45+
async function loadSkillFromCache(
46+
skillName: string,
47+
): Promise<string | undefined> {
48+
const home = process.env.HOME || homedir();
49+
const cacheDir = resolve(home, ".factory/plugins/cache/factory-plugins/core");
50+
51+
let entries: string[];
52+
try {
53+
entries = await readdir(cacheDir);
54+
} catch {
55+
return undefined;
56+
}
57+
58+
for (const hash of entries) {
59+
const skillPath = join(cacheDir, hash, "skills", skillName, "SKILL.md");
60+
try {
61+
const content = await readFile(skillPath, "utf8");
62+
const trimmed = content.trim();
63+
if (!trimmed) continue;
64+
console.log(
65+
`Loaded skill ${skillName} from ${skillPath} (${trimmed.length} bytes)`,
66+
);
67+
return trimmed;
68+
} catch {
69+
continue;
70+
}
71+
}
72+
73+
return undefined;
74+
}
75+
76+
/**
77+
* Fetch a skill from the factory-mono GitHub repo.
78+
* Used as fallback when the local plugin cache is not available (e.g. CI).
79+
*/
80+
async function loadSkillFromGitHub(
81+
skillName: string,
82+
): Promise<string | undefined> {
83+
const url = `https://raw.githubusercontent.com/${MONO_REPO}/${MONO_BRANCH}/apps/cli/builtin-skills/${skillName}/SKILL.md`;
84+
try {
85+
const response = await fetch(url);
86+
if (!response.ok) return undefined;
87+
const content = await response.text();
88+
const trimmed = content.trim();
89+
if (!trimmed) return undefined;
90+
console.log(
91+
`Loaded skill ${skillName} from GitHub (${trimmed.length} bytes)`,
92+
);
93+
return trimmed;
94+
} catch {
95+
return undefined;
96+
}
97+
}
98+
99+
/**
100+
* Load a skill by name. Tries the local plugin cache first,
101+
* then falls back to fetching from the factory-mono GitHub repo.
102+
* Throws if the skill cannot be loaded from either source.
103+
*/
104+
export async function loadSkill(skillName: string): Promise<string> {
105+
const cached = await loadSkillFromCache(skillName);
106+
if (cached) return cached;
107+
108+
const remote = await loadSkillFromGitHub(skillName);
109+
if (remote) return remote;
110+
111+
throw new Error(
112+
`Required skill "${skillName}" not found in local plugin cache or on GitHub (${MONO_REPO}).`,
113+
);
114+
}

test/integration/review-flow.test.ts

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ import * as createInitial from "../../src/github/operations/comments/create-init
88
import * as mcpInstaller from "../../src/mcp/install-mcp-server";
99
import * as actorValidation from "../../src/github/validation/actor";
1010
import * as promptModule from "../../src/create-prompt";
11+
import * as skillLoader from "../../src/utils/load-skill";
1112
import * as core from "@actions/core";
1213
import * as childProcess from "node:child_process";
1314

@@ -23,6 +24,7 @@ describe("review command integration", () => {
2324
let exportVarSpy: ReturnType<typeof spyOn>;
2425
let promptSpy: ReturnType<typeof spyOn>;
2526
let execSyncSpy: ReturnType<typeof spyOn>;
27+
let loadSkillSpy: ReturnType<typeof spyOn>;
2628

2729
beforeEach(async () => {
2830
tmpDir = await mkdtemp(path.join(os.tmpdir(), "review-int-"));
@@ -39,6 +41,9 @@ describe("review command integration", () => {
3941
promptSpy = spyOn(promptModule, "createPrompt").mockResolvedValue();
4042
setOutputSpy = spyOn(core, "setOutput").mockImplementation(() => {});
4143
exportVarSpy = spyOn(core, "exportVariable").mockImplementation(() => {});
44+
loadSkillSpy = spyOn(skillLoader, "loadSkill").mockResolvedValue(
45+
"mock skill content",
46+
);
4247

4348
execSyncSpy = spyOn(childProcess, "execSync").mockImplementation(((
4449
cmd: string,
@@ -60,6 +65,7 @@ describe("review command integration", () => {
6065
setOutputSpy.mockRestore();
6166
exportVarSpy.mockRestore();
6267
execSyncSpy.mockRestore();
68+
loadSkillSpy.mockRestore();
6369

6470
if (process.env.RUNNER_TEMP) {
6571
await rm(process.env.RUNNER_TEMP, { recursive: true, force: true });

test/modes/tag/review-command.test.ts

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ import { createMockContext } from "../../mockContext";
66
import * as promptModule from "../../../src/create-prompt";
77
import * as mcpInstaller from "../../../src/mcp/install-mcp-server";
88
import * as comments from "../../../src/github/operations/comments/create-initial";
9+
import * as skillLoader from "../../../src/utils/load-skill";
910
import * as childProcess from "child_process";
1011
import * as fsPromises from "fs/promises";
1112

@@ -39,12 +40,16 @@ describe("prepareReviewMode", () => {
3940
let execSyncSpy: ReturnType<typeof spyOn>;
4041
let writeFileSpy: ReturnType<typeof spyOn>;
4142
let mkdirSpy: ReturnType<typeof spyOn>;
43+
let loadSkillSpy: ReturnType<typeof spyOn>;
4244

4345
beforeEach(() => {
4446
process.env.DROID_ARGS = "";
4547
delete process.env.REVIEW_MODEL;
4648
process.env.RUNNER_TEMP = "/tmp/test-runner";
4749

50+
loadSkillSpy = spyOn(skillLoader, "loadSkill").mockResolvedValue(
51+
"mock skill content",
52+
);
4853
promptSpy = spyOn(promptModule, "createPrompt").mockResolvedValue();
4954
mcpSpy = spyOn(mcpInstaller, "prepareMcpTools").mockResolvedValue(
5055
"mock-config",
@@ -84,6 +89,7 @@ describe("prepareReviewMode", () => {
8489
execSyncSpy.mockRestore();
8590
writeFileSpy.mockRestore();
8691
mkdirSpy.mockRestore();
92+
loadSkillSpy.mockRestore();
8793
process.env.DROID_ARGS = originalArgs;
8894
if (originalReviewModel !== undefined) {
8995
process.env.REVIEW_MODEL = originalReviewModel;

0 commit comments

Comments
 (0)