Add GitHub fallback for skill loading, rebase on dev
- load-skill.ts: try local plugin cache first, then fetch from GitHub
- Re-wire skill loading after rebase (review-prompt.ts removed by PR #59)
- Inject skill into candidates and validator prompt templates
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
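
The local-first, GitHub-fallback loading and the prompt injection described in the commit message could look roughly like the sketch below. It is not the code from `load-skill.ts`: `loadSkillWithFallback`, the `SkillSource` type, and the `<review_methodology>` wrapper tag are hypothetical names chosen for illustration, and the body of `formatSkillSection` is an assumption (only its name and call site appear in the diff). Sources are injected as functions so the sketch stays free of file-system and network calls.

```typescript
// A source is any async function that yields skill text or throws.
// In practice one source would read the local plugin cache and another
// would fetch the file from GitHub; both are assumptions here.
type SkillSource = () => Promise<string>;

// Try each source in order and return the first non-empty result.
// Returns undefined when every source fails, so callers can omit the skill.
export async function loadSkillWithFallback(
  sources: SkillSource[],
): Promise<string | undefined> {
  for (const source of sources) {
    try {
      const content = (await source()).trim();
      if (content.length > 0) return content;
    } catch (_err) {
      // Cache miss or network error: fall through to the next source.
    }
  }
  return undefined;
}

// Assumed behavior: produce an empty string when no skill was loaded, so the
// template interpolation `${formatSkillSection(...)}` is safe either way.
export function formatSkillSection(skill: string | undefined): string {
  if (!skill) return "";
  return `<review_methodology>\n${skill}\n</review_methodology>\n`;
}
```

A caller would pass the local-cache reader first and the GitHub fetcher second, then store the result in `context.reviewSkillContent` before the prompt templates are rendered.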
README.md: 1 addition & 0 deletions
@@ -182,6 +182,7 @@ Additional checks for this codebase:
 ```
 
 These guidelines are automatically loaded and injected into all review prompts (code review, security review, and validation passes). No workflow changes needed.
+
 ## Security Skills
 
 The security review uses specialized Factory skills installed from the public `Factory-AI/skills` repository:
@@ -30,7 +31,7 @@ export function generateReviewCandidatesPrompt(
 return`You are a senior staff software engineer and expert code reviewer.
 
 Your task: Review PR #${prNumber} in ${repoFullName} and generate a JSON file with **high-confidence, actionable** review comments that pinpoint genuine issues.
-
+${formatSkillSection(context.reviewSkillContent)}
 <context>
 Repo: ${repoFullName}
 PR Number: ${prNumber}
@@ -54,20 +55,7 @@ Precomputed data files:
 <review_guidelines>
 - You are currently checked out to the PR branch.
 - Review ALL modified files in the PR branch.
-- Focus on: functional correctness, syntax errors, logic bugs, broken dependencies/contracts/tests, security issues, and performance problems.
-- High-signal bug patterns to actively check for (only comment when evidenced in the diff):
-- Null/undefined/Optional dereferences; missing-key errors on untrusted/external dict/JSON payloads
-- Resource leaks (unclosed files/streams/connections; missing cleanup on error paths)
- Do NOT duplicate comments already in \`${commentsPath}\`.
-- Only flag issues you are confident about—avoid speculative or stylistic nitpicks.
-- **Confidence calibration:** For each finding, honestly assess how certain you are. Mark findings as P0 only if you are virtually certain of a crash/exploit. Mark as P1 for high-confidence correctness/security issues. Use P2 for findings where the bug is plausible but you cannot fully verify the trigger path from the available context. This severity rating will be used downstream for filtering.
 </review_guidelines>
 
 <triage_phase>
@@ -194,21 +182,9 @@ Write output to \`${reviewCandidatesPath}\` using this exact schema:
@@ -33,7 +34,7 @@ export function generateReviewValidatorPrompt(
 return`You are validating candidate review comments for PR #${prNumber} in ${repoFullName}.
 
 IMPORTANT: This is Phase 2 (validator) of a two-pass review pipeline.
-
+${formatSkillSection(context.reviewSkillContent)}
 ### Context
 
 * Repo: ${repoFullName}
@@ -91,44 +92,11 @@ Read:
 
 ## Phase 2: Validate candidates
 
-Apply the same Reporting Gate as review:
-
-### Approve ONLY if at least one is true
-* Definite runtime failure
-* Incorrect logic with a concrete trigger path and wrong outcome
-* Security vulnerability with realistic exploit
-* Data corruption/loss
-* Breaking contract change (discoverable in code/tests)
+Apply the Reporting Gate, confidence calibration, and deduplication rules from the review methodology above.
 
-Reject if ANY of these are true:
-* It's speculative / "might" without a concrete trigger
-* It's stylistic / naming / formatting
+Additionally reject if:
 * It's not anchored to a valid changed line
-* It's already reported (dedupe against existing comments)
-* The anchor (path/side/line/startLine) would need to change to make the suggestion work — reject instead
-* It flags missing error handling / try-catch for a code path that won't crash in practice (e.g., the caller already handles the error, or the input is validated upstream)
-* It describes a hypothetical race condition or timing issue without identifying the specific concurrent access pattern that triggers it
-* It's about code that appears in the diff but is not part of the PR's primary change — e.g., adjacent functions, unrelated files in a multi-subsystem PR, or code from a different PR's changes that happen to be visible in context
-
-### Confidence-based filtering
-
-Pay attention to the candidate's priority level:
-- **P0 findings**: Approve if the trigger path checks out. These should be definite crashes/exploits.
-- **P1 findings**: Approve if you can verify the logic error or security issue is real.
-- **P2 findings**: Reject by default. Only approve a P2 finding if ALL of these are true: (1) you can independently verify the bug exists by examining the code, (2) the bug has a concrete trigger that a user or caller could realistically hit, and (3) the finding is NOT about edge cases, defensive coding, or style. When in doubt about a P2, reject it.
-
-### Deduplication (STRICT)
-
-Before approving a candidate, check for duplicates:
-1. **Among candidates**: If two or more candidates describe the same underlying bug (same root cause, even if anchored to different lines or worded differently), approve only the ONE with the best anchor and clearest explanation. Reject the rest with reason "duplicate of candidate N".
-2. **Against existing comments**: If a candidate repeats an issue already covered by an existing PR comment (from \`${commentsPath}\`), reject it with reason "already reported in existing comments".
-3. Same file + overlapping line range + same issue = duplicate, even if the body text differs.
-
-Suggestion block rules (minimal):
-* Preserve exact leading whitespace and keep blocks ≤ 100 lines
-* Use RIGHT-side anchors only; do not include removed/LEFT-side lines
-* For insert-only suggestions, repeat the anchor line unchanged, then append new lines
-* Do not change the anchor fields (path/side/line/startLine) from the candidate — only edit the body
+* It's already reported (dedupe against existing comments in \`${commentsPath}\`)