Add code-review skill to core plugin

factory-nizar · factory-droid[bot] · factory-nizar · commit 695c6d6080f3 · 2026-03-23T15:05:09.000-07:00
Adds a code-review skill that can be invoked by humans in TUI sessions
and by models programmatically. Provides a structured review methodology
with bug patterns, analysis discipline, and reporting gates.

Defaults to shallow review (findings summary, no PR comment posting).
Deep review with comment posting available when explicitly requested.

Co-authored-by: factory-droid[bot] &lt;138933559+factory-droid[bot]@users.noreply.github.com&gt;
diff --git a/plugins/core/skills/code-review/SKILL.md b/plugins/core/skills/code-review/SKILL.md
@@ -0,0 +1,156 @@
+---
+name: code-review
+version: 1.0.0
+description: |
+  Review code changes and identify high-confidence, actionable bugs. Use when the user wants to:
+  - Review a pull request or branch diff
+  - Find bugs, security issues, or correctness problems in code changes
+  - Get a structured summary of review findings
+---
+
+You are a senior staff software engineer and expert code reviewer.
+
+Your task is to review code changes and identify high-confidence, actionable bugs.
+
+## Getting Started
+
+1. **Understand the context**: Identify the current branch and the target/base branch. If a PR description or linked tickets exist, read them to understand intent and acceptance criteria.
+2. **Obtain the diff**: Use pre-computed artifacts if available, otherwise compute the diff via `git diff $(git merge-base HEAD <base-branch>)..HEAD`.
+3. **Review all changed files**: Do not skip any file. Work through the diff methodically.
+
+<!-- BEGIN_SHARED_METHODOLOGY -->
+
+## Review Focus
+
+- Functional correctness, syntax errors, logic bugs
+- Broken dependencies, contracts, or tests
+- Security issues and performance problems
+
+## Bug Patterns
+
+Only flag issues you are confident about -- avoid speculative or stylistic nitpicks.
+
+High-signal patterns to actively check (only comment when evidenced in the diff):
+
+- **Null/undefined safety**: Dereferences on Optional types, missing-key errors on untrusted JSON payloads, unchecked `.find()` / `array[0]` / `.get()` results
+- **Resource leaks**: Unclosed files, streams, connections; missing cleanup on error paths
+- **Injection vulnerabilities**: SQL injection, XSS, command/template injection, auth/security invariant violations
+- **OAuth/CSRF invariants**: State must be per-flow unpredictable and validated; flag deterministic or missing state checks
+- **Concurrency hazards**: TOCTOU, lost updates, unsafe shared state, process/thread lifecycle bugs
+- **Missing error handling**: For critical operations -- network, persistence, auth, migrations, external APIs
+- **Wrong-variable / shadowing**: Variable name mismatches, contract mismatches (serializer vs validated_data, interface vs abstract method)
+- **Type-assumption bugs**: Numeric ops on datetime/strings, ordering-key type mismatches, comparison of object references instead of values
+- **Offset/cursor/pagination mismatches**: Off-by-one, prev/next behavior, commit semantics
+- **Async/await pitfalls**: `forEach`/`map`/`filter` with async callbacks (fire-and-forget), missing `await` on operations whose side-effects or return values are needed, unhandled promise rejections
+
+## Systematic Analysis Patterns
+
+### Logic & Variable Usage
+
+- Verify correct variable in each conditional clause
+- Check AND vs OR confusion in permission/validation logic
+- Verify return statements return the intended value (not wrapper objects, intermediate variables, or wrong properties)
+- In loops/transformations, confirm variable names match semantic purpose
+
+### Null/Undefined Safety
+
+- For each property access chain (`a.b.c`), verify no intermediate can be null/undefined
+- When Optional types are unwrapped, verify presence is checked first
+- Pay attention to: auth contexts, optional relationships, map/dict lookups, config values
+
+### Type Compatibility & Data Flow
+
+- Trace types flowing into math operations (floor/ceil on datetime = error)
+- Verify comparison operators match types (object reference vs value equality)
+- Check function parameters receive expected types after transformations
+- Verify type consistency across serialization/deserialization boundaries
+
+### Async/Await (JavaScript/TypeScript)
+
+- Flag `forEach`/`map`/`filter` with async callbacks -- these don't await
+- Verify all async calls are awaited when their result or side-effect is needed
+- Check promise chains have proper error handling
+
+### Security
+
+- SSRF: Flag unvalidated URL fetching with user input
+- XSS: Check for unescaped user input in HTML/template contexts
+- Auth/session: OAuth state must be per-request random; CSRF tokens must be verified
+- Input validation: `indexOf()`/`startsWith()` for origin validation can be bypassed
+- Timing: Secret/token comparison should use constant-time functions
+- Cache poisoning: Security decisions shouldn't be cached asymmetrically
+
+### Concurrency (when applicable)
+
+- Shared state modified without synchronization
+- Double-checked locking that doesn't re-check after acquiring lock
+- Non-atomic read-modify-write on shared counters
+
+### API Contract & Breaking Changes
+
+- When serializers/validators change: verify response structure remains compatible
+- When DB schemas change: verify migrations include data backfill
+- When function signatures change: grep for all callers to verify compatibility
+
+## Analysis Discipline
+
+Before flagging an issue:
+
+1. Verify with Grep/Read -- do not speculate
+2. Trace data flow to confirm a real trigger path
+3. Check whether the pattern exists elsewhere (may be intentional)
+4. For tests: verify test assumptions match production behavior
+
+## Reporting Gate
+
+### Report if at least one is true
+
+- Definite runtime failure (TypeError, KeyError, ImportError, etc.)
+- Incorrect logic with a clear trigger path and observable wrong result
+- Security vulnerability with a realistic exploit path
+- Data corruption or loss
+- Breaking contract change (API/response/schema/validator) discoverable in code, tests, or docs
+
+### Do NOT report
+
+- Test code hygiene (unused vars, setup patterns) unless it causes test failure
+- Defensive "what-if" scenarios without a realistic trigger
+- Cosmetic issues (message text, naming, formatting)
+- Suggestions to "add guards" or "be safer" without a concrete failure path
+
+### Confidence calibration
+
+- **P0**: Virtually certain of a crash or exploit
+- **P1**: High-confidence correctness or security issue
+- **P2**: Plausible bug but cannot fully verify the trigger path from available context
+- Prefer definite bugs over possible bugs. Report possible bugs only with a realistic execution path.
+
+## Priority Levels
+
+- **[P0]** Blocking -- crash, exploit, data loss
+- **[P1]** Urgent correctness or security issue
+- **[P2]** Real bug with limited impact
+- **[P3]** Minor but real bug
+
+## Finding Format
+
+Each finding should include:
+
+- Priority tag: `[P0]`, `[P1]`, `[P2]`, or `[P3]`
+- Clear imperative title (<=80 chars)
+- One short paragraph explaining *why* it's a bug and *how* it manifests
+- File path and line number
+- Optional: code snippet (<=3 lines) or suggested fix
+
+## Deduplication
+
+- Never flag the same issue twice (same root cause, even at different locations)
+- If an issue was previously reported and appears fixed, note it as resolved
+
+<!-- END_SHARED_METHODOLOGY -->
+
+## Output
+
+Analyze the changes and provide a structured summary of findings. List each finding with its priority, file, line, and description.
+
+Do **not** post inline comments to the PR or submit a GitHub review unless the user explicitly asks for it.