Skip to content

Commit 5bf0799

Browse files
Merge pull request #6 from smartwatermelon/claude/adversarial-reviewer-v1.3.0
feat(code-critic): adversarial-reviewer v1.3.0 refinements
2 parents 81d3bef + 9a01fe5 commit 5bf0799

File tree

8 files changed

+123
-42
lines changed

8 files changed

+123
-42
lines changed

.claude-plugin/marketplace.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
"name": "code-critic",
1616
"source": "./plugins/code-critic",
1717
"description": "Skeptical code review agent that assumes code is wrong until proven otherwise. Challenges architectural decisions, identifies failure modes systematically, and prioritizes long-term maintainability over validation.",
18-
"version": "1.2.0",
18+
"version": "1.3.0",
1919
"author": {
2020
"name": "Andrew Rich",
2121
"url": "https://github.com/smartwatermelon"

README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ claude plugin install --all smartwatermelon-marketplace
3333

3434
### Code Critic
3535

36-
**Status**: Stable v1.2.0
36+
**Status**: Stable v1.3.0
3737
**Category**: Quality / Code Review
3838

3939
Skeptical code review agent that assumes code is wrong until proven otherwise. Unlike validation-focused reviewers, Code Critic challenges architectural decisions, identifies failure modes, and prioritizes long-term maintainability.
@@ -155,5 +155,5 @@ Inspired by the philosophy that good tools challenge us to be better developers.
155155

156156
---
157157

158-
**Latest Version**: 1.2.0
158+
**Latest Version**: 1.3.0
159159
**Last Updated**: 2026-02-11

plugins/code-critic/.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{
22
"name": "code-critic",
33
"description": "Skeptical code review agent that assumes code is wrong until proven otherwise. Challenges architectural decisions, identifies failure modes systematically, and prioritizes long-term maintainability over validation.",
4-
"version": "1.2.0",
4+
"version": "1.3.0",
55
"author": {
66
"name": "Andrew Rich",
77
"email": "andrew.rich@gmail.com"

plugins/code-critic/CHANGELOG.md

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
# Changelog
2+
3+
## v1.3.0
4+
5+
- **"Lead with verdict" resolved**: Automated pipelines state
6+
verdict in Summary line; interactive reviews keep verdict at end
7+
- **Sections consolidated**: Review Context Awareness merged into
8+
Context and Scope
9+
- **Hook over-trigger guard**: Insufficient Context verdict reserved
10+
for interactive reviews; automated pipelines use partial
11+
assessment
12+
- **Skip-critic clarified**: Application-level skips distinguished
13+
from git-level bypass; skip logging added to examples
14+
- **Pre-commit hook fix**: Dead code from `set -e` + `$?` replaced
15+
with proper exit code capture
16+
- **USAGE.md Example 2**: Alternative Approaches section added for
17+
presigned URL pattern
18+
- **Changelogs extracted**: Version history moved to this file
19+
20+
## v1.2.0
21+
22+
- **Observability checklist**: New failure mode category — can you
23+
tell when this is broken in production?
24+
- **Insufficient Context verdict**: Fourth verdict option for partial
25+
reviews where context is missing
26+
- **Diff-awareness**: Guidance for adapting review depth to diffs vs.
27+
full files vs. snippets
28+
- **Output length calibration**: Concise for git hooks, thorough for
29+
interactive review
30+
- **Confidence signaling**: Distinguishes confirmed findings from
31+
pattern-based suspicions
32+
- **Domain Awareness repositioned**: Now sets the review lens before
33+
the hierarchy is applied
34+
- **Configuration/prompts/IaC domain**: New domain-specific guidance
35+
for reviewing config and prompt files
36+
- **Honest calibration restored**: "Genuinely good code is rare"
37+
qualifier reinstated
38+
- **Documentation fixes**: Examples updated to match defined response
39+
format, severity/verdict duplication removed, `--no-verify`
40+
references replaced with safe alternatives
41+
42+
## v1.1.0
43+
44+
- **Failure mode checklist**: Systematic review across concurrency,
45+
resource management, distributed systems, error handling, security,
46+
and data integrity — not just general skepticism
47+
- **Data model review**: Schema design elevated to second priority in
48+
the review hierarchy
49+
- **Severity calibration**: Clear definitions for Critical (blocks
50+
merge), Concern (should fix), and Question (needs justification)
51+
- **Verdict**: Every review ends with a clear disposition — Block,
52+
Revise, or Accept
53+
- **Follow-up protocol**: Guidance for iterative review, handling
54+
pushback, and updating assessments
55+
- **Domain awareness**: Adjusts review lens for backend, frontend,
56+
data pipelines, APIs, and tests
57+
- **Context and scope**: Explicit instructions to read surrounding
58+
code and trace data flow before forming opinions
59+
- **Honest calibration**: Good code gets recognized with the same
60+
rigor as bad code — no manufactured criticism
61+
- **Activation Criteria removed**: Usage guidance now lives in README
62+
and USAGE.md, not in the agent prompt
63+
64+
## v1.0.0
65+
66+
- Initial release: adversarial code review agent with review
67+
hierarchy, behavioral rules, and response format

plugins/code-critic/README.md

Lines changed: 13 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -121,29 +121,17 @@ Code Critic addresses concerns in this order:
121121
**Standard Review:** "Nice use of modern JS!"
122122
**Code Critic:** "This will be unreadable in 6 months. Break it into explicit steps. Cleverness is not a virtue in production code."
123123

124-
## What's New in v1.1.0
125-
126-
- **Failure mode checklist**: Systematic review across concurrency, resource management, distributed systems, error handling, security, and data integrity — not just general skepticism
127-
- **Data model review**: Schema design elevated to second priority in the review hierarchy
128-
- **Severity calibration**: Clear definitions for Critical (blocks merge), Concern (should fix), and Question (needs justification)
129-
- **Verdict**: Every review ends with a clear disposition — Block, Revise, or Accept
130-
- **Follow-up protocol**: Guidance for iterative review, handling pushback, and updating assessments
131-
- **Domain awareness**: Adjusts review lens for backend, frontend, data pipelines, APIs, and tests
132-
- **Context and scope**: Explicit instructions to read surrounding code and trace data flow before forming opinions
133-
- **Honest calibration**: Good code gets recognized with the same rigor as bad code — no manufactured criticism
134-
- **Activation Criteria removed**: Usage guidance now lives in README and USAGE.md, not in the agent prompt
135-
136-
## What's New in v1.2.0
137-
138-
- **Observability checklist**: New failure mode category — can you tell when this is broken in production?
139-
- **Insufficient Context verdict**: Fourth verdict option for partial reviews where context is missing
140-
- **Diff-awareness**: Guidance for adapting review depth to diffs vs. full files vs. snippets
141-
- **Output length calibration**: Concise for git hooks, thorough for interactive review
142-
- **Confidence signaling**: Distinguishes confirmed findings from pattern-based suspicions
143-
- **Domain Awareness repositioned**: Now sets the review lens before the hierarchy is applied
144-
- **Configuration/prompts/IaC domain**: New domain-specific guidance for reviewing config and prompt files
145-
- **Honest calibration restored**: "Genuinely good code is rare" qualifier reinstated
146-
- **Documentation fixes**: Examples updated to match defined response format, severity/verdict duplication removed, `--no-verify` references replaced with safe alternatives
124+
## Changelog
125+
126+
See [CHANGELOG.md](CHANGELOG.md) for version history.
127+
128+
### Latest: v1.3.0
129+
130+
- Resolved "lead with verdict" vs. verdict-at-end ambiguity
131+
- Merged Review Context Awareness into Context and Scope
132+
- Fixed pre-commit hook dead code in GIT_HOOKS.md examples
133+
- Extracted changelogs to CHANGELOG.md
134+
- Clarified skip-critic vs bypass distinction in GIT_HOOKS.md
147135

148136
## What Code Critic is NOT
149137

@@ -165,6 +153,8 @@ model: opus
165153

166154
To use a different model, edit `agents/adversarial-reviewer.md` and change the model field.
167155

156+
Note: The agent's Configuration/prompts/IaC domain covers prompt and config files, which includes its own prompt file — useful for self-review workflows where the agent evaluates changes to its own definition.
157+
168158
## Contributing
169159

170160
Contributions are welcome! Please:

plugins/code-critic/agents/adversarial-reviewer.md

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -116,11 +116,6 @@ Before forming opinions, understand the context:
116116
- **Identify the system boundary.** Is this code internal plumbing or a public contract? Internal code can change; public contracts cannot. Review accordingly.
117117
- **Consider the data flow.** Trace where data comes from, what transforms it, and where it goes. Most bugs live at transformation boundaries.
118118
- **Ask about what you can't see.** If the review context is insufficient to form a judgment, say so. "I can't evaluate this without seeing how X is handled" is a valid and useful review comment.
119-
120-
## Review Context Awareness
121-
122-
You may receive code as a full file, a diff, or a partial snippet. Adapt accordingly:
123-
124119
- **If reviewing a diff**: You see only changed lines with context. State what you can and cannot evaluate. Do not assume unchanged surrounding code is correct — it may be the source of the problem. Flag when a diff-only review is insufficient for safety judgment.
125120
- **If reviewing a full file**: You have more context but may lack knowledge of callers and system integration. State this.
126121
- **If context is insufficient**: Use the Insufficient Context verdict rather than guessing.
@@ -135,6 +130,7 @@ You may receive code as a full file, a diff, or a partial snippet. Adapt accordi
135130
- Calibrate honestly. If the code is good, say so and explain *why* it's good — this is just as valuable as identifying problems. Do not manufacture criticism to fill a quota. But genuinely good code is rare — most code has real problems worth discussing.
136131
- Signal confidence on findings. "This is a race condition" and "this pattern sometimes causes race conditions but I cannot confirm without seeing the thread model" are meaningfully different statements. State which one you mean.
137132
- Never say "looks good to me" or "LGTM" unless you would mass-refactor the codebase to match this pattern.
133+
- In automated pipeline context (git hooks, CI), prefer a partial assessment with stated limitations over the Insufficient Context verdict. Reserve Insufficient Context for interactive reviews where the developer can provide the missing information.
138134

139135
## Handling Follow-Up
140136

@@ -156,7 +152,7 @@ When the developer responds to your review:
156152

157153
Calibrate review length to the context:
158154

159-
- **Automated pipeline (git hooks)**: Lead with the verdict. Keep the review focused on blocking and high-priority issues. Aim for concise.
155+
- **Automated pipeline (git hooks)**: State the verdict in the Summary line (e.g., "**Block.** Summary text...") so the hook consumer sees the disposition immediately. Keep the full Verdict section at the end. Focus on blocking and high-priority issues. Aim for concise.
160156
- **Interactive review**: Provide full analysis across all hierarchy levels. Depth over brevity.
161157
- **Always**: Every section that appears should contain substance. Omit sections with nothing to say rather than writing "None."
162158

plugins/code-critic/docs/GIT_HOOKS.md

Lines changed: 22 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -73,7 +73,8 @@ if [ -z "$DIFF" ]; then
7373
exit 0
7474
fi
7575

76-
# Run review
76+
# Run review (temporarily allow non-zero exit to capture result)
77+
set +e
7778
echo "$DIFF" | claude --agent adversarial-reviewer \
7879
--no-session-persistence \
7980
-p "Review these changes before push. Focus on:
@@ -83,8 +84,8 @@ echo "$DIFF" | claude --agent adversarial-reviewer \
8384
- Maintenance burden
8485
8586
Provide specific, actionable feedback."
86-
8787
REVIEW_EXIT=$?
88+
set -e
8889

8990
if [ $REVIEW_EXIT -ne 0 ]; then
9091
echo "${RED}❌ Code Critic found issues${NC}"
@@ -125,12 +126,15 @@ echo "🔍 Reviewing staged changes..."
125126
# Get staged diff
126127
DIFF=$(git diff --cached)
127128

128-
# Run review
129+
# Run review (temporarily allow non-zero exit to capture result)
130+
set +e
129131
echo "$DIFF" | claude --agent adversarial-reviewer \
130132
--no-session-persistence \
131133
-p "Quick review of staged changes. Focus on critical issues only."
134+
REVIEW_EXIT=$?
135+
set -e
132136

133-
if [ $? -ne 0 ]; then
137+
if [ $REVIEW_EXIT -ne 0 ]; then
134138
echo "❌ Review found issues. Fix and retry, or split into smaller commits."
135139
exit 1
136140
fi
@@ -179,6 +183,8 @@ fi
179183
echo "🔒 Security-critical files detected, running adversarial review..."
180184
echo "$CRITICAL_FILES"
181185

186+
# Run review (temporarily allow non-zero exit to capture result)
187+
set +e
182188
git diff origin/main...HEAD | claude --agent adversarial-reviewer \
183189
--no-session-persistence \
184190
-p "Review these security-critical changes:
@@ -192,8 +198,10 @@ Focus on:
192198
- XSS vulnerabilities
193199
- Race conditions
194200
- Data exposure"
201+
REVIEW_EXIT=$?
202+
set -e
195203

196-
if [ $? -ne 0 ]; then
204+
if [ $REVIEW_EXIT -ne 0 ]; then
197205
echo "❌ Security review failed"
198206
exit 1
199207
fi
@@ -339,6 +347,13 @@ fi
339347

340348
### Skipping Review When Appropriate
341349

350+
> **Note**: Application-level skips (`[skip-critic]`, `SKIP_CODE_CRITIC`)
351+
> are distinct from `--no-verify`. These skip mechanisms are visible in
352+
> commit history or CI logs, making skipped reviews auditable.
353+
> `--no-verify` silently bypasses all hooks with no trace. Use
354+
> application-level skips for docs-only commits or known-safe changes;
355+
> never use `--no-verify`.
356+
342357
Allow developers to skip when justified:
343358

344359
```bash
@@ -347,12 +362,14 @@ Allow developers to skip when justified:
347362
# Check for skip marker in commit message
348363
if git log -1 --pretty=%B | grep -q '\[skip-critic\]'; then
349364
echo "⏭️ Skipping Code Critic (commit message contains [skip-critic])"
365+
echo "[$(date)] SKIP: Code Critic skipped via [skip-critic]" >> .git/code-critic-skip.log
350366
exit 0
351367
fi
352368

353369
# Check for skip marker in environment
354370
if [ "$SKIP_CODE_CRITIC" = "1" ]; then
355371
echo "⏭️ Skipping Code Critic (SKIP_CODE_CRITIC=1)"
372+
echo "[$(date)] SKIP: Code Critic skipped via SKIP_CODE_CRITIC" >> .git/code-critic-skip.log
356373
exit 0
357374
fi
358375

plugins/code-critic/docs/USAGE.md

Lines changed: 15 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -106,15 +106,21 @@ presigned URLs would eliminate.
106106
reaches your storage layer.
107107
108108
### Concerns
109-
- File proxied through API server to S3 adds latency and cost.
110-
Clients should upload directly via S3 presigned URLs.
111109
- No resume capability. A 90%-complete upload that fails means the
112110
user starts over. Consider the tus protocol for resumable uploads.
113111
- File metadata stored separately from files in S3. These will
114112
drift. S3 object tags would keep them co-located.
115113
- No per-user upload rate limiting. A single user can consume all
116114
your upload bandwidth.
117115
116+
### Alternative Approaches
117+
- **Direct-to-S3 presigned URLs**: Instead of proxying uploads
118+
through your API server, generate a presigned URL and let the
119+
client upload directly to S3. This eliminates server-side
120+
bandwidth cost, reduces latency, and removes your API server
121+
as a failure point in the upload path. Your API only handles
122+
the lightweight presign request and post-upload metadata.
123+
118124
### Questions
119125
- Why synchronous upload? This blocks API workers. What's the
120126
expected file size distribution? Anything over 10MB should be
@@ -184,9 +190,14 @@ if [ -z "$FILES" ]; then
184190
fi
185191

186192
echo "Running adversarial code review..."
187-
git diff origin/main...HEAD | claude --agent adversarial-reviewer -p "Review these changes before push" --no-session-persistence -p
193+
set +e
194+
git diff origin/main...HEAD | claude --agent adversarial-reviewer \
195+
--no-session-persistence \
196+
-p "Review these changes before push"
197+
REVIEW_EXIT=$?
198+
set -e
188199

189-
if [ $? -ne 0 ]; then
200+
if [ $REVIEW_EXIT -ne 0 ]; then
190201
echo "❌ Adversarial review found critical issues"
191202
exit 1
192203
fi

0 commit comments

Comments
 (0)