Commit 5a8cedf

docs: add research directory README with complete overview
- Summarize all 5 research documents
- Explain how research was applied to Phase 1
- Document key insights and success metrics
- Provide clear references and next steps
1 parent 754c5ea commit 5a8cedf

File tree

1 file changed

+330
-0
lines changed
  • docs/research/reverse-engineer-prompts

@@ -0,0 +1,330 @@
1+
# Research: Reverse Engineering & Codebase Analysis Patterns
2+
3+
**Last Updated:** 2025-01-21
4+
**Status:** Research Complete - Implementation Phase 1 Complete
5+
6+
---
7+
8+
## Overview
9+
10+
This directory contains research and analysis conducted to improve our MCP spec-driven development prompts. The research synthesizes patterns from:
11+
12+
1. **Claude Code feature-dev plugin** - Production-tested 7-phase workflow
13+
2. **Existing research files** - code-analyst, information-analyst, context_bootstrap patterns
14+
3. **Best practices** - Evidence-based analysis, confidence assessment, interactive questioning
15+
16+
**Primary Goal:** Enhance prompts with battle-tested patterns for better feature development outcomes.
17+
18+
---
19+
20+
## Research Documents
21+
22+
### 1. Claude Code Feature-Dev Comparison
23+
**File:** [`claude-code-feature-dev-comparison.md`](./claude-code-feature-dev-comparison.md)
24+
**Size:** 18,287 words
25+
**Purpose:** Comprehensive analysis of Claude Code's feature-dev plugin
26+
27+
**Contents:**
28+
- Complete 7-phase workflow breakdown
29+
- Agent specifications (code-explorer, code-architect, code-reviewer)
30+
- Comparison with our current MCP prompts
31+
- Gap analysis with priority ratings (Critical/Important/Minor)
32+
- Implementation roadmap (3 sprints)
33+
- Updated workflow diagrams
34+
- Detailed recommendations
35+
36+
**Key Findings:**
37+
- ❌ Missing mandatory clarifying questions phase
38+
- ❌ No architecture options comparison
39+
- ❌ No quality review before completion
40+
- ✅ Good: Document-based artifacts
41+
- ✅ Good: Explicit sequencing
42+
- ✅ Good: Comprehensive analysis
43+
44+
**Use This For:**
45+
- Understanding Claude Code's proven workflow
46+
- Identifying gaps in our current approach
47+
- Planning future enhancements
48+
- Architecture decision justification
49+
50+
---
51+
52+
### 2. Research Synthesis
53+
**File:** [`research-synthesis.md`](./research-synthesis.md)
54+
**Size:** 8,000+ words
55+
**Purpose:** Actionable integration plan combining all research sources
56+
57+
**Contents:**
58+
- Core philosophy: Code (WHAT/HOW) vs Docs (WHY) vs User (Intent)
59+
- Two-agent specialization pattern (code-analyst + information-analyst)
60+
- Manager orchestration pattern (context_bootstrap)
61+
- Comparison matrix: Our approach vs Research best practices
62+
- Actionable recommendations with priority matrix
63+
- Specific enhancements for each prompt
64+
- Implementation roadmap (3 sprints)
65+
- Success metrics
66+
67+
**Key Recommendations:**
68+
- 🔴 HIGH: Evidence citation standards (file:line, path#heading)
69+
- 🔴 HIGH: Confidence assessment (High/Medium/Low)
70+
- 🔴 HIGH: Mandatory clarifying phase in spec generation
71+
- 🔴 HIGH: Architecture options prompt (new)
72+
- 🔴 HIGH: Implementation review prompt (new)
73+
- 🟡 MEDIUM: Interactive phased questioning
74+
- 🟡 MEDIUM: ADR template creation
75+
76+
**Use This For:**
77+
- Planning specific prompt enhancements
78+
- Understanding priority of improvements
79+
- Implementation guidance with examples
80+
- Success criteria for each enhancement
81+
82+
---
83+
84+
### 3. Code Analyst Pattern
85+
**File:** [`code-analyst.md`](./code-analyst.md)
86+
**Source:** Existing research file (cataloged)
87+
**Purpose:** Specialized agent for discovering WHAT and HOW from code
88+
89+
**Responsibilities:**
90+
- Discover WHAT the system does (features, workflows, business rules)
91+
- Discover HOW it's structured (architecture, patterns, communication)
92+
- Identify WHAT technologies are used
93+
94+
**Key Principles:**
95+
- Code is ground truth - report what exists
96+
- Be specific - reference exact file:line
97+
- Distinguish fact from inference
98+
- Flag feature toggles and dormant code
99+
- **Stay in lane** - don't infer WHY
100+
101+
**What NOT to include:**
102+
- ❌ Internal data models (implementation detail)
103+
- ❌ Missing/planned features (belongs in roadmap)
104+
- ❌ Code quality judgments
105+
- ❌ Specific versions (too volatile)
106+
- ❌ Testing infrastructure details
107+
108+
**Applied To:** `generate-codebase-context` Phase 3 (Code Analysis)
109+
110+
---
111+
112+
### 4. Information Analyst Pattern
113+
**File:** [`information-analyst.md`](./information-analyst.md)
114+
**Source:** Existing research file (cataloged)
115+
**Purpose:** Specialized agent for extracting WHY from documentation
116+
117+
**Primary Job:** Extract decision rationale from docs (not discoverable from code)
118+
119+
**Responsibilities:**
120+
- Discover WHY the system was built this way
121+
- Extract rationale from documentation
122+
- Find decision context and trade-offs
123+
- Capture historical evolution
124+
125+
**What to Look For:**
126+
- Why was [technology X] chosen?
127+
- Why [pattern Y] over alternatives?
128+
- What constraints drove decisions?
129+
- What trade-offs were considered?
130+
131+
**Key Principles:**
132+
- Direct quotes for "why"
133+
- Source everything (path#heading)
134+
- Attach metadata (timestamps)
135+
- Flag conflicts, don't resolve
136+
- Distinguish explicit vs implicit
137+
- Focus on rationale (unique value)
138+
139+
**Applied To:** `generate-codebase-context` Phase 2 (Documentation Audit)
140+
141+
---
142+
143+
### 5. Context Bootstrap Pattern
144+
**File:** [`context_bootstrap.md`](./context_bootstrap.md)
145+
**Source:** Existing research file (cataloged)
146+
**Purpose:** Manager orchestration pattern for coordinating specialized agents
147+
148+
**Core Philosophy:**
149+
> "Code explains HOW the system currently behaves; the user supplies WHAT it is supposed to achieve and WHY choices were made."
150+
151+
**Six-Phase Workflow:**
152+
1. Analyze repository structure
153+
2. Audit existing documentation
154+
3. Deep code analysis (subprocess: Code Analyst)
155+
4. User collaboration (fill gaps, resolve conflicts)
156+
5. Draft documentation set (PRDs, ADRs, SYSTEM-OVERVIEW)
157+
6. Review with user
158+
159+
**Key Pattern:** "Keep dialog interactive. Ask focused follow-up questions instead of long questionnaires."
160+
161+
**Deliverables:**
162+
- PRDs (Product Requirements)
163+
- ADRs (Architecture Decision Records in MADR format)
164+
- SYSTEM-OVERVIEW.md
165+
- README.md updates
166+
167+
**Applied To:** Overall `generate-codebase-context` structure and phasing
168+
169+
---
170+
171+
## How Research Was Applied
172+
173+
### Phase 1 (Completed) ✅
174+
175+
**Enhanced `generate-codebase-context` Prompt:**
176+
177+
From **code-analyst.md:**
178+
- ✅ File:line evidence citations for all code findings
179+
- ✅ Confidence levels (High/Needs Validation/Unknown)
180+
- ✅ "Stay in your lane" - don't infer WHY from code
181+
- ✅ Flag feature toggles and dormant paths
182+
- ✅ Technology names only (NO versions)
183+
- ✅ Focus on working features, not missing ones
184+
185+
From **information-analyst.md:**
186+
- ✅ Documentation audit phase (scan + timestamp + inventory)
187+
- ✅ Rationale extraction with direct quotes
188+
- ✅ Source references with path#heading format
189+
- ✅ Conflict detection between docs
190+
- ✅ Distinguish explicit vs implicit knowledge
191+
192+
From **context_bootstrap.md:**
193+
- ✅ Repository structure detection (workspace/monorepo/single)
194+
- ✅ User collaboration phase (interactive, not batch)
195+
- ✅ Capture user answers as direct quotes for citation
196+
197+
From **Claude Code feature-dev:**
198+
- ✅ Essential files list with line ranges (5-10 files)
199+
- ✅ Execution path traces (step-by-step flows)
200+
- ✅ Interactive short questions (not batch questionnaires)
201+
202+
---
203+
204+
### Phase 2 (Planned for Next PR)
205+
206+
**Enhancements Planned:**
207+
208+
1. **`generate-spec` Enhancement:**
209+
- Mandatory clarifying phase (Claude Code Phase 3)
210+
- Phased interactive questioning (context_bootstrap pattern)
211+
- WHY questions (information-analyst focus)
212+
213+
2. **`generate-architecture-options` (NEW):**
214+
- Based on Claude Code code-architect agent
215+
- Generate 2-3 approaches with trade-offs
216+
- User must choose before proceeding
217+
218+
3. **`review-implementation` (NEW):**
219+
- Based on Claude Code code-reviewer agent
220+
- Multi-focus review (bugs, quality, conventions)
221+
- Confidence-based filtering (≥80%)
222+
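The confidence-based filtering planned for `review-implementation` could look roughly like the sketch below. It is only an illustration: the `ReviewIssue` type, its field names, and the 0-100 confidence scale are assumptions, not part of the planned prompt or of the Claude Code code-reviewer agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewIssue:
    """Hypothetical shape of one review finding."""
    focus: str       # e.g. "bugs", "quality", "conventions"
    message: str
    confidence: int  # assumed scale: 0-100


def filter_issues(issues: list[ReviewIssue], threshold: int = 80) -> list[ReviewIssue]:
    """Keep issues at or above the threshold, most confident first."""
    kept = [issue for issue in issues if issue.confidence >= threshold]
    return sorted(kept, key=lambda issue: issue.confidence, reverse=True)


issues = [
    ReviewIssue("bugs", "Possible null dereference", 92),
    ReviewIssue("quality", "Function is long", 55),
    ReviewIssue("conventions", "Naming differs from module style", 80),
]
reported = filter_issues(issues)
```

With the default threshold, only the 92 and 80 findings would be surfaced to the user; the 55 finding is dropped rather than reported as noise.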
223+
See [`../../PROGRESS.md`](../../PROGRESS.md) for detailed roadmap.
224+
225+
---
226+
227+
## Key Insights
228+
229+
### 1. Separation of Concerns
230+
**Discovery:** Code, docs, and users each provide different information
231+
232+
- **Code → WHAT + HOW:** Features, architecture, patterns (observable facts)
233+
- **Docs → WHY:** Decisions, rationale, trade-offs (recorded intent)
234+
- **User → Goals + Intent:** Purpose, value, strategic fit (current direction)
235+
236+
**Application:** Don't conflate these sources - keep them separate and clearly attributed
237+
238+
---
239+
240+
### 2. Evidence-Based Analysis
241+
**Discovery:** Every claim needs proof
242+
243+
- Code findings: `file.ts:45-67` (line ranges)
244+
- Doc findings: `doc.md#heading` (section anchors)
245+
- User input: `[User confirmed: YYYY-MM-DD]` (dated quotes)
246+
247+
**Application:** Traceability and accountability for all findings
248+
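As a rough illustration of the three citation styles above, the following sketch validates a citation string and reports which kind it is. The regular expressions, the `Citation` type, and `parse_citation` are hypothetical helpers written for this example; the prompts themselves do not define a parser.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed patterns for the three citation styles listed above.
CODE_CITATION = re.compile(r"^(?P<path>[^\s:#]+):(?P<start>\d+)(?:-(?P<end>\d+))?$")
DOC_CITATION = re.compile(r"^(?P<path>[^\s:#]+)#(?P<heading>[\w-]+)$")
USER_CITATION = re.compile(r"^\[User confirmed: (?P<date>\d{4}-\d{2}-\d{2})\]$")


@dataclass(frozen=True)
class Citation:
    kind: str    # "code", "doc", or "user"
    target: str  # file path, doc path, or "user"
    detail: str  # line range, heading anchor, or date


def parse_citation(text: str) -> Optional[Citation]:
    """Return a Citation if the text matches a known style, else None."""
    text = text.strip()
    match = CODE_CITATION.match(text)
    if match:
        start = int(match["start"])
        end = int(match["end"]) if match["end"] else start
        if end < start:
            return None  # reversed line range is treated as invalid
        return Citation("code", match["path"], f"{start}-{end}")
    match = DOC_CITATION.match(text)
    if match:
        return Citation("doc", match["path"], match["heading"])
    match = USER_CITATION.match(text)
    if match:
        return Citation("user", "user", match["date"])
    return None
```

A finding whose citation fails to parse could then be flagged for follow-up instead of being accepted without evidence.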
249+
---
250+
251+
### 3. Confidence Assessment
252+
**Discovery:** Distinguish facts from inferences
253+
254+
- High: Strong evidence from working code or explicit docs
255+
- Medium: Inferred from context, feature flags, or implied behavior
256+
- Low: Cannot be determined, conflicting evidence, or unknown
257+
258+
**Application:** Flag gaps explicitly rather than guessing
259+
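One way to picture the High/Medium/Low rules above is a small classifier over findings. This is a sketch under assumptions: the `Finding` fields (`inferred`, `conflicting`) and the exact decision order are illustrative and not taken from the prompts.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Finding:
    """Hypothetical record for one analysis claim."""
    claim: str
    citations: list[str]       # e.g. ["file.ts:45-67"]
    inferred: bool = False     # derived from context rather than stated
    conflicting: bool = False  # sources disagree


def assess(finding: Finding) -> Confidence:
    """Map a finding to a confidence level; gaps are flagged, not guessed."""
    if not finding.citations or finding.conflicting:
        return Confidence.LOW
    if finding.inferred:
        return Confidence.MEDIUM
    return Confidence.HIGH
```

The point of the ordering is that missing or conflicting evidence always wins: an uncited claim is reported as Low even if it seems plausible.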
260+
---
261+
262+
### 4. Interactive Collaboration
263+
**Discovery:** Short, focused conversations work better than long questionnaires
264+
265+
- Ask 3-5 questions, wait for answers
266+
- Use answers to inform next round of questions
267+
- Capture direct quotes for later citation
268+
269+
**Application:** Better engagement, more thoughtful answers
270+
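The round-based questioning described above can be sketched as a loop that asks a small batch, records the answers verbatim, and lets those answers generate the next batch. The function name, the `ask` and `follow_up` callbacks, and the `max_rounds` safety limit are all assumptions made for this example.

```python
from typing import Callable


def run_interview(
    initial: list[str],
    ask: Callable[[str], str],
    follow_up: Callable[[dict[str, str]], list[str]],
    round_size: int = 4,
    max_rounds: int = 10,
) -> dict[str, str]:
    """Ask questions in small rounds and return question -> verbatim answer."""
    if not 3 <= round_size <= 5:
        raise ValueError("round_size should stay within the 3-5 range")
    transcript: dict[str, str] = {}
    pending = list(initial)
    for _ in range(max_rounds):
        if not pending:
            break
        # Take one round of questions and wait for all of its answers.
        batch, pending = pending[:round_size], pending[round_size:]
        answers = {question: ask(question) for question in batch}
        transcript.update(answers)
        # Answers from this round inform the next round of questions.
        pending.extend(q for q in follow_up(answers) if q not in transcript)
    return transcript
```

Because the transcript keeps each answer as given, it can later be quoted directly when a finding cites user input.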
271+
---
272+
273+
### 5. Mandatory Checkpoints
274+
**Discovery:** Critical decisions need explicit user approval
275+
276+
- ⛔ STOP after clarifying questions (don't proceed without answers)
277+
- ⛔ STOP after architecture options (user must choose)
278+
- ⛔ STOP after implementation (user decides what to fix)
279+
280+
**Application:** User control at key decision points
281+
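A minimal sketch of the STOP checkpoints above: each gated step halts the workflow unless the user explicitly approves. The `Workflow` class, the `CheckpointBlocked` exception, and the step names are hypothetical and only mirror the three checkpoints listed.

```python
from dataclasses import dataclass, field
from typing import Callable


class CheckpointBlocked(Exception):
    """Raised when the user has not approved a gated step."""


@dataclass
class Workflow:
    approve: Callable[[str], bool]  # asks the user; True means proceed
    completed: list[str] = field(default_factory=list)

    def run(self, steps: list[tuple[str, bool]]) -> list[str]:
        """Run steps in order, stopping after any gated step that is not approved."""
        for name, needs_approval in steps:
            self.completed.append(name)
            if needs_approval and not self.approve(name):
                raise CheckpointBlocked(name)
        return self.completed


STEPS = [
    ("clarifying questions", True),
    ("architecture options", True),
    ("task generation", False),
    ("implementation", True),
]
```

Raising instead of silently continuing makes the stop explicit: nothing after an unapproved checkpoint runs until the user responds.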
282+
---
283+
284+
## Success Metrics
285+
286+
### Phase 1 Metrics ✅
287+
- ✅ 100% of code findings have file:line citations
288+
- ✅ 100% of findings categorized by confidence level
289+
- ✅ Documentation audit phase included
290+
- ✅ Interactive questioning approach (3-5 questions per round)
291+
- ✅ Essential files list structure (5-10 files with ranges)
292+
- ✅ Execution path traces included in examples
293+
294+
### Phase 2 Metrics (Target)
295+
- [ ] Clarifying questions are mandatory (cannot proceed without answers)
296+
- [ ] Architecture options always present 2-3 approaches
297+
- [ ] User explicitly chooses architecture before tasks
298+
- [ ] Review catches common issues before PR
299+
- [ ] All prompts use consistent evidence standards
300+
301+
---
302+
303+
## References
304+
305+
### External Sources
306+
- [Claude Code Repository](https://github.com/anthropics/claude-code)
307+
- [Feature-Dev Plugin](https://github.com/anthropics/claude-code/tree/main/plugins/feature-dev)
308+
- [Feature-Dev README](https://github.com/anthropics/claude-code/blob/main/plugins/feature-dev/README.md)
309+
- [Code Explorer Agent](https://github.com/anthropics/claude-code/blob/main/plugins/feature-dev/agents/code-explorer.md)
310+
- [Code Architect Agent](https://github.com/anthropics/claude-code/blob/main/plugins/feature-dev/agents/code-architect.md)
311+
- [Code Reviewer Agent](https://github.com/anthropics/claude-code/blob/main/plugins/feature-dev/agents/code-reviewer.md)
312+
- [MADR Format](https://adr.github.io/madr/)
313+
314+
### Internal Documents
315+
- [Progress Tracking](../../PROGRESS.md)
316+
- [Main README](../../../README.md)
317+
318+
---
319+
320+
## Next Steps
321+
322+
1. **Review Phase 1 PR:** `add-reverse-engineer-codebase-prompt` branch
323+
2. **Plan Phase 2 PR:** After Phase 1 merge
324+
3. **Implement remaining enhancements:** Per roadmap in PROGRESS.md
325+
326+
---
327+
328+
**Research Status:** Complete and applied to Phase 1
329+
**Next Research:** None planned - focus on implementation
330+
**Last Updated:** 2025-01-21
