
Commit 27d96a4

feat: risk based testing
1 parent 0c6f851 commit 27d96a4

4 files changed: +414 -42 lines changed

roadmap.md

Lines changed: 207 additions & 0 deletions
@@ -0,0 +1,207 @@
# Agentic SDLC Spec Kit Improvement Plan

## Cross-Reference Analysis Summary

**Documentation Coverage:** All 12 factors from manifesto.md are addressed across the documentation suite:
- **manifesto.md**: Comprehensive 12-factor methodology with rationale
- **principles.md**: Concise factor summaries
- **platform.md**: Technology stack and component architecture
- **playbook.md**: Detailed step-by-step implementation guide
- **workflow.md**: 4-stage process workflow
- **repository.md**: team-ai-directives governance and structure

**Implementation Gap Analysis:** Current spec-kit implements ~40-45% of documented capabilities (4-5/9 basic features actually implemented). Key gaps:
- **Async execution infrastructure** (worktrees, MCP dispatching, registries)
- **Advanced quality gates** (differentiated SYNC/ASYNC reviews)
- **Workflow orchestration** (stage management, validation, progress tracking)
- **MCP server integration** (orchestration hub, agent coordination)
- **Comprehensive evaluation frameworks** (quantitative metrics, A/B testing)
- **Guild infrastructure** (membership management, forum integration)

## Completed Items (Actually Implemented)

### CLI Orange Theme Restoration
- Centralized the orange color palette via `ACCENT_COLOR` and `BANNER_COLORS` constants in `src/specify_cli/__init__.py` (primary accent `#f47721`).
- Audited banners, prompts, and progress trackers to ensure they consume the shared constants instead of ad-hoc Rich styles.
- Updated release automation so packaged command sets inherit the refreshed palette; documented override guidance in `docs/quickstart.md`.

### Central LLM Gateway (Golden Path)
- `specify init` scaffolds `.specify/config/gateway.env`, supports `--gateway-url`/`--gateway-token`, and allows intentional suppression of warnings when no proxy is desired.
- Shared bash helpers load the config, export assistant-specific base URLs, and surface warnings when the config is absent.
- Lays groundwork for future gateway health checks.
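
The load-and-warn behavior described for the gateway config can be sketched as follows. The roadmap says the real helpers are bash; this Python version is only an illustration. The function name `load_gateway_env` and the assumption that `gateway.env` holds plain `KEY=VALUE` lines are hypothetical, since the commit does not show the file's contents.

```python
import warnings
from pathlib import Path


def load_gateway_env(path):
    """Parse KEY=VALUE lines from a gateway.env-style file into a dict.

    Hypothetical helper: returns an empty dict and emits a warning
    when the file is absent, mirroring the "warn when the config is
    absent" behavior described in the roadmap.
    """
    config_path = Path(path)
    if not config_path.is_file():
        warnings.warn(f"gateway config not found: {config_path}")
        return {}

    settings = {}
    for raw in config_path.read_text().splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings
```

A caller could merge the returned dict into `os.environ` before launching an assistant; which variable names each assistant expects is not specified in this commit.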

### Context Readiness & Spec Discipline
- `/specify`, `/plan`, `/tasks`, and `/implement` now enforce `context.md` completeness with gating logic and clear readiness checks.

### Local Team Directives Reference Support
- `specify init --team-ai-directives` records local paths without cloning (the previous singular flag now aliases to this canonical option); remote URLs continue cloning into `.specify/memory`.
- Common scripts resolve `.specify/config/team_directives.path`, fall back to defaults, and warn when paths are unavailable.

### Risk-to-Test Automation
- **IMPLEMENTED**: Enhanced risk extraction in check-prerequisites.sh with standardized severity levels (Critical/High/Medium/Low).
- Created generate-risk-tests.sh script to generate targeted test tasks based on risk severity and category.
- Integrated with /tasks command via --include-risk-tests flag to append risk-based test tasks to tasks.md.
- `/implement` captures test evidence before polish tasks conclude, keeping risk mitigation actionable.
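
As a rough illustration of turning extracted risks into test tasks, the sketch below orders risks by severity and emits one task line each. It is not the contents of `generate-risk-tests.sh`, which this page does not show. The `description` key and the task-line wording are assumptions; only `id` and `severity` appear in the extraction code in this commit.

```python
# Lower rank means more severe; unknown levels are treated as Medium.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def risk_test_tasks(risks):
    """Return one markdown task line per risk, most severe first."""
    ordered = sorted(
        risks,
        key=lambda risk: SEVERITY_ORDER.get(risk.get("severity", "Medium"), 2),
    )
    tasks = []
    for risk in ordered:
        risk_id = risk.get("id", "unknown-id")
        severity = risk.get("severity", "Medium")
        # "description" is a hypothetical field used only for illustration.
        description = risk.get("description", "unspecified risk")
        tasks.append(f"- [ ] [{severity}] Add test covering {risk_id}: {description}")
    return tasks
```

Because `sorted` is stable, risks that share a severity keep the order in which they were extracted.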

### Issue Tracker MCP Integration
- **NOT IMPLEMENTED**: No `--issue-tracker` argument in `specify init` command.
- No MCP configuration scaffolding for Jira, Linear, GitHub Issues, and GitLab Issues.

### Team Directives Layout Awareness
- **NOT IMPLEMENTED**: No structural scans of team-ai-directives repositories in CLI code.

### Knowledge Evals & Guild Feedback Loop (Basic)
- **NOT IMPLEMENTED**: No evaluation manifests or guild-log.md handling in levelup scripts.

### Async Execution Infrastructure
- **NOT IMPLEMENTED**: No `manage-tasks.sh` script for task metadata management.
- No `tasks_meta.json` tracking, git worktree provisioning, or async dispatching.

## Prioritized Improvement Roadmap (Based on principles.md Order)

### HIGH PRIORITY - Foundational Principles (II, IV, V, VI, VII, VIII)

#### Constitution Assembly Process (Factor II: Context Scaffolding)
- Implement automated constitution assembly from team-ai-directives imports
- Add project-specific principle overlay system for constitution customization
- Create constitution validation against imported foundational directives
- Develop constitution evolution tracking with amendment history
- Integrate context engineering patterns (Write, Select, Compress, Isolate) to optimize AI agent context windows and prevent hallucinations, poisoning, distraction, confusion, and clash
- Incorporate actionable tips for AI-assisted coding: include error logs, design docs, database schemas, and PR feedback in context management
- Use modern tools like Cursor and Cline for automatic context optimization in the SDLC workflow

#### Triage Skill Development Framework (Factor IV: Structured Planning)
- Add explicit triage guidance and decision frameworks in plan templates
- Implement triage training modules and decision trees for [SYNC] vs [ASYNC] selection
- Create triage audit trails and rationale documentation
- Develop triage effectiveness metrics and improvement tracking

#### Async Execution & Quality Gates (Factor V: Dual Execution Loops)
- Introduce `tasks_meta.json` to pair with `tasks.md` and track execution metadata, reviewer checkpoints, worktree aliases, and PR links
- Implement dual async execution modes:
  - **Local Mode**: `/implement` provisions per-task git worktrees (opt-in) for isolated development environments
  - **Remote Mode**: Add `specify init` arguments to integrate with async coding agents (Jules, Async Copilot, Async Codex, etc.) via MCP endpoints
- `/implement` dispatches `[ASYNC]` tasks via MCP endpoints or IDE callbacks while logging job IDs
- Add lightweight registries to surface async job status, architect reviews, and implementer checkpoints in CLI dashboards
- Enforce micro-review on `[SYNC]` tasks and macro-review sign-off before marking `[ASYNC]` tasks as complete
- Add optional helpers for branch/PR generation and cleanup after merges to streamline human review loops
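
One possible shape for a `tasks_meta.json` entry, with a review gate on completion, is sketched below. The roadmap lists this file as not yet implemented, so every field name here (`task_id`, `mode`, `worktree_alias`, `pr_link`, `job_id`, `reviewer_checkpoints`, `review_signed_off`) is a hypothetical choice derived from the bullets above, not an existing schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TaskMeta:
    """Hypothetical per-task record paired with an entry in tasks.md."""

    task_id: str
    mode: str  # "SYNC" or "ASYNC"
    worktree_alias: Optional[str] = None
    pr_link: Optional[str] = None
    job_id: Optional[str] = None  # set when an async job is dispatched
    reviewer_checkpoints: List[str] = field(default_factory=list)
    review_signed_off: bool = False


def can_mark_complete(task):
    """Allow completion only after the review for this task is signed off.

    For SYNC tasks this stands for the micro-review, for ASYNC tasks
    the macro-review.
    """
    return task.review_signed_off


def dump_tasks_meta(tasks):
    """Serialize the records to a JSON document with a top-level "tasks" list."""
    return json.dumps({"tasks": [asdict(task) for task in tasks]}, indent=2)
```

Keeping the gate as a separate function means the same check can be reused by whichever command marks tasks complete.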

#### Enhanced Dual Execution Loop Guidance (Factor V: Dual Execution Loops)
- Update `/tasks` template to provide explicit criteria for marking tasks as [SYNC] vs [ASYNC]:
  - [SYNC] for: complex logic, architectural decisions, security-critical code, ambiguous requirements
  - [ASYNC] for: well-defined CRUD operations, repetitive tasks, clear specifications, independent components
- Add decision framework in plan.md template for triage guidance
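
The [SYNC] vs [ASYNC] criteria above could be encoded as a small decision rule. This is a sketch under assumptions: the tag vocabulary is invented for illustration, and the choices to let any [SYNC] signal win and to default to [SYNC] when no signal matches are not stated in the roadmap.

```python
# Hypothetical tag names derived from the criteria listed above.
SYNC_SIGNALS = frozenset(
    {"complex-logic", "architectural-decision", "security-critical", "ambiguous-requirements"}
)
ASYNC_SIGNALS = frozenset(
    {"crud", "repetitive", "clear-spec", "independent-component"}
)


def triage(tags):
    """Return "[SYNC]" or "[ASYNC]" for a task described by a set of tags."""
    tags = set(tags)
    # Any SYNC signal takes precedence, even if ASYNC signals are present.
    if tags & SYNC_SIGNALS:
        return "[SYNC]"
    if tags & ASYNC_SIGNALS:
        return "[ASYNC]"
    # Assumption: with no recognized signal, keep a human in the loop.
    return "[SYNC]"
```

For example, a task tagged both `crud` and `security-critical` is classified as `[SYNC]` under this rule.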

#### Micro-Review Enforcement for SYNC Tasks (Factor VI: The Great Filter)
- Enhance `/implement` to require explicit micro-review confirmation for each [SYNC] task before marking complete
- Add micro-review checklist template with criteria: correctness, architecture alignment, security, code quality
- Integrate micro-review status into tasks_meta.json tracking

#### Differentiated Quality Gates (Factor VII: Adaptive Quality Gates)
- Implement separate quality gate templates for [SYNC] vs [ASYNC] workflows:
  - [SYNC]: Focus on architecture review, security assessment, code quality metrics
  - [ASYNC]: Focus on automated testing, integration validation, performance benchmarks
- Update checklist templates to reflect workflow-appropriate quality criteria

#### Enhanced Risk-Based Testing Framework (Factor VIII: AI-Augmented Testing)
- Expand risk extraction to include severity levels (Critical/High/Medium/Low)
- Add test case templates specifically designed for each risk type
- Implement risk-to-test mapping with automated test generation suggestions
- Add risk mitigation tracking in tasks_meta.json

#### Workflow Stage Orchestration (Addresses workflow.md 4-stage process)
- Implement explicit 4-stage workflow management and validation (Stage 0-4 from workflow.md)
- Add stage transition controls and prerequisite checking
- Create workflow progress visualization and milestone tracking
- Develop stage-specific guidance and best practice enforcement
- Implement workflow rollback and recovery mechanisms

### MEDIUM PRIORITY - Integration & Governance (IX, X, XI, XII)

#### Issue Tracker Enhancement (Factor IX: Traceability)
- Add `--issue-tracker` argument to `specify init` command to inject MCP configuration for popular issue trackers (Jira, Linear, GitHub Issues, GitLab Issues) with guided setup for API tokens, endpoints, and project identifiers

#### Traceability Enhancements (Factor IX: Traceability)
- Implement automated trace linking between:
  - Issue tracker tickets ↔ spec.md ↔ plan.md ↔ tasks.md ↔ commits/PRs
  - AI interactions ↔ code changes ↔ review feedback
- Add trace validation in quality gates to ensure complete audit trails
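
A trace-validation gate of the kind proposed above might simply report which links in the ticket-to-commit chain are missing. The record layout and key names (`ticket`, `spec`, `plan`, `tasks`, `commit`) are hypothetical; the roadmap names the artifacts but not a data format.

```python
# Hypothetical link names, following the chain
# ticket <-> spec.md <-> plan.md <-> tasks.md <-> commits/PRs.
REQUIRED_LINKS = ("ticket", "spec", "plan", "tasks", "commit")


def missing_trace_links(trace):
    """Return the required link names that are absent or empty in a trace record."""
    return [name for name in REQUIRED_LINKS if not trace.get(name)]
```

A quality gate could then fail whenever the returned list is non-empty and print the missing names as the reason.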

#### Strategic Tooling Improvements (Factor X: Strategic Tooling)
- Add tool performance monitoring and recommendation system
- Implement cost tracking and optimization suggestions for AI usage
- Enhance gateway health checks with failover and load balancing
- Add tool selection guidance based on task complexity and type

#### Structured Evaluation and Learning Framework (Factor XII: Team Capability)
- Enhance `/levelup` with standardized evaluation manifests including:
  - Success metrics (completion time, defect rates, user satisfaction)
  - Process effectiveness scores
  - AI tool performance ratings
  - Lesson learned categorization
- Implement quantitative evaluation framework for comparing prompt/tool effectiveness
- Add automated evaluation report generation for team retrospectives

#### IDE Integration and Cockpit Features (Addresses platform.md IDE cockpit)
- Enhance IDE integration with native command palette support
- Create visual workflow stage indicators and progress tracking
- Implement IDE-specific context injection and prompt optimization
- Add real-time collaboration features for pair programming
- Develop IDE plugin ecosystem for extended functionality

### LOW PRIORITY - Advanced Infrastructure (Addresses platform.md, repository.md advanced features)

#### MCP Server and Orchestration Hub (Addresses platform.md orchestration hub)
- Implement full MCP (Model Context Protocol) server infrastructure
- Create orchestration hub for coordinating multiple AI agents and tools
- Add agent capability negotiation and dynamic task routing
- Develop centralized orchestration dashboard for workflow monitoring
- Implement MCP-based tool chaining and context sharing

#### MCP Server Integration (Addresses platform.md MCP server)
- Implement MCP (Model Context Protocol) server for autonomous agent orchestration
- Add MCP endpoint management for async task delegation
- Create MCP-based agent discovery and capability negotiation
- Develop MCP server health monitoring and failover systems

#### Autonomous Agents Framework (Addresses platform.md autonomous agents)
- Build autonomous agent registration and discovery system
- Create agent capability profiles and specialization tracking
- Implement agent workload balancing and failover mechanisms
- Add agent performance monitoring and optimization
- Develop agent collaboration protocols for complex task decomposition

#### Comprehensive Evaluation Suite (Evals) (Factor XII: Team Capability)
- Implement versioned evaluation manifests with standardized metrics
- Add prompt effectiveness scoring and A/B testing frameworks
- Create tool performance benchmarking and comparison systems
- Develop evaluation result aggregation and trend analysis

#### Enhanced Traceability Framework (Factor IX: Traceability)
- Implement structured trace capture for all AI interactions and decisions
- Add automated trace linking between business requirements and implementation artifacts
- Create trace validation in quality gates to ensure complete audit trails
- Develop trace visualization and analysis tools for process improvement

#### Repository Governance Automation (Addresses repository.md governance)
- Automate PR creation and review workflows for team-ai-directives
- Implement governance rule validation and compliance checking
- Create automated version management for directive libraries
- Add contribution workflow optimization and review assignment
- Develop governance metrics and compliance reporting

#### AI Development Guild Infrastructure (Addresses repository.md guild)
- Build guild membership management and contribution tracking
- Create guild forum integration within the development workflow
- Implement guild-driven decision making and consensus processes
- Add guild knowledge sharing and best practice dissemination
- Develop guild performance metrics and improvement initiatives

## Notes
- **Documentation Coverage**: All 12 manifesto factors are comprehensively documented across the MD files
- **Implementation Status**: ~40-45% of basic features implemented (4-5/9 actually working), major gaps remain in advanced workflow orchestration
- **Verification**: Completed items verified against actual spec-kit codebase; most "completed" items were not implemented
- **Priority Alignment**: Focus on implementing core workflow orchestration features (async execution, quality gates, stage management)
- **Cross-References**: All improvement suggestions are mapped to specific manifesto factors and documentation sections
- IDE/tooling checks and workspace scaffolding remain handled by `specify_cli init`.
- Gateway and issue-tracker integrations stay optional: they activate only when configuration is provided, preserving flexibility for teams without central infrastructure.

scripts/bash/check-prerequisites.sh

Lines changed: 18 additions & 0 deletions
@@ -96,6 +96,21 @@ path = Path(sys.argv[1])
 pattern = re.compile(r"^-\s*RISK:\s*(.+)$", re.IGNORECASE)
 risks = []
 
+def normalize_severity(value):
+    """Normalize severity/impact to standard levels."""
+    if not value:
+        return "Medium"
+    value = value.lower().strip()
+    if value in ["critical", "crit", "high", "hi"]:
+        return "Critical" if value.startswith("crit") else "High"
+    elif value in ["medium", "med"]:
+        return "Medium"
+    elif value in ["low", "lo"]:
+        return "Low"
+    else:
+        # Try to map numeric or other values
+        return "Medium"
+
 for line in path.read_text().splitlines():
     match = pattern.match(line.strip())
     if not match:
@@ -123,6 +138,9 @@ for line in path.read_text().splitlines():
     if data:
         if "id" not in data:
             data["id"] = f"missing-id-{len(risks)+1}"
+        # Normalize severity from impact or severity field
+        severity = data.get("severity") or data.get("impact")
+        data["severity"] = normalize_severity(severity)
         risks.append(data)
 
 print(json.dumps(risks, ensure_ascii=False))
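
To see how the added `normalize_severity` helper and the existing `RISK:` pattern behave, the standalone sketch below copies both from the hunks above and runs them on a few sample inputs. The sample strings are made up for illustration. The parsing step between the regex match and the `data` dict is outside the diff context and is not reproduced here.

```python
import re

# Pattern copied from the context lines of the first hunk.
pattern = re.compile(r"^-\s*RISK:\s*(.+)$", re.IGNORECASE)


def normalize_severity(value):
    """Normalize severity/impact to standard levels (copied from the diff)."""
    if not value:
        return "Medium"
    value = value.lower().strip()
    if value in ["critical", "crit", "high", "hi"]:
        return "Critical" if value.startswith("crit") else "High"
    elif value in ["medium", "med"]:
        return "Medium"
    elif value in ["low", "lo"]:
        return "Low"
    else:
        # Unrecognized values, including numeric strings, fall back to Medium.
        return "Medium"


# Hypothetical inputs showing the aliases and the fallback behavior.
samples = ["Critical", " crit ", "HIGH", "hi", "med", "lo", None, "", "3", "severe"]
normalized = [normalize_severity(sample) for sample in samples]

# The pattern matches list items that start with "RISK:", case-insensitively.
match = pattern.match("- risk: example payload")
payload = match.group(1) if match else None
```

Note that, as written in the commit, numeric values such as `"3"` are not mapped to a level and come back as `"Medium"`, the same as a missing value.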
