Description
Summary
Restructure the AI-DLC workflows from a rules-only delivery model to a multi-mechanism architecture using rules, skills, subagents, and hooks. The core change: convert the 532-line always-loaded core-workflow.md into a lean AGENTS.md (~125 lines of permanent constraints) plus 16 on-demand skills (13 stage skills + 1 orchestrator skill that replaces core-workflow.md + 1 workflow-changes skill + 1 security-check fallback skill). For typical workflows, total main-context consumption drops 33–45% depending on the number of stages executed, with always-loaded context reduced from ~12,000 tokens to ~3,000 tokens (75% reduction at session start). The restructure also enables independent stage invocation, deterministic quality enforcement through hooks, context-isolated security review via subagents (with skill-based fallback on platforms without subagent support), and explicit mid-workflow change handling (going back, skipping, cascade re-generation).
Motivation
Current State: Measured Context Consumption
Today, AI-DLC is delivered entirely as rules (markdown files placed in platform-specific directories). The following measurements are from the current aidlc-rules/ directory:
Always loaded at workflow start (~12,000 tokens):
| File | Lines | Words | Est. Tokens |
|---|---|---|---|
| `core-workflow.md` | 532 | 3,271 | ~4,250 |
| `process-overview.md` | 140 | 733 | ~950 |
| `question-format-guide.md` | 332 | 1,343 | ~1,750 |
| `session-continuity.md` | 46 | 336 | ~440 |
| `content-validation.md` | 78 | 376 | ~490 |
| `welcome-message.md` | 109 | 602 | ~780 |
| `security-baseline.md` | 323 | 2,601 | ~3,380 |
| **Total always loaded** | 1,560 | 9,262 | ~12,040 |
Loaded on demand per stage (~18,000 tokens across 13 files):
| Group | Lines | Words | Est. Tokens |
|---|---|---|---|
| Inception stages (7 files) | 1,722 | 8,824 | ~11,470 |
| Construction stages (6 files) | 973 | 5,050 | ~6,570 |
| Total on demand | 2,695 | 13,874 | ~18,040 |
Other common files loaded as needed (~7,400 tokens across 6 files):
| File | Lines | Words | Est. Tokens |
|---|---|---|---|
| `error-handling.md` | 373 | 1,833 | ~2,380 |
| `workflow-changes.md` | 285 | 1,561 | ~2,030 |
| `terminology.md` | 189 | 925 | ~1,200 |
| `ascii-diagram-standards.md` | 116 | 367 | ~480 |
| `overconfidence-prevention.md` | 99 | 591 | ~770 |
| `depth-levels.md` | 73 | 382 | ~500 |
| **Total other common** | 1,135 | 5,659 | ~7,360 |
Grand total across all 27 files: 5,409 lines, 28,880 words, ~37,500 tokens.
The ~12,000 tokens always loaded at workflow start represent a meaningful cost. On a 200k-token context window, this is 6%. But on models with smaller context windows, or in brownfield projects where the AI also needs to hold thousands of lines of existing code in context, this overhead compounds. More critically, the bulk of those 12,000 tokens are workflow orchestration instructions (core-workflow.md at 4,250 tokens) and question formatting rules (question-format-guide.md at 1,750 tokens)—content that is only relevant during specific stages, not permanently.
Problems with the rules-only approach
- **Upfront context cost for stage-specific content**: 12,000 tokens are loaded before any work begins. Of those, only ~2,000 tokens (directory structure, terminology, content validation basics) are genuinely needed at all times. The remaining ~10,000 tokens are orchestration logic, question formatting, session resumption, and security rules—content needed only at specific stages.
- **No independent stage invocation**: A user cannot say "just reverse-engineer my codebase" or "just generate requirements" without triggering the entire AI-DLC workflow. Every entry point goes through the full `core-workflow.md` orchestration.
- **No automated enforcement**: Content validation (Mermaid syntax, ASCII diagrams), artifact existence checks, and build verification are instructions to the AI rather than deterministic automated checks. The AI may skip or forget them.
- **No composability**: The workflow stages cannot be mixed with other tools, triggered independently, or reused as building blocks. Everything is monolithic.
- **Context accumulation across stages**: When the AI loads `inception/reverse-engineering.md` (311 lines) during the reverse engineering stage, those instructions persist in the conversation context through subsequent stages. Over a full workflow run, on-demand files accumulate. There is no platform-supported mechanism to remove content from context once loaded (see "Skill Context Lifecycle" section below).
- **Platform fragmentation**: The core workflow contains platform-detection logic (check 3 paths for rule details). Each platform requires different file placement, and some require special wrapping (Cursor needs YAML frontmatter).
Detailed Proposal
Skill Context Lifecycle: What Actually Happens
Before describing the architecture, it's important to be precise about how skills behave across platforms. The Agent Skills specification and platform implementations follow a deferred loading model, not a load/unload lifecycle:
- **At session start**: Only skill `name` and `description` fields are loaded into the system prompt (~50-100 tokens per skill). This is how the agent knows what skills are available.
- **On invocation**: When a task matches a skill (or the user explicitly triggers it), the full `SKILL.md` body is loaded into the conversation context.
- **After completion**: The skill content remains in the conversation context. No platform (Claude Code, Kiro, Cursor) supports explicit skill unloading. Content persists until the context is compacted or the session ends.
This means the primary context benefit is deferred loading, not unloading. In the current rules-only model, 12,000 tokens load upfront. With skills, only ~980 tokens of AGENTS.md + ~1,600 tokens of 16 skill descriptions + ~400 tokens of subagent descriptions load upfront (~3,000 tokens total at session start, before any stage is invoked). Each stage's content loads only when that stage is reached.
For stages requiring true context isolation, Claude Code supports `context: fork` in skill frontmatter, which runs the skill in a subagent with its own context window. The skill's content never enters the main conversation context. This RFC uses this mechanism for context-heavy stages (reverse engineering, security review).
How skill invocation works mechanically
The orchestration map in AGENTS.md says things like "Invoke aidlc-reverse-engineering." This relies on a specific interaction between rules and skills:
- **AGENTS.md is always in context** (it's a rule). The orchestration map is always visible to the agent.
- **Skill descriptions are always indexed at session start** (~50-100 tokens each). The agent sees the names and descriptions of all available skills.
- **When the agent follows the orchestration map** and reaches "Invoke `aidlc-reverse-engineering`", it recognizes this as a registered skill name and triggers it. The platform then loads the full `SKILL.md` body into context.
- **On re-invocation** (e.g., the same skill for a second unit of work), the skill content is already in context from the first invocation—it does not load a second copy. The agent simply re-follows the procedure.
This is an agent-directed loading model: the AGENTS.md instructs the agent when to trigger each skill; the platform handles how (loading the body into context). This differs from purely platform-controlled loading (where the platform heuristically decides which skill matches). The orchestration map provides the deterministic sequencing; skills provide the procedural content.
For users invoking individual stages directly (e.g., /aidlc-reverse-engineering), the platform's standard slash-command mechanism triggers the skill without needing the orchestration map.
Platform support for the deferred loading model:
| Platform | Skill descriptions at startup | Full content on invocation | `context: fork` isolation |
|---|---|---|---|
| Claude Code | Yes (2% context budget) | Yes | Yes |
| Kiro | Yes | Yes | No |
| Cursor | Yes | Yes | No |
| 30+ others via agentskills.io | Yes | Yes | Varies |
New Directory Structure
```
aidlc/
├── AGENTS.md                        # Rules: always-on project constraints
├── skills/                          # Skills: stage procedures (deferred loading)
│   ├── aidlc/                       # Orchestrator: workflow entry point
│   │   └── SKILL.md
│   ├── aidlc-workspace-detection/
│   │   └── SKILL.md
│   ├── aidlc-reverse-engineering/
│   │   └── SKILL.md
│   ├── aidlc-requirements-analysis/
│   │   └── SKILL.md
│   ├── aidlc-user-stories/
│   │   └── SKILL.md
│   ├── aidlc-workflow-planning/
│   │   └── SKILL.md
│   ├── aidlc-application-design/
│   │   └── SKILL.md
│   ├── aidlc-units-generation/
│   │   └── SKILL.md
│   ├── aidlc-functional-design/
│   │   └── SKILL.md
│   ├── aidlc-nfr-requirements/
│   │   └── SKILL.md
│   ├── aidlc-nfr-design/
│   │   └── SKILL.md
│   ├── aidlc-infrastructure-design/
│   │   └── SKILL.md
│   ├── aidlc-code-generation/
│   │   └── SKILL.md
│   ├── aidlc-build-and-test/
│   │   └── SKILL.md
│   ├── aidlc-workflow-changes/      # Mid-workflow changes: go back, skip, cascade
│   │   └── SKILL.md
│   └── aidlc-security-check/        # Security fallback for non-subagent platforms
│       └── SKILL.md
├── agents/                          # Subagents: isolated context work
│   ├── aidlc-security-reviewer.md   # Cross-cutting security compliance
│   ├── aidlc-reverse-engineer.md    # Large codebase analysis
│   └── aidlc-code-reviewer.md       # Post-generation code review
└── scripts/                         # Supporting scripts for hooks + build
    ├── validate-mermaid.sh          # Mermaid syntax validation
    ├── validate-audit-format.sh     # Audit log format checker
    └── build-rules.sh               # Backward-compat rules generation
```
Mechanism 1: Rules (AGENTS.md) — Always-on constraints
The AGENTS.md replaces core-workflow.md + mandatory common files with a lean file containing only what the AI needs at all times. Content is organized into two categories: project constraints (things the AI must always obey) and skill orchestration (how the AI navigates between skills).
Content inventory:
| Section | Source | Est. Lines | Purpose |
|---|---|---|---|
| Adaptive workflow principle | `core-workflow.md` lines 1-9 | 10 | Core philosophy |
| Directory structure | `core-workflow.md` lines 502-532 | 20 | aidlc-docs/ layout |
| Terminology glossary | `common/terminology.md` | 20 | Key terms (subset) |
| Overconfidence prevention | `common/overconfidence-prevention.md` | 15 | "When in doubt, ask" guidelines |
| Approval gate protocol | `core-workflow.md` lines 442-454 | 15 | Standardized 2-option messages, audit requirements |
| Content validation basics | `common/content-validation.md` | 15 | Mermaid/ASCII rules (brief; hooks enforce) |
| Workflow change trigger | `common/workflow-changes.md` | 3 | One-line trigger: "When the user requests a workflow change (go back, skip, restart, add/remove stage or unit, change depth), invoke aidlc-workflow-changes." Full procedures live in the aidlc-workflow-changes skill. |
| Error recovery basics | `common/error-handling.md` | 10 | Severity levels, general recovery principle (detail in skills) |
| Shared procedures | `common/question-format-guide.md` | 15 | Condensed question format (see "Common Content" below) |
| Session state detection | — | 3 | Always-on rule: "If aidlc-docs/aidlc-state.md exists, inform user and suggest /aidlc to resume or a specific stage skill." Prevents silent state loss on session resume. |
| **Total** | | **~125** | |
The orchestration map (workflow sequencing logic, welcome message, session resume) is not in AGENTS.md. It lives in the aidlc orchestrator skill, loaded only when the user starts a full AI-DLC workflow. AGENTS.md contains two triggers: "When the user says 'Using AI-DLC' or invokes /aidlc, invoke the aidlc skill" and "When the user requests a workflow change (go back, skip, restart, add/remove stage or unit, change depth), invoke aidlc-workflow-changes." AGENTS.md also contains a session state detection rule: "If aidlc-docs/aidlc-state.md exists in the workspace, inform the user and suggest /aidlc to resume or a specific stage skill."
Estimated always-on context: ~125 lines, ~750 words, ~980 tokens of AGENTS.md + ~1,600 tokens of 16 skill descriptions (13 stages + 1 orchestrator + 1 workflow-changes + 1 security-check) + ~400 tokens of 3 subagent descriptions = ~3,000 tokens at session start (down from ~12,000). When the user starts a full workflow, the aidlc orchestrator skill loads an additional ~2,130 tokens.
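The session-start figures above reduce to a quick calculation. A sketch (the token counts are the estimates quoted in this section, not measurements):

```python
# Estimated always-on context at session start, per the figures above.
agents_md = 980              # lean AGENTS.md body
skill_descriptions = 1_600   # 16 skill name/description entries
subagent_descriptions = 400  # 3 subagent descriptions

startup = agents_md + skill_descriptions + subagent_descriptions
print(startup)  # 2980, rounded to ~3,000 in the text

baseline = 12_040  # current always-loaded rules total
print(f"{1 - startup / baseline:.0%}")  # 75% reduction at session start
```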
Orchestration Design: The aidlc Orchestrator Skill
The orchestration logic lives in a dedicated aidlc skill (skills/aidlc/SKILL.md), not in AGENTS.md. This skill replaces core-workflow.md as the workflow entry point. It is loaded only when the user starts a full AI-DLC workflow (by saying "Using AI-DLC, ..." or typing /aidlc), and contains four things:
- Welcome message (~780 tokens, from current `welcome-message.md`)
- Session resume detection (~440 tokens, from current `session-continuity.md`—checks for existing `aidlc-state.md` and offers to resume)
- Skill orchestration map with enforcement (~710 tokens, the sequential workflow logic with health checks, post-invocation artifact verification, and cascade dependency annotations)
- Cross-cutting concerns (~200 tokens, security enforcement with subagent/skill fallback, mid-workflow change delegation)
Total orchestrator skill size: ~2,130 tokens. This content is not in the always-on AGENTS.md. For standalone stage invocations (e.g., /aidlc-reverse-engineering), the orchestrator is never loaded.
The orchestration map within the skill:
## Skill Orchestration
### Startup Health Check
Before starting the workflow, verify all required skills are registered.
Check that the following skills are accessible: aidlc-workspace-detection,
aidlc-reverse-engineering, aidlc-requirements-analysis, aidlc-user-stories,
aidlc-workflow-planning, aidlc-application-design, aidlc-units-generation,
aidlc-functional-design, aidlc-nfr-requirements, aidlc-nfr-design,
aidlc-infrastructure-design, aidlc-code-generation, aidlc-build-and-test,
aidlc-workflow-changes, aidlc-security-check.
If any skill is not recognized, halt and display:
"AIDLC ERROR: Skill '{name}' not found. Verify aidlc/skills/{name}/SKILL.md
exists and has valid YAML frontmatter."
### Workflow Execution Rules
At each step, invoke the named skill and wait for its
completion and user approval before proceeding. **After each
skill completes, verify its expected output artifacts exist
before advancing to the next step.** If expected artifacts are
missing after skill completion, re-invoke the skill or halt
with: "AIDLC ERROR: Stage '{name}' completed but expected
artifacts are missing: {list}. Re-run this stage or investigate."
### Stage Dependency Map (Cascade Rules)
When a user requests going back to a previous stage, invoke
`aidlc-workflow-changes` to handle the full cascade procedure
(impact assessment, artifact archival, state reset, re-execution
sequencing). The following dependency map defines which downstream
stages are invalidated when a stage is re-executed:
- workspace-detection → reverse-engineering, requirements-analysis
- reverse-engineering → requirements-analysis
- requirements-analysis → user-stories, workflow-planning, application-design
- user-stories → workflow-planning
- workflow-planning → all remaining stages
- application-design → units-generation, all construction stages
- units-generation → all construction stages (per unit)
- functional-design → code-generation (per unit)
- nfr-requirements → nfr-design (per unit)
- nfr-design → infrastructure-design, code-generation (per unit)
- infrastructure-design → code-generation (per unit)
- code-generation → build-and-test
### Inception Phase (WHAT and WHY)
1. **Always**: Invoke `aidlc-workspace-detection`
- Verify: aidlc-state.md updated with workspace type
- If result is "brownfield" and no reverse engineering artifacts exist → step 2
- Otherwise → step 3
2. **Conditional**: Invoke `aidlc-reverse-engineering`
- For large codebases (>500 files), prefer delegating to the
`aidlc-reverse-engineer` subagent for context isolation
- Verify: aidlc-docs/inception/reverse-engineering/ artifacts exist
3. **Always**: Invoke `aidlc-requirements-analysis`
- Verify: aidlc-docs/inception/requirements/ artifacts exist
4. **Conditional**: Invoke `aidlc-user-stories`
- Execute if: new user-facing features, multiple user types, complex business
requirements, cross-functional collaboration
- Skip if: pure refactoring, simple bug fixes, infrastructure-only,
documentation-only
- Verify (if executed): aidlc-docs/inception/user-stories/ artifacts exist
5. **Always**: Invoke `aidlc-workflow-planning`
- This skill determines which remaining stages to execute or skip
- User can override inclusions/exclusions
- Verify: aidlc-docs/inception/workflow-planning/ artifacts exist
6. **Conditional**: Invoke `aidlc-application-design`
- Execute if: new components, service layer design needed
- Verify (if executed): aidlc-docs/inception/application-design/ artifacts exist
7. **Conditional**: Invoke `aidlc-units-generation`
- Execute if: system needs decomposition into multiple units
- Verify (if executed): aidlc-docs/inception/application-design/unit-of-work.md exists
### Construction Phase (HOW)
For each unit of work (from aidlc-docs/inception/application-design/unit-of-work.md),
execute stages 8-12 in sequence. Track the current unit in aidlc-state.md under
`## Current Unit`. Each skill reads the current unit name from state to determine
which unit's artifacts to produce (output path: aidlc-docs/construction/{unit-name}/).
On the first unit, skill invocations load their full content into context.
On subsequent units, the skill content is already in context — the agent
re-follows the same procedures for the next unit without reloading.
8. **Conditional**: Invoke `aidlc-functional-design`
- Verify (if executed): aidlc-docs/construction/{unit-name}/functional-design/ exists
9. **Conditional**: Invoke `aidlc-nfr-requirements`
- Verify (if executed): aidlc-docs/construction/{unit-name}/nfr-requirements/ exists
10. **Conditional**: Invoke `aidlc-nfr-design` (only if step 9 executed)
- Verify (if executed): aidlc-docs/construction/{unit-name}/nfr-design/ exists
11. **Conditional**: Invoke `aidlc-infrastructure-design`
- Verify (if executed): aidlc-docs/construction/{unit-name}/infrastructure-design/ exists
12. **Always**: Invoke `aidlc-code-generation`
- Verify: generated code files exist at expected paths
After completing all stages for a unit, update aidlc-state.md to mark the unit
complete and set the next unit as current **in a single write operation** (atomic
state update to prevent inconsistency on interruption). Repeat until all units
are done.
After all units complete:
13. **Always**: Invoke `aidlc-build-and-test`
- Verify: build succeeds, test results recorded in aidlc-state.md
### Mid-Workflow Changes
When the user requests going back to a previous stage, skipping a stage,
adding/removing stages or units, or changing depth mid-workflow: invoke
`aidlc-workflow-changes`. This skill handles cascade impact assessment,
artifact archival, state reset, and re-execution sequencing. **Do not
attempt to handle workflow changes inline** — always delegate to the skill,
which contains the full procedural guidance from the current
`workflow-changes.md` (285 lines of tested change-handling logic).
### Cross-cutting: Security
When the security extension is enabled (tracked in aidlc-state.md),
enforce security compliance at each stage completion:
1. **Try subagent first**: Delegate compliance checking to the
`aidlc-security-reviewer` subagent. The subagent runs in its own
context, evaluates artifacts independently, and returns a compliance
report.
2. **Fallback to skill**: If subagent invocation fails (platform does not
support subagents, or subagent times out / returns malformed output),
invoke the `aidlc-security-check` skill instead. This skill contains
the same 15 SECURITY rules and evaluation procedure, but runs in the
main context (~3,600 tokens) rather than in isolation.
3. **Fail-closed policy**: If neither the subagent nor the fallback skill
produces a valid compliance report, treat the result as NON-COMPLIANT.
Display: "Security review could not be completed. Resolve the issue
before proceeding." **Never silently skip security review.**
Non-compliance (from either mechanism) is a blocking finding. Block stage
transition until all non-compliant findings are resolved.

The orchestration map portion is ~105 lines and ~700 words (~910 tokens). This is larger than the original ~55-line/~450-token map because it now includes enforcement and delegation mechanisms that were absent from the initial proposal:
- **Startup health check** (~8 lines): Verifies all 15 skills are registered before the workflow begins, preventing silent skill-loading failures.
- **Post-invocation artifact verification** (~1 line per step): After each skill completes, the orchestrator verifies expected output artifacts exist before proceeding. This replaces the "MANDATORY" enforcement blocks in the current `core-workflow.md` with a more targeted mechanism.
- **Stage dependency map** (~15 lines): Explicit cascade annotations so the agent knows which downstream stages are invalidated when a stage is re-executed. This enables the `aidlc-workflow-changes` skill to compute cascade impacts.
- **Mid-workflow change delegation** (~6 lines): Explicit instruction to invoke `aidlc-workflow-changes` for any user-requested change, rather than attempting inline handling.
- **Security enforcement with fallback** (~15 lines): Try-subagent-first, fallback-to-skill, fail-closed policy. Ensures security enforcement is preserved on all platforms.
Combined with the welcome message and session resume logic, the full orchestrator skill is ~2,130 tokens. It captures the same adaptive workflow logic as the current 532-line core-workflow.md by:
- Keeping the conditional execution criteria as brief inline rules (2-3 lines each)
- Delegating all procedural detail to the individual skills
- Replacing "MANDATORY" enforcement blocks with targeted post-invocation verification and a fail-closed security policy
- Delegating mid-workflow changes to the `aidlc-workflow-changes` skill rather than embedding change-handling logic inline
The current core-workflow.md is long because it repeats a ~17-20 line execution block for every stage (load steps, log in audit, execute, wait for approval, log response), with some inception stages embedding larger conditional assessment criteria (up to ~70 lines for User Stories). In the skill model, each skill owns its own execution procedure, so the orchestrator focuses on sequencing, verification, and delegation.
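As an illustration of how the `aidlc-workflow-changes` skill could compute cascade impacts from the stage dependency map, here is a small sketch. The edge list is a simplified transcription of the map above (the map's "all remaining stages" shorthand is not expanded), and the function name is illustrative, not part of the proposal:

```python
from collections import deque

# Direct downstream dependencies, transcribed (simplified) from the
# cascade map in the orchestrator skill above.
DEPENDS = {
    "workspace-detection": ["reverse-engineering", "requirements-analysis"],
    "reverse-engineering": ["requirements-analysis"],
    "requirements-analysis": ["user-stories", "workflow-planning",
                              "application-design"],
    "user-stories": ["workflow-planning"],
    "application-design": ["units-generation"],
    "units-generation": ["functional-design", "nfr-requirements",
                         "infrastructure-design"],
    "functional-design": ["code-generation"],
    "nfr-requirements": ["nfr-design"],
    "nfr-design": ["infrastructure-design", "code-generation"],
    "infrastructure-design": ["code-generation"],
    "code-generation": ["build-and-test"],
}

def invalidated_by(stage: str) -> set[str]:
    """Return every downstream stage invalidated when `stage` is re-executed
    (transitive closure over the direct-dependency edges)."""
    seen: set[str] = set()
    queue = deque(DEPENDS.get(stage, []))
    while queue:
        s = queue.popleft()
        if s not in seen:
            seen.add(s)
            queue.extend(DEPENDS.get(s, []))
    return seen
```

For example, `invalidated_by("nfr-design")` yields `infrastructure-design`, `code-generation`, and `build-and-test`.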
Mechanism 2: Skills — Stage procedures with deferred loading
Each AI-DLC stage becomes an independent skill following the Agent Skills specification.
Skill YAML frontmatter structure:
```yaml
---
name: aidlc-reverse-engineering
description: >
  Comprehensive analysis of an existing codebase. Generates business overview,
  architecture docs, code structure, API documentation, component inventory,
  technology stack, and dependencies documentation. Use when working on a
  brownfield project that hasn't been analyzed yet.
user-invocable: true
---
```

Setting `user-invocable: true` allows users to invoke any stage directly as a slash command (`/aidlc-reverse-engineering`), solving the "no independent stage invocation" problem without needing a separate commands mechanism.
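The orchestrator's startup health check requires each SKILL.md to carry valid frontmatter. A minimal sketch of such a check, without a YAML dependency (the function name and exact rules are illustrative assumptions, not the specification's validation algorithm):

```python
def check_frontmatter(skill_md: str) -> tuple[bool, str]:
    """Verify a SKILL.md opens with a '---' ... '---' frontmatter block
    that declares at least `name` and `description`."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return False, "missing opening '---'"
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return False, "missing closing '---'"
    # Top-level keys are unindented "key: value" lines.
    keys = {line.split(":", 1)[0].strip()
            for line in lines[1:end]
            if ":" in line and not line.startswith((" ", "\t"))}
    missing = {"name", "description"} - keys
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    return True, ""
```

A failed check would surface the "AIDLC ERROR: Skill '{name}' not found" message from the health check rather than silently degrading the workflow.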
Skill body structure (SKILL.md):

```markdown
# [Stage Name]

## Prerequisites
- What artifacts must exist before this skill runs
- What state conditions are required

## Shared Procedures
- Reference to AGENTS.md question format (brief: "Follow the question
  format defined in AGENTS.md § Question Format")
- Reference to AGENTS.md approval protocol

## Procedure
[Full step-by-step instructions — migrated from the current detail file]

## Outputs
- List of artifacts produced with paths under aidlc-docs/
- State updates to aidlc-state.md

## Completion
[Standardized approval gate message]
```
Skills to create (16 total — 1 orchestrator + 13 stages + 1 workflow-changes + 1 security-check fallback):
Token estimates below include the original stage detail file content plus absorbed common content (error handling sections, depth level guidance, ASCII diagram standards, approval blocks). The "Absorbed from" column shows which Group 3 common files contribute content to each skill.
| Skill | Source File | Base Tokens | Absorbed from | Est. Total Tokens | `context: fork` |
|---|---|---|---|---|---|
| `aidlc` (orchestrator) | core-workflow.md + welcome-message.md + session-continuity.md | ~2,130 | — | ~2,130 | No |
| `aidlc-workspace-detection` | inception/workspace-detection.md | ~520 | error-handling (~150) | ~700 | No |
| `aidlc-reverse-engineering` | inception/reverse-engineering.md | ~1,300 | error-handling (~150), ascii-diagrams (~250) | ~1,730 | Optional |
| `aidlc-requirements-analysis` | inception/requirements-analysis.md | ~1,310 | error-handling (~150), depth-levels (~500) | ~1,990 | No |
| `aidlc-user-stories` | inception/user-stories.md | ~2,910 | error-handling (~150) | ~3,090 | No |
| `aidlc-workflow-planning` | inception/workflow-planning.md | ~2,520 | error-handling (~150), depth-levels (~300), ascii-diagrams (~250) | ~3,250 | No |
| `aidlc-application-design` | inception/application-design.md | ~1,240 | error-handling (~150), ascii-diagrams (~250) | ~1,670 | No |
| `aidlc-units-generation` | inception/units-generation.md | ~1,670 | error-handling (~150) | ~1,850 | No |
| `aidlc-functional-design` | construction/functional-design.md | ~1,010 | error-handling (~150) | ~1,190 | No |
| `aidlc-nfr-requirements` | construction/nfr-requirements.md | ~870 | error-handling (~150) | ~1,050 | No |
| `aidlc-nfr-design` | construction/nfr-design.md | ~680 | error-handling (~150) | ~860 | No |
| `aidlc-infrastructure-design` | construction/infrastructure-design.md | ~740 | error-handling (~150), ascii-diagrams (~250) | ~1,170 | No |
| `aidlc-code-generation` | construction/code-generation.md | ~1,820 | error-handling (~200) | ~2,050 | No |
| `aidlc-build-and-test` | construction/build-and-test.md | ~1,460 | error-handling (~200) | ~1,690 | No |
| `aidlc-workflow-changes` | common/workflow-changes.md | ~2,030 | — | ~2,030 | No |
| `aidlc-security-check` | extensions/security/baseline/security-baseline.md | ~3,380 | evaluation procedure (~220) | ~3,600 | No |
| **Total** | | | | **~30,050** | |
Note on the security extension: The security baseline (323 lines, ~3,380 tokens) is a cross-cutting concern evaluated at every stage boundary. On platforms with subagent support, it moves to the aidlc-security-reviewer subagent (see Mechanism 3), which has its own context window and loads the security rules internally — keeping the main context clean. On platforms without subagent support (Cline), the aidlc-security-check fallback skill provides the same 15-rule evaluation procedure in the main context (~3,600 tokens). The orchestrator's cross-cutting security section defines the try-subagent-first, fallback-to-skill, fail-closed policy that ensures security enforcement is never silently dropped regardless of platform capabilities.
Common Content Distribution
Currently, shared content is delivered as 11 common rule files plus 1 extension file. The skill model must account for every one of them. The following table traces each current file to its destination:
Group 1 files (currently always loaded):
| Current file | Tokens | Destination | Rationale |
|---|---|---|---|
| `core-workflow.md` | ~4,250 | `aidlc` orchestrator skill (orchestration map, ~910 tokens, deferred loading) + skills own their procedures | Orchestration logic condensed; stage detail delegated |
| `process-overview.md` | ~950 | Dropped (redundant with orchestration map) | The map IS the process overview |
| `question-format-guide.md` | ~1,750 | AGENTS.md (~200 token summary) + embedded in skills that ask questions (~150 tokens each in requirements-analysis, user-stories, workflow-planning) | Multiple-choice format and [Answer]: tag condensed |
| `session-continuity.md` | ~440 | Embedded in `aidlc` orchestrator skill (resume detection at workflow start) | Orchestrator checks for aidlc-state.md and offers to resume |
| `content-validation.md` | ~490 | AGENTS.md (~200 token brief) + hooks enforce deterministically | Rules state the principle; hooks enforce it |
| `welcome-message.md` | ~780 | Embedded in `aidlc` orchestrator skill (displayed at workflow start) | Loaded only when full workflow is invoked, not always-on |
| `security-baseline.md` | ~3,380 | `aidlc-security-reviewer` subagent (embedded in full) + `aidlc-security-check` fallback skill (for non-subagent platforms) | Cross-cutting concern isolated in subagent where supported; fallback skill ensures enforcement on all platforms |
Group 3 files (currently loaded as needed):
| Current file | Tokens | Destination | Rationale |
|---|---|---|---|
| `error-handling.md` | ~2,380 | AGENTS.md (~130 token summary: severity levels, general recovery principle) + stage-specific recovery procedures embedded in each skill (~150-200 tokens per skill) | General principle is permanent; per-stage recovery is stage-specific |
| `workflow-changes.md` | ~2,030 | AGENTS.md (~40 token trigger) + `aidlc-workflow-changes` skill (full procedures, deferred loading) | Cross-cutting but infrequent; one-line trigger in AGENTS.md invokes the skill only when the user requests a change |
| `depth-levels.md` | ~500 | Embedded in `aidlc-requirements-analysis` (~500 tokens) and `aidlc-workflow-planning` (~300 tokens) | Only these two skills use adaptive depth selection |
| `terminology.md` | ~1,200 | AGENTS.md (~260 token subset: Phase, Stage, Unit of Work, Service, Component, Planning vs Generation) | Core terms only; full glossary available in skills |
| `overconfidence-prevention.md` | ~770 | AGENTS.md (~200 tokens condensed) | Always-relevant behavioral constraint |
| `ascii-diagram-standards.md` | ~480 | Embedded in skills that produce diagrams: reverse-engineering, application-design, workflow-planning, infrastructure-design (~250 tokens each) + hooks validate | Only 4 skills produce ASCII diagrams |
Three-layer distribution model:

1. **Truly permanent content → AGENTS.md**: Terminology, directory structure, approval protocol, condensed question format, workflow change trigger, error recovery basics, overconfidence prevention, session state detection, and content validation principles. These are always needed regardless of which stage is active. (~125 lines, ~980 tokens)
2. **Stage-specific content → embedded in each skill**: Error recovery procedures specific to each phase, detailed question formatting examples, depth-level guidance (requirements + planning only), ASCII diagram standards (4 diagram-producing skills only), and the approval gate message block. Each skill is self-contained and can be invoked independently. Duplication is intentional and quantified in the skills table above.
3. **Enforcement → hooks**: Mermaid syntax and ASCII diagram validation, audit format checking. The AI doesn't need to "remember" to validate—the hook runs automatically.
Duplication budget: Among the 13 stage skills, the absorbed common content adds ~2,100 tokens of error-handling (vs. a single 2,380-token file — net savings of ~280 tokens since each skill embeds only its relevant recovery procedures), ~1,000 tokens of ASCII standards across 4 skills (vs. a single 480-token file — +520 overhead), and ~800 tokens of depth-levels across 2 skills (vs. a single 500-token file — +300 overhead). Net duplication overhead is ~540 tokens above the original file sizes. The aidlc-workflow-changes and aidlc-security-check skills do not duplicate content — they absorb their source files wholesale (~2,030 and ~3,600 tokens respectively) and are only loaded when triggered. This is a modest cost for skill independence.
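The duplication budget reduces to a short calculation over the figures quoted above:

```python
# Net token overhead of embedding common content per skill, per the
# duplication-budget figures above.
error_handling  = 2_100 - 2_380      # embedded total vs. single file: -280
ascii_standards = 4 * 250 - 480      # 4 diagram-producing skills:     +520
depth_levels    = (500 + 300) - 500  # requirements + planning:        +300

print(error_handling + ascii_standards + depth_levels)  # 540
```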
Mechanism 3: Subagents — Context-isolated specialized work
Subagents handle tasks that require their own context window. Each subagent is defined as a markdown file with YAML frontmatter in the agents/ directory.
| Subagent | Purpose | Context isolation needed because | When invoked |
|---|---|---|---|
| `aidlc-security-reviewer` | Evaluates all 15 SECURITY rules against stage artifacts. Returns a compliance report with compliant/non-compliant/N/A per rule. | The security baseline is 3,380 tokens of cross-cutting rules. Loading it into the main context at every stage boundary would accumulate. Running it in a subagent keeps the main context clean. | At each stage completion when security extension is enabled |
| `aidlc-reverse-engineer` | Analyzes large codebases. Produces 8 reverse engineering artifacts (business overview, architecture, code structure, APIs, components, tech stack, dependencies, code quality). | Large codebase analysis can consume 50k+ tokens of code context. Isolating this prevents the main conversation from being overwhelmed. | When `aidlc-reverse-engineering` skill detects a large codebase (>500 files) |
| `aidlc-code-reviewer` | Reviews generated code against requirements documents, design docs, and security rules. | Independent review benefits from a fresh context that isn't anchored to the generation process. | After code generation, before build-and-test |
Example subagent definition (`agents/aidlc-security-reviewer.md`):

```markdown
---
name: aidlc-security-reviewer
description: >
  Evaluates AI-DLC stage artifacts against the 15 SECURITY baseline rules
  (OWASP Top 10 2025 mapped). Returns a structured compliance report.
  Invoke at each stage completion when security extension is enabled.
model: haiku
tools:
  - Read
  - Glob
  - Grep
---

# Security Baseline Reviewer

## Your Role

You are a security compliance reviewer. You evaluate stage artifacts against
the security baseline rules below and return a structured compliance report.

## Input

You will receive:
- The stage name and its output artifacts (file paths)
- The current extension configuration from aidlc-state.md

## Security Rules

[Full content of current security-baseline.md embedded here — 323 lines]

## Output Format

Return a markdown compliance report:
- For each applicable SECURITY rule: Compliant / Non-Compliant / N/A
- For Non-Compliant: specific finding and remediation guidance
- Summary: blocking findings count, N/A count, compliant count
```

Using a lightweight model (e.g., Haiku) for the security reviewer keeps cost low while the main agent continues on a more capable model.
Mechanism 4: Hooks — Deterministic quality enforcement
Hooks replace AI instructions with automated checks. Hook configuration is platform-specific — there is no universal hooks file. The build-rules.sh script generates the appropriate configuration for each platform:
- Claude Code: Hooks go in `.claude/settings.json` under the `hooks` key
- Cursor: Hooks go in `.cursor/hooks.json`
- Kiro: Hooks configured via Kiro's hooks UI or config files

The validation scripts themselves (`scripts/validate-mermaid.sh`, `scripts/validate-audit-format.sh`) are platform-agnostic shell scripts. Only the hook wiring is platform-specific.
Claude Code format (`.claude/settings.json`):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "./aidlc/scripts/validate-mermaid.sh \"$TOOL_INPUT_FILE_PATH\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "echo 'AI-DLC session ended. Check aidlc-docs/aidlc-state.md for last saved state (may not reflect in-progress work).'"
          }
        ]
      }
    ]
  }
}
```

Note on file path filtering: Claude Code's PostToolUse `matcher` field matches tool names (e.g., `Write|Edit`), not file paths. File path filtering (only validating files under `aidlc-docs/`) is handled inside the validation scripts themselves. `validate-mermaid.sh` checks whether `$TOOL_INPUT_FILE_PATH` matches `aidlc-docs/**/*.md` before running validation, and exits 0 (no-op) for non-matching paths. Similarly, `validate-audit-format.sh` only acts when the target file is `aidlc-docs/audit.md`.
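The in-script filter described above can be sketched as a small guard at the top of each validation script. This is a hypothetical sketch, not the shipped script: the function name `in_scope` and the exact patterns are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the path filter at the top of validate-mermaid.sh.
# In a shell `case` pattern, `*` also matches `/`, so one pattern covers any
# depth under aidlc-docs/.
in_scope() {
  case "$1" in
    aidlc-docs/*.md) return 0 ;;  # in scope: proceed to validation
    *)               return 1 ;;  # out of scope: caller exits 0 (no-op)
  esac
}

# The hook script would start with:
#   in_scope "$TOOL_INPUT_FILE_PATH" || exit 0
in_scope "aidlc-docs/design/diagram.md" && echo "validating"
in_scope "src/index.ts" || echo "skipping"
```

The same guard, pointed at `aidlc-docs/audit.md`, would serve `validate-audit-format.sh`.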
What moves from rules/instructions to hooks:
- Mermaid syntax validation → `PostToolUse` hook on `Write|Edit`, script filters to `aidlc-docs/**/*.md`
- Audit format enforcement → `PostToolUse` hook on `Write|Edit`, script filters to `audit.md`
- Session end notification → `Stop` hook
What stays as skill instructions (no suitable hook event exists):
- Artifact existence verification before stage transition — this remains in each skill's "Prerequisites" section because no platform offers an "after skill completion" hook event
- Build and lint execution after code generation — the `aidlc-build-and-test` skill handles this procedurally
Supporting scripts: The `scripts/` directory contains the actual validation logic. `validate-mermaid.sh` checks the file path against `aidlc-docs/**/*.md`, then uses `mmdc --validate` (Mermaid CLI) if available, or falls back to a regex-based syntax checker for common errors (unclosed brackets, invalid diagram type keywords). `validate-audit-format.sh` checks that the last entry in `audit.md` has the required fields (Timestamp in ISO 8601, User Input, AI Response, Context). Both scripts exit 0 on success and exit 1 with an error message on failure, which the platform surfaces to the agent for correction.
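As an illustration, one field check from the audit validator might look like the following sketch. The function name and regex are assumptions; the field names come from the audit format described above.

```shell
#!/bin/sh
# Hypothetical sketch of a single check from validate-audit-format.sh:
# verify the entry carries a "Timestamp:" field in ISO 8601 form.
has_iso_timestamp() {
  printf '%s\n' "$1" |
    grep -Eq '^Timestamp: [0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}'
}

entry="Timestamp: 2025-01-15T10:30:00Z"
if has_iso_timestamp "$entry"; then
  echo "timestamp ok"
else
  echo "missing or malformed Timestamp field" >&2
  exit 1   # the non-zero exit is what the platform surfaces to the agent
fi
```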
Context Budget Analysis: Before and After
Baseline comparison
| Metric | Current (rules-only) | Proposed | Difference |
|---|---|---|---|
| Always-on context at session start | ~12,040 tokens | ~3,000 tokens | -75% |
| After full-workflow invocation (`/aidlc`) | ~12,040 tokens | ~5,130 tokens (+ orchestrator skill) | -57% |
| Security rules in main context | ~3,380 tokens (always) | 0 tokens (subagent) | -100% |
| Total stage skill content (13 stages) | ~18,040 tokens | ~22,290 tokens | +24% (absorbed common content) |
| Orchestrator skill | (in `core-workflow.md`) | ~2,130 tokens | Extracted from AGENTS.md |
The total stage skill content is larger than the current stage files because common content (error handling, depth levels, ASCII standards) is now embedded in skills rather than loaded as shared files. The net duplication overhead across the 13 stage skills is ~540 tokens (see Duplication Budget above). The aidlc-workflow-changes (~2,030 tokens) and aidlc-security-check (~3,600 tokens) skills are demand-loaded and not included in the scenario analyses below unless specifically triggered.
Scenario-based analysis
Scenario 1: Simple bug fix (3 stages: workspace-detection, requirements-analysis, code-generation)
| | Current | Proposed |
|---|---|---|
| Always-on | ~12,040 | ~3,000 |
| Orchestrator | (included above) | ~2,130 |
| Stage content loaded | ~3,650 (3 detail files) | ~4,740 (3 skills with absorbed content) |
| Common files loaded | ~2,380 (error-handling) | 0 (embedded in skills) |
| Security (if disabled) | 0 | 0 |
| Total in main context | ~18,070 | ~9,870 |
Savings: ~45%. This is the primary use case where the restructure delivers the most value. Most AI-DLC invocations are partial workflows. For standalone invocations (e.g., /aidlc-reverse-engineering without the orchestrator), savings are even larger since the orchestrator's ~2,130 tokens are never loaded.
Scenario 2: Greenfield single-unit project (8 stages: no reverse engineering, no NFR, no infrastructure)
Stages: workspace-detection, requirements-analysis, user-stories, workflow-planning, application-design, units-generation, code-generation, build-and-test.
| | Current | Proposed |
|---|---|---|
| Always-on | ~12,040 | ~3,000 |
| Orchestrator | (included above) | ~2,130 |
| Stage content loaded | ~13,450 (8 detail files) | ~16,290 (8 skills with absorbed content) |
| Common files loaded | ~5,390 (error-handling, workflow-changes, depth-levels, ascii-diagrams) | 0 (embedded in skills) |
| Security (if enabled) | ~3,380 (in main context) | 0 (subagent) |
| Total in main context | ~34,260 | ~21,420 |
Savings: ~37%. The security subagent isolation accounts for a significant portion.
Scenario 3: Brownfield 3-unit project with security (full 13 stages, construction loop runs 3 times)
All 13 stages execute. Construction stages 8-12 run for each of 3 units. Skills loaded in the first unit iteration remain in context for subsequent units (no re-loading, no accumulation of duplicates).
| | Current | Proposed |
|---|---|---|
| Always-on | ~12,040 | ~3,000 |
| Orchestrator | (included above) | ~2,130 |
| Inception skills (stages 1-7) | ~11,470 | ~14,280 (with absorbed content) |
| Construction skills (stages 8-13, loaded once) | ~6,570 | ~8,010 (with absorbed content) |
| Common files loaded during workflow | ~7,360 (all 6 common files) | 0 (embedded) |
| Security (always on) | ~3,380 | 0 (subagent) |
| Total in main context | ~40,820 | ~27,420 |
Savings: ~33%. In the worst case (full workflow), savings come primarily from the leaner always-on context (-9,040 vs. always-loaded rules), security subagent isolation (-3,380), and eliminating shared common-file loading (-7,360), partially offset by the orchestrator skill (+2,130) and the common content absorbed into the stage skills (+4,250 over the original stage file sizes). These components net to the -13,400 shown in the table.
Note on multi-unit loops: Skills invoked for unit 1 remain in context for units 2 and 3. The agent re-follows the same procedures without reloading the skill body. This means multi-unit workflows do not multiply skill context cost—the construction skills are loaded once regardless of unit count. This matches the current behavior where detail files, once loaded, also persist.
Note on standalone invocations: When a user invokes a single stage directly (e.g., /aidlc-reverse-engineering), neither the orchestrator skill nor unrelated stage skills are loaded. Context is: AGENTS.md (~980 tokens) + skill descriptions (~1,600 tokens) + subagent descriptions (~400 tokens) + the invoked skill body only. For reverse engineering: ~980 + 1,600 + 400 + 1,730 = ~4,710 tokens, vs. ~12,040 + 1,300 = ~13,340 in the current model. Savings: ~65%.
Migration Path
Two phases, both committed:
Phase 1: Skills + Rules (core value)
- Create `AGENTS.md` with condensed permanent constraints (~125 lines)
- Create `aidlc` orchestrator skill with welcome message, session resume, and orchestration map
- Convert all 13 stage detail files to skills with SKILL.md format
- Create `aidlc-workflow-changes` skill with full cascade/skip/restart procedures
- Create `aidlc-security-check` fallback skill for non-subagent platforms
- Create `aidlc-security-reviewer` subagent with embedded security rules
- Create `build-rules.sh` backward-compatibility script (see below)
- Run evaluation suite (Tier 1 all prompts, Tier 2 key prompts, cross-platform validation)
- Update README with new installation instructions
- Release as v0.2.0
Phase 2: Hooks + Additional Subagents
- Add per-platform hook configurations with Mermaid validation and audit format hooks
- Create validation scripts (`validate-mermaid.sh`, `validate-audit-format.sh`)
- Create `aidlc-reverse-engineer` and `aidlc-code-reviewer` subagents
- Release as v0.3.0
Evaluation Plan
Small wording tweaks can produce large behavioral swings. This restructure changes the delivery format of every instruction file. Before shipping, we must validate that the skills-based delivery produces equivalent workflow behavior to the current rules-based delivery.
Evaluation methodology
1. Reference prompt suite: Create 6-8 reference prompts that exercise the key workflow paths:
| Prompt | Stages exercised | Key behaviors to verify |
|---|---|---|
| "Using AI-DLC, add a login page to this React app" (brownfield) | 1, 2, 3, 4, 5, 6, 7, 8, 12, 13 | Brownfield detection, reverse engineering triggers, full inception |
| "Using AI-DLC, build a REST API for todo items" (greenfield) | 1, 3, 5, 12, 13 | Greenfield detection, minimal stages for simple request |
| "Using AI-DLC, refactor the database layer" (brownfield, no user-facing) | 1, 2, 3, 5, 12, 13 | User stories correctly skipped |
| "Using AI-DLC, build a multi-service e-commerce platform" (greenfield, complex) | 1, 3, 4, 5, 6, 7, 8-12 (×N units), 13 | Full workflow with multi-unit loop |
| `/aidlc-reverse-engineering` (standalone invocation) | 1, 2 | Independent stage invocation works |
| Resume from interrupted session (aidlc-state.md exists) | Varies | Session continuity preserved |
2. Behavioral equivalence criteria: For each reference prompt, compare the rules-based and skills-based runs on:
- Stage selection: Same stages executed/skipped (pass/fail)
- Artifact production: Same output files created in `aidlc-docs/` (pass/fail)
- Approval gates: Same number of user approval checkpoints (pass/fail)
- Question quality: Clarifying questions cover the same topics (subjective, scored 1-5)
- Audit completeness: Audit log captures all interactions (pass/fail)
- Security enforcement: When enabled, blocking findings are surfaced at the same points (pass/fail)
3. Execution: Run each reference prompt against both the current rules and the proposed skills on the same model (Claude Sonnet, as the most common user model). Compare outputs side by side. A prompt passes if all pass/fail criteria match and subjective scores are ≥3.
Full multi-turn workflow runs are expensive. To manage cost, the eval is split into two tiers:
- Tier 1 — Automated (all 6 prompts): Run each prompt through the first 2-3 stages only and programmatically verify stage selection logic (which stages are invoked/skipped) and artifact creation (correct files exist in `aidlc-docs/`). This validates the orchestration map and skill loading without running full workflows.
- Tier 2 — Full run (3 key prompts): Run prompts 1, 2, and 4 end-to-end and manually evaluate all criteria including question quality. These three cover brownfield/greenfield, simple/complex, and multi-unit paths.
4. Cross-platform validation: The agent-directed skill loading model (AGENTS.md instructs the agent to trigger skills by name) must be validated on at least one non-Claude-Code platform (Kiro or Cursor) to confirm that the orchestration map's sequential "invoke skill X" instructions are followed correctly. If a platform's skill loading is purely heuristic (platform decides, not agent), the orchestration map may need platform-specific adaptation. This must be confirmed before Phase 1 release.
5. Regression detection: After initial validation, the Tier 1 automated checks become a regression suite run in CI. The build-rules.sh backward-compatibility script is validated by running the same prompts against the generated rules output and confirming equivalence with the current rules.
Evaluation gates
- Phase 1 release gate: All 6 prompts pass Tier 1 automated checks. 3 key prompts pass Tier 2 full-run evaluation. Cross-platform validation passes on Claude Code + one skill-supporting platform (Kiro or Cursor).
- Phase 2 release gate: Hook enforcement must not produce false positives on the reference prompts (validation scripts don't block valid content).
Backward Compatibility: Rules Generation Build
For platforms that only support rules (e.g., GitHub Copilot, Amazon Q with rules-only mode), a build script generates the current rules format from the skills source.
scripts/build-rules.sh:
The script performs the following transformations:
- `AGENTS.md` → `core-workflow.md`: Takes the AGENTS.md orchestration map and expands each "invoke skill X" step into the current inline execution block format (load steps, log audit, execute, wait for approval, log response). This is a template expansion: for each skill reference, inject a ~15-line block using the skill's description and output paths.
- Skills → detail files: Strips YAML frontmatter from each SKILL.md and copies the body to the appropriate platform path (`aws-aidlc-rule-details/inception/`, `aws-aidlc-rule-details/construction/`).
- Security subagent → extension file: Extracts the security rules section from `agents/aidlc-security-reviewer.md` and writes it to `aws-aidlc-rule-details/extensions/security/baseline/security-baseline.md`.
- Platform-specific output: Generates the complete directory structure for each supported platform:
  - `.kiro/steering/` + `.kiro/aws-aidlc-rule-details/`
  - `.amazonq/rules/` + `.amazonq/aws-aidlc-rule-details/`
  - `.cursor/rules/` (with YAML frontmatter wrapping)
  - `CLAUDE.md` + `.aidlc-rule-details/`
  - `.github/copilot-instructions.md` + `.aidlc-rule-details/`
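The "Skills → detail files" step reduces to stripping the YAML frontmatter block. A minimal sketch follows; the awk state machine is an assumed implementation, not the actual build script.

```shell
#!/bin/sh
# Hypothetical sketch of the frontmatter strip in build-rules.sh: drop
# everything between the opening "---" on line 1 and the closing "---",
# then print the rest of the SKILL.md body.
strip_frontmatter() {
  awk 'NR == 1 && $0 == "---" { fm = 1; next }
       fm && $0 == "---"      { fm = 0; next }
       !fm                    { print }' "$1"
}

# Demo with a toy SKILL.md.
skill=$(mktemp)
printf -- '---\nname: demo-skill\ndescription: example\n---\n# Demo Body\n' > "$skill"
strip_frontmatter "$skill"   # prints "# Demo Body"
```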
The build script is tested in CI: the release pipeline runs build-rules.sh, and a validation step confirms the generated output matches the expected structure and contains all stage content.
The release pipeline produces two artifacts:
- `ai-dlc-skills-vX.X.X.zip` — the new skills-based structure (for Claude Code, Kiro, Cursor)
- `ai-dlc-rules-vX.X.X.zip` — the generated rules structure (backward compatibility)
Platform Compatibility
| Platform | Primary delivery | Mechanisms supported |
|---|---|---|
| Claude Code | Skills + subagents + hooks | All 4 mechanisms |
| Kiro | Skills + subagents + hooks | All 4 mechanisms |
| Cursor | Skills + subagents + hooks | All 4 mechanisms |
| Amazon Q Developer | Generated rules (build script) | Rules only |
| GitHub Copilot | Generated rules (build script) | Rules only |
| Cline | Skills (via AGENTS.md + skills/) | Skills. No hooks/subagents — uses aidlc-security-check fallback skill. |
For rules-only platforms, the generated output is functionally identical to the current delivery. No degradation occurs. For skills-capable platforms without subagent support (Kiro, Cline), the aidlc-security-check fallback skill ensures security enforcement is preserved, though the security baseline loads into the main context (~3,600 tokens) rather than being isolated in a subagent.
Alternatives Considered
Alternative 1: Keep rules-only, split into smaller files
Split core-workflow.md into smaller rule files with more granular auto/manual loading. This reduces per-file size but doesn't address the core problems: always-loaded rules consume the same total context, stages can't be invoked independently, no automated enforcement, no cross-platform standard for the split.
Alternative 2: Skills only (no subagents or hooks)
Convert stages to skills and keep security as a skill. Simpler, but: (a) security as a skill means it must be explicitly loaded at each stage boundary, breaking the cross-cutting enforcement model; (b) no deterministic validation—Mermaid and audit format checking remain AI instructions.
Drawbacks
- Increased structural complexity: The rules-only model is one directory of markdown files. The proposed model has 4 directories (skills, agents, hooks, scripts) plus AGENTS.md. Contributors need to understand which mechanism to use for new content.
- Skill content accumulation: Skills do not unload from context. A full 13-stage workflow accumulates ~22,290 tokens of stage skill content plus ~2,130 tokens for the orchestrator. Combined with AGENTS.md (~980 tokens), skill descriptions (~1,600 tokens), and subagent descriptions (~400 tokens), this totals ~27,400 tokens in the main context—a 33% reduction from the current ~40,820 worst case, but not the 75% session-start improvement. The benefit is most pronounced for partial workflows (45% savings on a 3-stage bug fix) and standalone invocations (65% savings). The `aidlc-workflow-changes` skill (~2,030 tokens) and `aidlc-security-check` fallback skill (~3,600 tokens) are only loaded when triggered by a workflow change request or on platforms without subagent support, respectively — they do not contribute to the typical workflow accumulation.
- Common content duplication: Absorbing Group 3 common files into individual skills adds ~540 tokens of net duplication overhead (ASCII standards in 4 skills, depth levels in 2 skills, partially offset by error-handling being shorter per skill than the monolithic file). This is a modest cost for skill independence and is quantified in the skills table.
- Build script maintenance: The backward-compatibility build script must be updated whenever skills are added or modified. This is additional CI/release engineering work.
- Platform disparity: Users on Claude Code, Kiro and Cursor get the full experience (skills + subagents + hooks). Users on platforms without subagent support (Cline) get a skill-based security fallback that loads the security baseline into the main context (~3,600 tokens) instead of isolating it in a subagent. Users on rules-only platforms get a generated version that is functionally identical to today but doesn't benefit from subagent isolation or hook enforcement. This creates a three-tier experience, though all tiers preserve security enforcement.
Additional Context
- Agent Skills specification: agentskills.io — adopted by Claude Code, Cursor, Kiro, Gemini CLI, JetBrains Junie, GitHub, VS Code, OpenAI Codex, and 20+ others
- Current AI-DLC version: 0.1.5
- Current delivery: ZIP from GitHub Releases containing the `aidlc-rules/` directory
- The Operations phase (currently a placeholder) would be added as a new skill when implemented, without modifying the orchestrator—demonstrating the extensibility benefit of this architecture