| 1 | +--- |
| 2 | +name: design-extraction |
| 3 | +description: Extract architectural and design decisions from a GitHub PR into ADR (Architecture Decision Record) files in design/docs/. Use this skill whenever the user wants to extract design decisions from a PR, create ADRs from PR discussions, document architectural choices from pull requests, or mentions "design extraction" or "design decisions" in the context of a PR. Also triggers for "extract ADRs", "document decisions from PR", or "pull design docs from PR". |
| 4 | +--- |
| 5 | + |
| 6 | +# Design Extraction |
| 7 | + |
| 8 | +Extract architectural and design decisions from a GitHub PR and write them as ADR files in `design/docs/`. |
| 9 | + |
| 10 | +## Command |
| 11 | + |
| 12 | +```text |
| 13 | +/design-extraction <pr-number> |
| 14 | +``` |
| 15 | + |
| 16 | +## Arguments |
| 17 | + |
| 18 | +- `pr-number` (required): The GitHub PR number to extract decisions from. |
| 19 | + |
| 20 | +## What This Does |
| 21 | + |
| 22 | +PRs often contain important architectural decisions buried in descriptions and comment threads. This skill pulls those decisions out and writes them as structured ADR (Architecture Decision Record) files — short documents that capture *what* was decided, *why*, and *what follows from it*. Future contributors can then understand the reasoning behind the system's design without trawling through old PRs. |
| 23 | + |
| 24 | +## Execution Protocol |
| 25 | + |
| 26 | +### 1. Resolve PR Details |
| 27 | + |
| 28 | +Fetch the PR metadata, body, and all comments using curl and jq (the `gh` CLI is not available on the runner): |
| 29 | + |
| 30 | +```bash |
| 31 | +curl -fsSL \ |
| 32 | + -H "Authorization: Bearer $GH_TOKEN" \ |
| 33 | + -H "Accept: application/vnd.github+json" \ |
| 34 | + "https://api.github.com/repos/{owner}/{repo}/pulls/<pr-number>" |
| 35 | +``` |
| 36 | + |
| 37 | +Extract: |
| 38 | +- PR number, title |
| 39 | +- PR body (the description) |
| 40 | +- Head ref, base ref |
| 41 | + |
| 42 | +Then fetch all three comment sources (paginate by capturing response headers with `curl -D`, extracting the `Link: <url>; rel="next"` URL, and repeating until no `rel="next"` link is present): |
| 43 | +- **Issue comments**: `https://api.github.com/repos/{owner}/{repo}/issues/<pr-number>/comments?per_page=100` |
| 44 | +- **Review comments** (inline on code): `https://api.github.com/repos/{owner}/{repo}/pulls/<pr-number>/comments?per_page=100` |
| 45 | +- **Review bodies**: `https://api.github.com/repos/{owner}/{repo}/pulls/<pr-number>/reviews?per_page=100` |
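
The pagination loop described above can be sketched in bash as follows. This is a minimal sketch, not the skill's actual implementation: `next_link` and `fetch_all_pages` are hypothetical helper names, and the sketch assumes the `Link` header contains no commas inside URLs.

```bash
# Hypothetical helper: print the rel="next" URL from a header dump written by
# `curl -D`, or print nothing when no next page is advertised.
next_link() {
  grep -i '^link:' "$1" | tr ',' '\n' | grep 'rel="next"' \
    | sed -n 's/.*<\([^>]*\)>.*/\1/p'
}

# Hypothetical helper: fetch every page of a list endpoint, printing one JSON
# array per page. Assumes GH_TOKEN is set in the environment.
fetch_all_pages() {
  local url="$1" headers
  headers="$(mktemp)"
  while [ -n "$url" ]; do
    curl -fsSL -D "$headers" \
      -H "Authorization: Bearer $GH_TOKEN" \
      -H "Accept: application/vnd.github+json" \
      "$url" || { rm -f "$headers"; return 1; }
    # An empty result ends the loop: no rel="next" link was present.
    url="$(next_link "$headers")"
  done
  rm -f "$headers"
}
```

Piping the output of `fetch_all_pages` through `jq -s 'add'` would merge the per-page arrays into a single array, should one combined list be easier to process.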
| 46 | + |
| 47 | +Display the PR title and URL for context. |
| 48 | + |
| 49 | +### 2. Extract the Design Plan |
| 50 | + |
| 51 | +Search for a design plan between HTML comment markers in **both** the PR body and all PR comments: |
| 52 | + |
| 53 | +```html |
| 54 | +<!-- design plan --> |
| 55 | +... design content here ... |
| 56 | +<!-- end of design plan --> |
| 57 | +``` |
| 58 | + |
| 59 | +Check for these markers in this order: |
| 60 | +1. **PR body** — check first |
| 61 | +2. **Issue comments** — check all comments chronologically |
| 62 | +3. **Review comments** (inline on code) — check all |
| 63 | +4. **Review bodies** — check all |
| 64 | + |
| 65 | +Extract content from **every** location where markers are found. If markers appear in multiple places, concatenate the extracted sections in the order listed above — later sections may refine or extend earlier ones. |
| 66 | + |
| 67 | +This combined extracted content is the **primary** source of decisions. |
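
One way to pull out the marked sections is a small awk filter. The sketch below assumes each marker sits on its own line; `extract_design_plan` is a hypothetical helper name, not part of the skill.

```bash
# Hypothetical helper: read text on stdin and print only the lines that sit
# between the design plan markers. The marker lines themselves are dropped.
extract_design_plan() {
  awk '
    /<!-- end of design plan -->/ { inside = 0; next }
    /<!-- design plan -->/        { inside = 1; next }
    inside                        { print }
  '
}
```

Running the PR body through the filter first, then each comment source in the listed order, and concatenating the outputs would give the combined source material.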
| 68 | + |
| 69 | +If no design plan markers exist anywhere, use the full PR body as the source material — but apply a higher bar for what counts as a "decision" (skip vague descriptions, feature lists without rationale, etc.). |
| 70 | + |
| 71 | +### 3. Assess Whether There Are Decisions to Extract |
| 72 | + |
| 73 | +If the design plan (or PR body) contains no concrete decisions — it's a placeholder like "TBD", "TODO", "see Slack", "WIP", or just a feature description without architectural rationale — stop and tell the user: |
| 74 | + |
| 75 | +```text |
| 76 | +No actionable design decisions found in PR #<number>. |
| 77 | +``` |
| 78 | + |
| 79 | +A "decision" means a deliberate choice between alternatives with stated rationale — not just a description of what was built. |
| 80 | + |
| 81 | +### 4. Read Existing Design Docs |
| 82 | + |
| 83 | +```text |
| 84 | +Glob: design/docs/*.md |
| 85 | +``` |
| 86 | + |
| 87 | +Read every existing file to understand what's already documented. This is essential for deduplication and for deciding whether to create new files or append to existing ones. |
| 88 | + |
| 89 | +If `design/docs/` doesn't exist, create it: |
| 90 | + |
| 91 | +```bash |
| 92 | +mkdir -p design/docs |
| 93 | +``` |
| 94 | + |
| 95 | +### 5. Analyse PR Comments |
| 96 | + |
| 97 | +PR comments may contain amendments, clarifications, or explicit decisions that refine or supersede the design plan. When processing comments: |
| 98 | + |
| 99 | +- **Redact sensitive data** before writing to ADR files — strip API keys, tokens, secrets, credentials, and PII. Never persist sensitive values to `design/docs/`. |
| 100 | +- **Override the plan** only when a comment contains a clear resolution, correction, or final decision (e.g., "We decided to go with X instead", "After discussion, the approach is Y") |
| 101 | +- **Ignore** questions, speculative remarks, and casual discussion |
| 102 | + |
| 103 | +#### CodeRabbit Comment Filtering |
| 104 | + |
| 105 | +Comments from `@coderabbitai` (CodeRabbit) have a specific structure. Only the **issue description** at the top of the comment is useful input — the rest is machine-generated scaffolding that must be stripped. Specifically: |
| 106 | + |
| 107 | +- **Keep**: The initial description of the issue (everything before the first section marker below) |
| 108 | +- **Strip entirely**: |
| 109 | + - `🧩 Analysis chain` section and all content under it |
| 110 | + - `🤖 Prompt for AI Agents` section and all content under it |
| 111 | + - Any `🏁 Scripts executed` section |
| 112 | + |
| 113 | +For example, given a CodeRabbit comment like: |
| 114 | + |
| 115 | +```text |
| 116 | +⚠️ Potential issue | 🟡 Minor |
| 117 | + |
| 118 | +Improve handling of large integers in JSON-to-TOML conversion. |
| 119 | + |
| 120 | +The fallback to as_f64() can lose precision for large integers... |
| 121 | + |
| 122 | +🧩 Analysis chain |
| 123 | +<... strip everything from here down ...> |
| 124 | + |
| 125 | +🤖 Prompt for AI Agents |
| 126 | +<... strip everything from here down ...> |
| 127 | +``` |
| 128 | + |
| 129 | +Only feed the text **above** the first `🧩`, `🤖`, or `🏁` marker into the decision extraction pipeline. The analysis chain, AI agent prompts, and script logs are CodeRabbit internals, not human design decisions. |
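
The filtering rule can be sketched as a one-line awk filter that stops at the first section marker. This is an illustrative sketch: `strip_coderabbit` is a hypothetical name, and it assumes the marker text appears on the line that opens each section.

```bash
# Hypothetical helper: read one CodeRabbit comment on stdin and print only the
# text that precedes the first machine-generated section.
strip_coderabbit() {
  awk '
    /🧩 Analysis chain|🤖 Prompt for AI Agents|🏁 Scripts executed/ { exit }
    { print }
  '
}
```

If the raw comment wraps a section in markup that begins on an earlier line than the marker, that opening line would survive this filter and would need separate handling.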
| 130 | + |
| 131 | +#### Other Bot Comments |
| 132 | + |
| 133 | +Ignore comments from other bots (GitHub Actions, CI bots, etc.) unless they contain design plan markers (`<!-- design plan -->`). |
| 134 | + |
| 135 | +### 6. Deduplicate |
| 136 | + |
| 137 | +Before writing anything, check each extracted decision against existing design docs. Match by **topic and substance** — if the same decision exists with different wording, skip it. This prevents duplication when the skill is run multiple times or across related PRs. |
| 138 | + |
| 139 | +### 7. Determine Actions |
| 140 | + |
| 141 | +For each decision or coherent set of related decisions: |
| 142 | + |
| 143 | +- **New distinct topic** → `create` a new file |
| 144 | +- **Extends an existing doc** → `append` to that file |
| 145 | +- **Relates to multiple existing docs** → update the most closely related one and add cross-references: `See also: [Related Topic](related-topic.md)` |
| 146 | + |
| 147 | +### 8. Write ADR Files |
| 148 | + |
| 149 | +Each ADR follows this structure: |
| 150 | + |
| 151 | +```markdown |
| 152 | +# <Descriptive Title> |
| 153 | + |
| 154 | +## Status |
| 155 | +Accepted |
| 156 | + |
| 157 | +## Context |
| 158 | +<What problem or requirement prompted this decision> |
| 159 | + |
| 160 | +## Decision |
| 161 | +<What was decided and why> |
| 162 | + |
| 163 | +## Consequences |
| 164 | +- <Positive, negative, and neutral consequences> |
| 165 | + |
| 166 | +## Source |
| 167 | +PR #<number> — <title> |
| 168 | +``` |
| 169 | + |
| 170 | +**Filename rules:** |
| 171 | +- Kebab-case derived from the topic (e.g., `authentication-strategy.md`, `data-pipeline-architecture.md`) |
| 172 | +- No dates or sequence numbers |
| 173 | +- Only lowercase letters, digits, and hyphens; must start with a letter or digit |
| 174 | +- Must match: `^[a-z0-9][a-z0-9-]*\.md$` |
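
The filename pattern can be checked before writing. The sketch below uses `grep -E` with the pattern from the rules above; `is_valid_adr_filename` is a hypothetical helper name, and `LC_ALL=C` is set so that `[a-z]` matches only ASCII lowercase letters.

```bash
# Hypothetical helper: succeed only when the bare filename (no directory part)
# matches the ADR naming pattern.
is_valid_adr_filename() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9-]*\.md$'
}
```

For instance, `is_valid_adr_filename "authentication-strategy.md"` succeeds, while names containing uppercase letters, underscores, or a leading hyphen are rejected.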
| 175 | + |
| 176 | +**Content limits:** |
| 177 | +- 500 lines maximum per file. If a topic needs more, split into multiple files with cross-references. |
| 178 | + |
| 179 | +Before any write or append, verify the resulting file will not exceed 500 lines. If it would, split the content into a new related ADR file with cross-references instead. |
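
A pre-write length check might look like the sketch below. The name `would_fit` and the `MAX_LINES` override are hypothetical. The sketch counts newline characters with `wc -l`, so it assumes both files end with a trailing newline.

```bash
# Hypothetical helper: succeed when writing the section file ($2) to the
# target ($1) keeps the target within the line limit.
would_fit() {
  local target="$1" section="$2" existing=0 added
  if [ -f "$target" ]; then
    # Appending inserts two blank lines before the new section.
    existing=$(( $(wc -l < "$target") + 2 ))
  fi
  added=$(wc -l < "$section")
  [ $(( existing + added )) -le "${MAX_LINES:-500}" ]
}
```

When the check fails, the content would go to a new related ADR file with cross-references, as the rule above requires.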
| 180 | + |
| 181 | +**For `create` actions:** Write the full ADR to a new file in `design/docs/`. |
| 182 | + |
| 183 | +**For `append` actions:** Add two blank lines then the new ADR section to the end of the existing file. |
| 184 | + |
| 185 | +### 9. Present Summary |
| 186 | + |
| 187 | +After writing, display: |
| 188 | + |
| 189 | +```text |
| 190 | +Design Extraction Complete — PR #<number>: <title> |
| 191 | + |
| 192 | + Created: <list of new files> |
| 193 | + Updated: <list of appended files> |
| 194 | + Skipped: <count> (already documented) |
| 195 | + |
| 196 | +Summary: <one-line description of what was extracted> |
| 197 | +``` |
| 198 | + |
| 199 | +## ADR Example |
| 200 | + |
| 201 | +For reference, here's what a well-formed ADR looks like: |
| 202 | + |
| 203 | +```markdown |
| 204 | +# Tmpfs for Test Isolation |
| 205 | + |
| 206 | +## Status |
| 207 | +Accepted |
| 208 | + |
| 209 | +## Context |
| 210 | +Test runners share the host filesystem, causing test pollution between concurrent jobs. |
| 211 | + |
| 212 | +## Decision |
| 213 | +Mount an 8 GB tmpfs at `/workspace/tmp` for each test runner container. Size is configurable via the `TMPFS_SIZE` environment variable. Regular storage at `/workspace` is preserved for cross-step artifacts. |
| 214 | + |
| 215 | +## Consequences |
| 216 | +- Eliminates test pollution between concurrent jobs |
| 217 | +- RAM-backed storage improves I/O performance for test artifacts |
| 218 | +- Reduces available host memory by the configured tmpfs size per runner |
| 219 | + |
| 220 | +## Source |
| 221 | +PR #42 — Add tmpfs for test runners |
| 222 | +``` |
| 223 | + |
| 224 | +## Important Guidelines |
| 225 | + |
| 226 | +1. **Decisions, not descriptions**: Only extract deliberate architectural choices with rationale. "We use Redis" is not a decision. "We chose Redis over Memcached because we need pub/sub for real-time invalidation" is. |
| 227 | +2. **Preserve intent**: Capture the *why* faithfully. Don't paraphrase away the reasoning. |
| 228 | +3. **Be concise**: ADRs should be short and scannable. A few paragraphs per section, not essays. |
| 229 | +4. **No git operations**: Write files only. The user handles staging and committing. |
| 230 | +5. **Idempotent**: Running twice on the same PR should not create duplicates if the docs already exist. |