ci(design): extract design documents#1212
Conversation
|
Important Review skippedThis PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback. ⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds a Design Extraction skill and a GitHub Actions workflow that locate a PR design plan, aggregate PR body/comments/reviews, invoke Claude to generate ADR markdown files under Changes
Sequence DiagramsequenceDiagram
actor Developer as Developer
participant GitHub as GitHub
participant Workflow as "GitHub Actions"
participant API as "GitHub API"
participant Git as "Git Operations"
participant Claude as "Claude CLI"
Developer->>GitHub: Open PR / Approve review
GitHub->>Workflow: Trigger workflow
Workflow->>API: Resolve PR details (number, SHAs, refs, labels, body)
API-->>Workflow: PR metadata
Workflow->>Workflow: Guards (fork, existing label, delimiter collisions)
Workflow->>Git: Checkout PR head and trusted base (sparse)
Workflow->>Workflow: Extract design plan from PR body
Workflow->>API: Fetch PR comments & reviews (paginated)
API-->>Workflow: Aggregated comments blob
Workflow->>Claude: Send extraction prompt (plan + comments)
Claude-->>Workflow: Generated ADR files in design/docs/
Workflow->>Workflow: Validate changed paths (only design/docs/)
alt changes valid
Workflow->>Git: Commit & push ADRs to PR head
Workflow->>API: Apply `design-extracted` label
API-->>GitHub: Label updated
else invalid changes
Workflow->>Workflow: Fail workflow
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. 📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/design-extraction/SKILL.md:
- Around line 109-123: The fenced code block showing the CodeRabbit comment (the
triple-backtick block that begins with "⚠️ Potential issue | 🟡 Minor") should
include a language specifier to improve rendering consistency; update that
opening fence from ``` to ```text (or ```markdown) so the block reads as a
labeled code block and keep the content unchanged inside the block.
In @.github/workflows/design-extraction.yml:
- Around line 199-209: The validation step "Validate output" currently excludes
.claude/skills/ from the INVALID detection (see the INVALID assignment and the
grep -v '^\.claude/skills/'); remove that exclusion so any changes outside
design/docs/ — including .claude/skills/ — are treated as invalid, or explicitly
add a guard that fails if git diff or git ls-files shows changes under
.claude/skills/ (i.e., update the INVALID command or add a separate check
referencing the same git commands and the .claude/skills/ path) so only files in
design/docs/ are permitted.
- Around line 142-156: The current curl queries that populate COMMENTS,
REVIEW_COMMENTS, and REVIEWS use per_page=100 and will drop results when there
are more than 100 items; update these three fetches to handle pagination (either
replace the curl+jq calls with gh api --paginate for each endpoint or implement
Link-header pagination loop with curl) so all issue comments, PR review
comments, and reviews are aggregated; ensure the final variable assembly and jq
processing remain the same (still producing the combined markdown strings) after
switching to paginated fetching.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f7b26458-fe99-4daa-a99c-de854982f5ac
📒 Files selected for processing (2)
.claude/skills/design-extraction/SKILL.md.github/workflows/design-extraction.yml
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/design-extraction.yml:
- Around line 87-94: Remove the nested "Checkout trusted skill from base branch"
checkout (the step that uses actions/checkout@v4 with path: trusted-base and
sparse-checkout: .claude/skills/design-extraction) that runs before the output
validation; this nested checkout causes git ls-files --others --exclude-standard
to see trusted-base/ as untracked and fail validation. Move or remove that
checkout so validation runs against only the main worktree (e.g., perform the
trusted-base checkout after validation or use a checkout strategy that does not
place files inside the current worktree), and ensure any references to
trusted-base in subsequent steps are updated accordingly.
- Around line 234-243: The label step is conditioned on
steps.design-plan.outputs.content but can run even when no files were committed;
update the "Commit and push" step (or its equivalent) to detect whether a commit
actually occurred (reuse the existing git diff --staged --quiet check) and set a
step output like design_changed=true only when git commit/push runs, then change
the label step's if condition to depend on that output (e.g.,
steps.commit-and-push.outputs.design_changed == 'true') so the label is only
applied when ADR files were actually written; apply the same pattern to the
other similar commit/label block as well.
- Around line 118-120: The workflow's PR_BODY env is only set from
steps.pr.outputs.body so comment-only design plans posted as issue/review
comments (steps.design-plan.outputs.content) are ignored; update the env
assignment for PR_BODY to prefer/merge steps.design-plan.outputs.content when it
is non-empty (fall back to steps.pr.outputs.body /
github.event.pull_request.body otherwise) and also adjust the gating condition
that checks steps.design-plan.outputs.content so it accepts the combined PR_BODY
or checks both sources (references: PR_BODY, steps.pr.outputs.body,
steps.design-plan.outputs.content and the gate around the step at the later
block currently checking steps.design-plan.outputs.content).
- Around line 106-114: The "Install trusted skill" step currently blindly
removes and copies to .claude/skills/design-extraction and can follow symlinks;
before running rm -rf or cp -r, reject any symlinked paths by verifying .claude
and .claude/skills are not symlinks (use a test like -L) and/or resolve their
absolute paths (readlink -f) and confirm they reside inside the current checkout
(GITHUB_WORKSPACE); if either check fails, abort this step with a warning
instead of performing rm or cp to avoid deleting or writing outside the repo.
- Around line 142-168: paginate_curl builds an invalid JSON array by streaming
each item with jq -c '.[]' into a temp file that only inserts commas between
pages, not items; fix it by stopping the per-item emit and instead append each
page as a JSON array (preserve the raw "$body" for each page) and then combine
pages with jq -s 'add' (or equivalent) to flatten arrays correctly; update calls
inside paginate_curl (replace the jq -c '.[]' write and the final jq -s
'flatten' step) so the temp file contains valid JSON arrays per page and they
are merged safely at the end.
- Around line 121-126: The embedded Python passed to `python3 -c` (used to set
PLAN) is indented and will raise an IndentationError; replace this with a
heredoc invocation so the script is provided as a raw multi-line block (read
from stdin) instead of a quoted string. Update the block that sets PLAN (the
command using `python3 -c` and referencing os.environ['PR_BODY'] and the regex
search for '<!-- design plan -->...<!-- end of design plan -->') to use a
heredoc to feed the same Python code (no leading-space issues) and preserve the
existing logic that prints m.group(1).strip() if m else body.strip().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5dc140be-827b-47c8-91ab-5cde926ce04e
📒 Files selected for processing (2)
.claude/skills/design-extraction/SKILL.md.github/workflows/design-extraction.yml
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/design-extraction/SKILL.md:
- Around line 42-45: The pagination doc is incomplete: update the fetch flow for
the three sources
(`https://api.github.com/repos/{owner}/{repo}/issues/<pr-number>/comments?per_page=100`,
`https://api.github.com/repos/{owner}/{repo}/pulls/<pr-number>/comments?per_page=100`,
and
`https://api.github.com/repos/{owner}/{repo}/pulls/<pr-number>/reviews?per_page=100`)
to explicitly show capturing the HTTP response headers, parsing the Link header
for rel="next", and looping until no next link; specify example pseudocode
steps: perform request, read the response `Link` header, extract the URL for
rel="next" (if present), append results, then request the next URL repeatedly
until `rel="next"` is absent, and mention handling rate limits / empty Link
gracefully.
- Around line 175-180: Update the SKILL.md policy to explicitly require a
validation/enforcement step that counts lines before performing create/append
actions: for "create" ensure the new ADR content is <=500 lines and reject or
split into multiple cross-referenced files if it exceeds the limit; for "append"
compute current file length + new section length and either split the appended
content into new files with cross-references or reject the append if it would
exceed 500 lines. Add wording describing the enforcement behavior and expected
error/response when the limit is exceeded so implementations of the "create" and
"append" actions know to validate line counts prior to writing.
- Around line 95-104: Add a redaction guard to the PR comment processing flow
described under "Analyse PR Comments" so that any content written to ADRs
(design/docs/) is sanitized: before persisting, scan the PR comment body for
secrets/tokens/credentials/PII and either redact or remove matching values, and
ensure the existing "CodeRabbit Comment Filtering" logic still strips scaffolded
sections and only keeps the issue description; implement this sanitization step
immediately prior to the write-to-disk routine that creates files in
design/docs/ so no raw sensitive values are committed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0893b7e9-4c9f-4558-b7e5-03310018fce6
📒 Files selected for processing (1)
.claude/skills/design-extraction/SKILL.md
There was a problem hiding this comment.
♻️ Duplicate comments (2)
.github/workflows/design-extraction.yml (2)
126-131:⚠️ Potential issue | 🔴 CriticalThe embedded Python code will fail with
IndentationErrordue to leading whitespace.The string passed to
python3 -cpreserves YAML indentation, causing Python to see leading spaces beforeimport os, re. This will fail at runtime with "IndentationError: unexpected indent".🐛 Proposed fix using heredoc
- PLAN=$(python3 -c " - import os, re - body = os.environ['PR_BODY'] - m = re.search(r'<!--\s*design plan\s*-->(.*?)<!--\s*end of design plan\s*-->', body, re.DOTALL) - print(m.group(1).strip() if m else body.strip()) - ") + PLAN=$(python3 <<'PY' +import os, re +body = os.environ['PR_BODY'] +m = re.search(r'<!--\s*design plan\s*-->(.*?)<!--\s*end of design plan\s*-->', body, re.DOTALL) +print(m.group(1).strip() if m else body.strip()) +PY + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/design-extraction.yml around lines 126 - 131, The embedded Python snippet assigned to PLAN is failing due to preserved YAML indentation causing an IndentationError; replace the inline python3 -c string with a heredoc so Python sees no leading spaces: run python3 with a here-document (e.g., python3 <<'PY' ... PY) or use a YAML block scalar that left-aligns the code, keep the same logic (import os, re; read os.environ['PR_BODY']; regex search) and print the matched group or body, and capture its stdout into PLAN to preserve behavior; update the invocation that sets PLAN to use this heredoc-based python invocation.
121-139: 🧹 Nitpick | 🔵 TrivialDesign plans in PR comments are not extracted.
The design plan extraction only searches
PR_BODYfor<!-- design plan -->markers. If a design plan is posted in a PR comment rather than the body, it will be included inPR_COMMENTSbut not recognized as the primary design source. This may be intentional, but consider documenting this behavior or extending the search to comments.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/design-extraction.yml around lines 121 - 139, The current extraction step only reads PR_BODY (steps.pr.outputs.body) into PLAN and runs the regex to find <!-- design plan --> blocks, so design plans posted in comments (e.g., PR_COMMENTS) are ignored; update the job to also read and search PR_COMMENTS (or concatenate steps.pr.outputs.comments with steps.pr.outputs.body) before running the regex, then apply the same regex logic to the combined text (the PLAN variable and regex/search block) and preserve the DELIM collision check and GITHUB_OUTPUT writes; alternatively document in the workflow that only the PR body is scanned if you choose not to change behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.github/workflows/design-extraction.yml:
- Around line 126-131: The embedded Python snippet assigned to PLAN is failing
due to preserved YAML indentation causing an IndentationError; replace the
inline python3 -c string with a heredoc so Python sees no leading spaces: run
python3 with a here-document (e.g., python3 <<'PY' ... PY) or use a YAML block
scalar that left-aligns the code, keep the same logic (import os, re; read
os.environ['PR_BODY']; regex search) and print the matched group or body, and
capture its stdout into PLAN to preserve behavior; update the invocation that
sets PLAN to use this heredoc-based python invocation.
- Around line 121-139: The current extraction step only reads PR_BODY
(steps.pr.outputs.body) into PLAN and runs the regex to find <!-- design plan
--> blocks, so design plans posted in comments (e.g., PR_COMMENTS) are ignored;
update the job to also read and search PR_COMMENTS (or concatenate
steps.pr.outputs.comments with steps.pr.outputs.body) before running the regex,
then apply the same regex logic to the combined text (the PLAN variable and
regex/search block) and preserve the DELIM collision check and GITHUB_OUTPUT
writes; alternatively document in the workflow that only the PR body is scanned
if you choose not to change behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6abcd407-ab6d-47c6-9d67-0f4fd1be7470
📒 Files selected for processing (1)
.github/workflows/design-extraction.yml
Describe the changes
A clear and concise description of what the bug fix, feature, or improvement is.
Related Issue(s)
Please link to the issue(s) that will be closed with this PR.
Checklist
Additional Context
Add any other context about the pull request here.
Summary by CodeRabbit
New Features
Documentation