Add copilot-pr-autopilot skill#1944
Open
yeelam-gordon wants to merge 67 commits into
Open
Conversation
Contributor
🔍 Skill Validator Results
Summary
Full validator output |
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds the new copilot-pr-autopilot skill, including PowerShell automation scripts and reference docs to drive a “request review → wait → triage → fix → reply/resolve → converge” loop using the gh CLI + GitHub GraphQL.
Changes:
- Introduces shared PowerShell helpers (
_lib.ps1) and several workflow scripts (request review, snapshot status, list threads, reply/resolve, cleanup outdated). - Adds detailed operational documentation (workflow, API quirks, triage rubric, reply templates) for consistent agent behavior.
- Registers the new skill in
docs/README.skills.md.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| skills/copilot-pr-autopilot/scripts/_lib.ps1 | Adds shared gh prerequisite checks and wrappers (Invoke-Gh, Invoke-GhGraphQL, Resolve-RepoCoords) used by all scripts. |
| skills/copilot-pr-autopilot/scripts/01-request-review.ps1 | Requests Copilot review via GraphQL and verifies via copilot_work_started event polling. |
| skills/copilot-pr-autopilot/scripts/02-check-review-status.ps1 | Produces a JSON snapshot of PR review state + convergence signals for polling logic. |
| skills/copilot-pr-autopilot/scripts/03-list-open-threads.ps1 | Lists unresolved review threads (all reviewers) for downstream triage. |
| skills/copilot-pr-autopilot/scripts/08-reply-and-resolve.ps1 | Posts a reply to a review thread and optionally resolves it. |
| skills/copilot-pr-autopilot/scripts/10-cleanup-outdated.ps1 | Bulk-resolves outdated Copilot threads as a post-convergence safety net. |
| skills/copilot-pr-autopilot/references/workflow.md | Documents the loop, delegation map, budgets, and operational contracts. |
| skills/copilot-pr-autopilot/references/api-quirks.md | Captures verified GitHub API/gh quirks that drive implementation choices. |
| skills/copilot-pr-autopilot/references/03-triage-criteria.md | Defines fix/decline/escalate rubric and reviewer-type policy. |
| skills/copilot-pr-autopilot/references/06-reply-templates.md | Provides reply templates and anti-patterns for thread responses. |
| skills/copilot-pr-autopilot/SKILL.md | Defines the skill, prerequisites, workflow, and troubleshooting guidance. |
| docs/README.skills.md | Adds the skill entry and links to its references/scripts. |
c7b2a60 to
b73d67f
Compare
3d3642a to
04067ae
Compare
…ard comment
The previous comment claimed `@($data.errors).Count -gt 0` was robust to
`.errors` being an empty hashtable. That's not accurate: `@(@{}).Count`
is 1 in PowerShell (Array subexpression wraps the hashtable as a single
element, regardless of its key count), so the guard would incorrectly
fire on an `errors: {}` response.
In practice this is unreachable because the GraphQL spec
(https://spec.graphql.org/October2021/#sec-Errors) requires `errors`
to be an array when present, and gh CLI surfaces the raw response. The
guard is correct for every real-world response shape — but the comment
overstated its robustness. Now reads accurately: cites the spec
guarantee, explains the explicit Count check is for reader clarity not
PowerShell-empty-array-truthiness defense, and notes the hashtable
edge case explicitly.
Pushback (documented in commit not on threads):
- Three case-sensitivity findings on `-eq`/`-ne` for GitHub logins are
false positives. PowerShell's `-eq` operator is case-INsensitive by
default; `-ceq` is the case-sensitive variant. Empirical:
PS> 'Copilot' -eq 'copilot' # True
PS> 'Copilot' -ceq 'copilot' # False
PS> 'Copilot' -ieq 'copilot' # True
`-ieq` is semantically identical to `-eq`. No code change needed for
GitHub login case handling.
Addresses Copilot Code Review round 50 (4 threads: 1 fixed, 3 pushed
back as PowerShell-operator false positives).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t-GhReady error
The previous prereq-error here-string used `\$env:GH_HOST` (backslash
escape), but PowerShell escapes with backtick (`) — backslash is not
an escape character in PS strings. As a result, the rendered message
included the current expanded value of $env:GH_HOST prefixed by a
literal `\`, instead of the intended literal text `$env:GH_HOST`. (For
most users $env:GH_HOST is empty, so the rendered text was `\` — a
near-invisible failure of the documentation intent.)
Switched the error-message construction to a single-quoted here-string
with `{HOST}` and `{ERR}` placeholders that are .Replace()'d. This
form is parse-stable, copy/pasteable, and renders the literal
`$env:GH_HOST` text correctly regardless of any caller environment.
Smoke-verified with $env:GH_HOST=ghe.nonexistent.example.com — the
output now shows the literal `$env:GH_HOST` token where the doc
intends to reference it.
Addresses Copilot Code Review round 51 (1 thread, real, fixed).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…iteration
In 10-cleanup-outdated.ps1, the allComments-author iteration at line
~371 dereferenced `$n.author` without first null-guarding `$n` itself.
GraphQL list elements are nullable by default unless declared NonNull
in the schema; a null node in `allComments.nodes` would NPE here.
Aligned with the same defensive pattern already used at lines 192 and
266 (Get-AllThreadAuthors collection paths):
$login = if ($n -and $n.author) { $n.author.login } else { $null }
Null node now falls through to `$login = $null`, which the downstream
`if (-not $login) { $unknownAuthorInThread = $true }` branch already
treats as unsafe (fail-safe — refuse to auto-resolve a thread whose
authorship is ambiguous).
Pushback (documented in commit not on threads):
- The duplicate 01-request-review.ps1 findings about re-implementing
GraphQL errors[] handling are architectural and intentional. 01
needs to introspect `errors` and branch BEFORE throwing, to produce
the actionable "external PR author cannot trigger via public API"
message (which depends on `viewerIsAuthor` AND the FORBIDDEN signal).
Invoke-GhGraphQL throws on errors; centralizing 01's branching
logic into _lib.ps1 would either (a) require a parallel
Invoke-GhGraphQL-NoThrow variant + caller-side branching (same
duplication, different shape) or (b) bake 01-specific contextual
messages into _lib.ps1 (worse cohesion). Keeping the per-script
branching for now; if more scripts need pre-throw FORBIDDEN
branching we'll factor (a) at that time.
Addresses Copilot Code Review round 52 (3 threads: 1 fixed, 2 pushed
back as duplicate architectural-refactor request).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…arify ArgumentList comment - _lib.ps1: Added `$CopilotPrAutopilot_CopilotReviewerSlug` (and back-compat alias `$CopilotReviewerSlug`), parallel to the existing regex centralization. Same namespacing + ReadOnly + dot-source-safe pattern. Holds the canonical `copilot-pull-request-reviewer` slug used by GraphQL `requestReviewsByLogin` and REST `requested_reviewers` payloads. - 01-request-review.ps1: The inlined `botLogins:["copilot-pull-request-reviewer"]` is now a GraphQL `$slug:String!` variable passed via gh -f, sourced from the centralized `$CopilotReviewerSlug`. If GitHub ever renames the canonical bot slug, the change is one line in _lib.ps1. - _lib.ps1 header: Reworded the ArgumentList compatibility comment to state "not available on .NET Framework" instead of the misleading "returns $null on .NET Framework" — the property doesn't exist on .NET Framework, it's not a null-returning property. Addresses Copilot Code Review round 53 (2 threads, both real, both fixed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…try/finally; localize EAP around mkdir The Invoke-Gh argument-rewrite loop creates per-call temp directories and writes PR-content bodies BEFORE entering the try/finally that performs cleanup. If an exception was thrown during the rewrite (e.g., `mkdir -m 700` failed, `[IO.File]::WriteAllText` failed, the "POSIX mkdir not found" throw fired), the function would exit with accumulated $tempFiles/$tempDirs left on disk — leaking PR content. Restructured: ONE outer try block now wraps both the rewrite loop AND the native `gh` invocation. The outer finally cleans up $errFile (null-guarded since it may not exist if we threw during rewrite), $tempFiles, and $tempDirs. Any throw in rewrite or in the native call now triggers the same deterministic cleanup of whatever was accumulated. Also localized $ErrorActionPreference around the external `mkdir -m 700` call (was previously bare native invocation). Same defensive pattern already applied to `gh auth status` and `gh @finalArgs`: saves prev, forces 'Continue', restores in finally. Prevents a caller's $EAP='Stop' + PS 7+'s $PSNativeCommandUseErrorActionPreference from turning mkdir stderr into a NativeCommandError that would bypass the structured $LASTEXITCODE handling and skip the actionable "refusing to send PR content via a world-readable temp dir" message. Smoke-tested with 02-check-review-status.ps1 against the live PR — single iteration succeeds and returns the expected JSON. Addresses Copilot Code Review round 55 (5 threads: 4 dupes of the same rewrite-loop-not-in-try finding, all fixed by the restructure; 1 EAP-around-mkdir finding fixed by the EAP localization). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… empty-nodes fail-safe in 10 - Assert-GhReady now writes the normalized $targetHostLookup back to $env:GH_HOST after validation (only when different from the raw value). Every subsequent `gh` call in the process — including ones that DON'T pass --hostname (e.g., `gh api graphql ...`) — now reads the same validated bare-host form. Without this, a user setting `$env:GH_HOST = 'https://ghe.example.com/'` would pass the auth check (the URL form got normalized before --hostname) but every subsequent `gh api` call would still read the raw URL form and either fail unpredictably or hit a different/invalid host than the one Assert-GhReady just validated. Smoke: GH_HOST='https://github.com/' before Assert-GhReady → GH_HOST='github.com' after Assert-GhReady. - 10-cleanup-outdated.ps1 missing-authorship guard now also treats an EMPTY `allComments.nodes` array as malformed (skip with `$skippedMissingAuthorship`). Every valid thread has at least one comment (the one that opened it), so empty nodes after a successful query means malformed response — must not vacuously pass the human-in-thread guard and silently resolve the thread without verifying authorship. Addresses Copilot Code Review round 56 (2 threads, both real, both fixed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…OS BSD compat)
BSD `mkdir` on macOS does not portably accept the `--` end-of-options
marker the way GNU coreutils does — it can error with "illegal option
-- -". Since this code path explicitly targets `$IsMacOS`, the `--`
would risk breaking temp dir creation (and therefore every Invoke-Gh
call that triggers the tempfile rewrite) on macOS.
`--` was originally added defensively to ensure `$tdir` couldn't be
interpreted as an option, but our `$tdir` is constructed as:
Join-Path ([IO.Path]::GetTempPath()) ("copilot-pr-autopilot-" + Guid)
so it always begins with the absolute temp path + a literal prefix +
a GUID. It can never start with `-`, so the `--` separator is
unnecessary even on GNU coreutils. Dropping it for cross-platform
compatibility.
Addresses Copilot Code Review round 57 (1 thread, real, fixed).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… resolveReviewThread response; soften Windows ACL comment
- Format-IsoUtcString now passes [CultureInfo]::InvariantCulture to
`DateTime.ToString(format, IFormatProvider)`. The default
`DateTime.ToString(format)` uses the current culture, which on
Thai/Arabic locales would emit non-Western numerals into the JSON
string (or apply unexpected separators). Locale-stability is the
whole point of this helper.
Smoke: same input emitted identical output under InvariantCulture
and th-TH culture context.
- 10-cleanup-outdated.ps1: The resolveReviewThread mutation selects
`thread { isResolved }` — now actually inspect that field instead
of discarding the response. A structurally-successful response that
didn't flip `isResolved` to true (partial failure, edge case where
another actor already resolved, future schema drift) was being
reported as success. Now treated as a thrown failure with an
actionable observed-value in the message, added to `$failed`.
- _lib.ps1: Softened the "per-user %TEMP% ACL is already owner-only"
comment. Per-user %TEMP% (interactive logon default) IS
owner-restricted, but this is NOT guaranteed in every environment
— service accounts may use a shared TEMP and local admins always
have read access. The comment now documents that assumption
explicitly instead of overclaiming.
Addresses Copilot Code Review round 59 (3 threads, all real, all fixed).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mirror the verification pattern already used by 10-cleanup-outdated.ps1 so a structurally-successful GraphQL response that did not actually flip isResolved is surfaced as a throw instead of a misleading 'Resolved thread' success message. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nsistent first page If the allComments connection reports totalCount greater than the number of nodes returned but pageInfo.hasNextPage is false (API edge case / partial response), force the node(id:) pagination walk anyway. Without this guard a partially-truncated first page could vacuously pass the human-in-thread check and auto-resolve a thread that actually contains human participation we did not see. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI codespell check was failing on two comment-only typos. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… formatting - Invoke-Gh now calls Assert-GhReady (idempotent + memoized) at entry, making the dependency local to the function instead of relying on dot-source ordering. - 08 resolveReviewThread verification now formats the observed value with explicit branches for missing-node / null cases so the error is actionable even on malformed responses. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds
skills/copilot-pr-autopilot/— runs the Copilot Code Review → fix → reply+resolve → re-trigger loop on a PR until HEAD is reviewed with zero open Copilot threads.Repo-agnostic (discovers build/test/lint from
CONTRIBUTING/AGENTS/package.json/Makefile). Bundles 10 PowerShell scripts (one per step) + per-step reference docs.Permissions: the full multi-round autopilot needs repo Triage or Write (GitHub's
requestReviewsByLoginis the only public API for bot reviewers and is gated on that). External PR authors get a documented-SingleIterationmode plus manual re-trigger between iterations (UI 🔄 button orsynchronize-event push). SeeSKILL.md → Prerequisites → Permissionsfor the full matrix.Validation:
npm run skill:validate✅,npm run build✅,bash eng/fix-line-endings.sh✅.