Commit 4f22957

Improve commit message suggestion workflow (#2000)
Summary of changes:
- Defer `git` command execution to the agent, and guide it to execute only relevant commands.
- Suggest a better command to retrieve past upgrade commit messages.
- Enable limited network access, so that GitHub milestone and release candidate versions can be retrieved.
- Improve the instructions for the inclusion of custom release note links.
- Attach the suggested commit message as a workflow artifact, and provide a script to retrieve it. This enables command line-based development iterations.
1 parent 5363559 commit 4f22957
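
Two of the changes summarized above rest on small pieces of JavaScript in the workflow script: detecting an upgrade PR from its title, and turning the Harden-Runner endpoint list into the domain list shown to the agent. The following is a minimal sketch that wraps those two expressions in standalone functions so they can be tried locally. The function names `upgradeLibraryOf` and `allowedDomainsOf`, the sample titles and the version numbers are assumptions made for illustration; only the regular expression and the split/map/filter chain come from the diff.

```javascript
// Sketch only: mirrors two expressions added to the workflow script in this commit.
// The PR titles and version numbers below are made-up sample inputs.

// Returns the library name for titles of the form
// "Upgrade <library> <old> -> <new>", or null for any other title.
const upgradeLibraryOf = (title) => {
  const match = title?.match(/^Upgrade (.+?) \S+ -> \S+/);
  return match ? match[1] : null;
};

// Reduces whitespace-separated `host:port` entries to a list of host names.
const allowedDomainsOf = (endpoints) =>
  endpoints
    .split(/\s+/)
    .map((entry) => entry.split(':')[0])
    .filter(Boolean);

console.log(upgradeLibraryOf('Upgrade Spring Boot 3.5.6 -> 3.5.7'));
console.log(upgradeLibraryOf('Improve commit message suggestion workflow'));

// A YAML folded scalar (`>`) joins its lines with spaces and ends with a newline,
// which is why the trailing empty entry has to be filtered out.
console.log(
  allowedDomainsOf('api.github.com:443 api.openai.com:443 github.com:443 registry.npmjs.org:443\n'),
);
```

Run with Node.js, this prints `Spring Boot`, `null`, and an array of the four host names. The lazy `(.+?)` group lets multi-word library names match while the two `\S+` groups consume the old and new version strings.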

2 files changed: +122 -86 lines changed
.github/workflows/suggest-commit-message.yml

Lines changed: 93 additions & 86 deletions
@@ -17,6 +17,12 @@ permissions:
 concurrency:
   group: suggest-commit-message-${{ github.event.pull_request.number || github.event.inputs.pr_number }}
   cancel-in-progress: true
+env:
+  ALLOWED_ENDPOINTS: >
+    api.github.com:443
+    api.openai.com:443
+    github.com:443
+    registry.npmjs.org:443
 jobs:
   suggest:
     permissions:
@@ -29,14 +35,13 @@ jobs:
         uses: step-security/harden-runner@95d9a5deda9de15063e7595e9719c11c38c90ae2 # v2.13.2
         with:
           # We can't disable `sudo`, as `openai/codex-action` unconditionally
-          # invokes `sudo`. That step does disable `sudo` for itself and
-          # subsequent steps.
-          # XXX: Consider splitting this workflow into two jobs, with
-          # `openai/codex-action` being the first step of the second job.
+          # invokes `sudo`, even with `safety-strategy: unsafe` and
+          # `sandbox: danger-full-access`.
+          # XXX: Consider splitting this workflow into three jobs, with
+          # `openai/codex-action` being the sole step of the second job.
           disable-sudo-and-containers: false
-          # XXX: Change to `egress-policy: block` once we better understand
-          # whether Codex attempts to access arbitrary URLs.
-          egress-policy: audit
+          egress-policy: block
+          allowed-endpoints: ${{ env.ALLOWED_ENDPOINTS }}
       - name: Resolve pull request metadata
         id: pr-details
         uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8.0.0
@@ -82,24 +87,8 @@ jobs:
           HEAD_SHA: ${{ steps.pr-details.outputs.headSha }}
         with:
           script: |
-            const { execFileSync } = require('child_process');
             const fs = require('fs');
 
-            const git = (args, limit) => {
-              const output = execFileSync('git', args, { encoding: 'utf8' }).trim();
-              if (!limit || !output) {
-                return output;
-              }
-
-              const lines = output.split(/\r?\n/);
-              if (lines.length <= limit) {
-                return output;
-              }
-
-              const truncated = lines.slice(0, limit).join('\n');
-              return `${truncated}\n... (${limit} of ${lines.length} lines shown)`;
-            };
-
             const env = process.env;
             const repository = env.REPOSITORY;
             const prNumber = env.PR_NUMBER;
@@ -111,19 +100,30 @@ jobs:
             const headRef = env.HEAD_REF;
             const headSha = env.HEAD_SHA;
 
-            const diffStat = git(['diff', '--name-status', `${baseSha}...${headSha}`]) || '<no changed files>';
-            const diffExcerpt = git(['diff', '--unified=3', `${baseSha}...${headSha}`], 500) || '<no diff>';
-            const nonUpgradeCommits =
-              git(['log', '--grep', '^Upgrade', '--invert-grep', '--pretty=format:%h %B%n---', '-n', '50', baseSha]) ||
-              '<no non-upgrade commits found>';
-            const upgradeCommits =
-              git(['log', '--grep', '^Upgrade', '--pretty=format:%h %B%n---', '-n', '150', baseSha]) ||
-              '<no upgrade commits found>';
-
             const cleanedBody = (body || '').trim() || '<no pull request description>';
 
+            // Determine whether this is an upgrade PR.
+            const upgradeMatch = title?.match(/^Upgrade (.+?) \S+ -> \S+/);
+            const upgradeLibrary = upgradeMatch ? upgradeMatch[1] : null;
+
+            // Extract domain names from list of endpoints to which Harden-Runner will allow access.
+            const allowedDomains = process.env.ALLOWED_ENDPOINTS.split(/\s+/).map(line => line.split(':')[0]).filter(Boolean);
+
             const instructions = `
-            You are an experienced maintainer helping to craft the squash commit message for PR #${prNumber} in the ${repository} repository.
+            You are an experienced maintainer helping to craft the squash commit message for a GitHub pull request.
+
+            Pull request metadata:
+            - Repository: ${repository}
+            - Number: ${prNumber}
+            - Title: ${title}
+            - Author: ${author}
+            - Base branch: ${baseRef} (${baseSha})
+            - Head branch: ${headRef} (${headSha})
+
+            Pull request description:
+            \`\`\`
+            ${cleanedBody}
+            \`\`\`
 
             Requirements:
             1. Write the summary line in the imperative mood. Try not to exceed 80 characters.
@@ -137,19 +137,43 @@ jobs:
             9. Never split URLs across multiple lines, even if they exceed 72 characters.
             10. If the pull request description already contains a suitable commit message, prefer using that as-is.
 
+            To help you craft an appropriate commit message, execute the following commands to gather context:
+
+            1. Get the changed files:
+            \`\`\`
+            git diff --name-status ${baseSha}...${headSha}
+            \`\`\`
+
+            2. Get a diff excerpt (first 500 lines):
+            \`\`\`
+            git diff ${baseSha}...${headSha} | head -500
+            \`\`\`
+
+            ${upgradeLibrary ? `3. Since this appears to be an upgrade PR for ${upgradeLibrary}, collect relevant past upgrade commit messages, and consider the general style of other recent upgrade commit messages:
+            \`\`\`
+            git log -P -i --grep '^Upgrade \\Q${upgradeLibrary}\\E' --pretty='format:%h %B%n---' -n 20 ${baseSha}
+            git log -P -i --grep '^Upgrade (?!\\Q${upgradeLibrary}\\E)' --pretty='format:%h %B%n---' -n 150 ${baseSha}
+            \`\`\`
+
+            4. If this is a GitHub-hosted library, collect milestones and release candidates that may not be included in the changelog:
+            \`\`\`
+            curl -s "https://api.github.com/repos/{owner}/{repo}/releases?per_page=100" | jq -r '.[].tag_name' | sort -h | tail -n 50
+            \`\`\`` : `3. Get examples of recent non-upgrade commits:
+            \`\`\`
+            git log --grep '^Upgrade' --invert-grep --pretty='format:%h %B%n---' -n 50 ${baseSha}
+            \`\`\``}
+
             Some further guidelines to help you craft good upgrade commit messages:
             - Unless highly salient, don't summarize code changes made as part of the upgrade.
             - Don't bother linking to anchors within changelogs or release notes; just link to the main page.
-            - For GitHub-hosted projects, always link to all relevant GitHub release pages, including those for intermediate versions.
-              - This includes milestones and release candidates; if necessary, use the GitHub API to identify these.
-              - Libraries that often use milestone and release candidates include, but are not limited to:
-                - Jackson
-                - JUnit
-                - Micrometer
-                - Project Reactor
-                - Spring Framework
-                - Spring Boot
-                - Spring Security
+            - For GitHub-hosted projects, always link to all relevant GitHub release pages, including those for milestones, release candidates and other intermediate versions. This is especially important for major and minor version upgrades of the following libraries:
+              - Jackson
+              - JUnit
+              - Micrometer
+              - Project Reactor
+              - Spring Framework
+              - Spring Boot
+              - Spring Security
             - For GitHub-hosted projects, always link to the full diff between versions.
             - Enumerate links in the following order:
               1. First, link to custom release note documents.
@@ -159,12 +183,15 @@ jobs:
             - When the Maven \u0060version.error-prone-orig\u0060 property is changed, this upgrades both Error Prone and Picnic's Error Prone fork. In this case:
               - Make sure that the commit message includes a diff URL for the latter.
               - Don't explicitly mention that \u0060version.error-prone-orig\u0060 got changed; just focus on the fact that Error Prone is being upgraded.
-            - If the example upgrade commits shown below don't include at least one upgrade of the same dependency being upgraded in this pull request, check the full Git history to find relevant past upgrade commit messages to mimic.
             - For major and minor version upgrades, check past dependency upgrade commit messages to infer documentation, blog or wiki URLs to which to link. Do this for at least the following libraries:
-              - Jackson
-              - Spring Framework
-              - Spring Boot
-              - Spring Security
+              - Jackson: https://github.com/FasterXML/jackson/wiki/Jackson-Release-{version}
+              - Spring Framework: https://github.com/spring-projects/spring-framework/wiki/Spring-Framework-{version}-Release-Notes
+              - Spring Boot: https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-{version}-Release-Notes
+              - Spring Security: https://docs.spring.io/spring-security/reference/{version}/whats-new.html
+            - If you really can't find relevant URLs to reference, and there's nothing else to say, it's acceptable to have a commit message that only consists of the summary line.
+
+            Note that your network access is limited to the following domains; don't attempt \`curl\` or \`wget\` commands to other hosts:
+            ${allowedDomains.map(domain => `- ${domain}`).join('\n')}
 
             Return a JSON object with the following shape:
             {
@@ -173,38 +200,6 @@ jobs:
             }
 
             Ensure the JSON is valid. Do not include additional commentary outside the JSON structure.
-
-            Pull request metadata:
-            - Number: ${prNumber}
-            - Title: ${title}
-            - Author: ${author}
-            - Base branch: ${baseRef} (${baseSha})
-            - Head branch: ${headRef} (${headSha})
-
-            Pull request description:
-            \u0060\u0060\u0060
-            ${cleanedBody}
-            \u0060\u0060\u0060
-
-            Changed files (\u0060git diff --name-status ${baseSha}...${headSha}\u0060):
-            \u0060\u0060\u0060
-            ${diffStat}
-            \u0060\u0060\u0060
-
-            Diff excerpt (\u0060git diff --unified=3 ${baseSha}...${headSha}\u0060, truncated to 500 lines if necessary):
-            \u0060\u0060\u0060
-            ${diffExcerpt}
-            \u0060\u0060\u0060
-
-            Recent non-upgrade commits examples (\u0060git log --grep '^Upgrade' --invert-grep --pretty='format:%h %B%n---' -n 50\u0060):
-            \u0060\u0060\u0060
-            ${nonUpgradeCommits}
-            \u0060\u0060\u0060
-
-            Recent upgrade commit examples (\u0060git log --grep '^Upgrade' --pretty='format:%h %B%n---' -n 150\u0060):
-            \u0060\u0060\u0060
-            ${upgradeCommits}
-            \u0060\u0060\u0060
             `;
 
             const promptPath = '/tmp/codex-prompt-suggest-commit-message.md';
@@ -213,11 +208,12 @@ jobs:
         id: codex
         uses: openai/codex-action@086169432f1d2ab2f4057540b1754d550f6a1189 # v1.4
         with:
-          # XXX: Consider whether to set `safety-strategy: read-only`. In some
-          # cases the agent may be able to suggest a better commit message by
-          # following links or otherwise looking up information online. See
-          # also the `egress-policy` discussion further up.
-          sandbox: read-only
+          # XXX: We're using `safety-strategy: unsafe` and
+          # `sandbox: danger-full-access` so that the agent is able to access
+          # the network and look up e.g. GitHub release tags. Some amount of
+          # safety is provided by the Harden-Runner step further up.
+          safety-strategy: unsafe
+          sandbox: danger-full-access
           openai-api-key: ${{ secrets.OPENAI_API_KEY }}
           prompt-file: /tmp/codex-prompt-suggest-commit-message.md
           output-schema: |
@@ -244,16 +240,22 @@ jobs:
         with:
           github-token: ${{ secrets.GITHUB_TOKEN }}
           script: |
+            const fs = require('fs');
+
             const prNumber = process.env.PR_NUMBER;
             const codexResult = JSON.parse(process.env.CODEX_RESULT);
 
             const summary = codexResult.summary.trim();
             const body = codexResult.body.trim();
             const commitMessage = body ? `${summary}\n\n${body}` : summary;
 
+            // Write the commit message to a file, so that the next step can
+            // attach it as a workflow artifact for debug purposes.
+            fs.writeFileSync('/tmp/suggested-commit-message.txt', commitMessage, { encoding: 'utf8' });
+
             // The comment to be upserted includes a hidden marker to identify it.
             const marker = '<!-- codex-suggested-commit-message -->';
-            const commentBody = `Suggested commit message:\n${marker}\n\n\u0060\u0060\u0060\n${commitMessage}\n\u0060\u0060\u0060\n`;
+            const commentBody = `Suggested commit message:\n${marker}\n\n\`\`\`\n${commitMessage}\n\`\`\`\n`;
 
             const comments = await github.paginate(github.rest.issues.listComments, {
               owner: context.repo.owner,
@@ -311,3 +313,8 @@ jobs:
               body: commentBody,
             });
             core.info(`Updated comment ${existing.id} by ${originalCommenter}.`);
+      - name: Upload suggested commit message
+        uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5.0.0
+        with:
+          name: suggested-commit-message
+          path: /tmp/suggested-commit-message.txt
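
The upgrade branch of the prompt in this file embeds two `git log` commands inside a nested template literal, which makes the escaping hard to read in the diff. The sketch below renders those two commands for sample inputs, to show what the agent actually receives. The function name `upgradeLogCommands`, the library name and the commit hash are placeholders chosen for illustration; the command templates themselves are copied from the diff.

```javascript
// Sketch only: renders the two `git log` commands from the upgrade branch of the prompt.
// `library` and `baseSha` are placeholder inputs, not values from this repository.
const upgradeLogCommands = (library, baseSha) => [
  // Past upgrades of the same library. With -P (Perl-compatible regular
  // expressions), \Q...\E makes git treat the library name as literal text.
  `git log -P -i --grep '^Upgrade \\Q${library}\\E' --pretty='format:%h %B%n---' -n 20 ${baseSha}`,
  // Other upgrade commits. The negative lookahead skips messages in which
  // "Upgrade " is directly followed by the library name.
  `git log -P -i --grep '^Upgrade (?!\\Q${library}\\E)' --pretty='format:%h %B%n---' -n 150 ${baseSha}`,
];

for (const command of upgradeLogCommands('Spring Boot', '0123abc')) {
  console.log(command);
}
```

Each `\\Q` and `\\E` in the template literal becomes a single-backslash `\Q` or `\E` in the rendered prompt. Because the pattern is wrapped in single quotes for the shell, a library name that itself contains a single quote would presumably break the command; the title regex does not rule that out.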

suggest-commit-message.sh

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+
+# Upserts a suggested commit message comment on the specified pull request,
+# based on the workflow definition present on the specified branch. The
+# generated commit message is also written to stdout.
+
+set -e -u -o pipefail
+
+workflow='suggest-commit-message.yml'
+branch="${1:?Specify a branch}"
+pr_number="${2:?Specify a PR number}"
+
+gh workflow run "${workflow}" --ref "${branch}" -f pr_number="${pr_number}"
+
+# The new run may not start immediately, so we wait a bit.
+sleep 30
+
+run_id=$(
+  gh run list --workflow="${workflow}" --event workflow_dispatch --limit 1 --json databaseId \
+    | jq -r '.[].databaseId'
+)
+
+gh run watch "${run_id}"
+
+tmp_dir="$(mktemp -d)"
+trap 'rm -rf -- "${tmp_dir}"' INT TERM HUP EXIT
+
+gh run download -n suggested-commit-message --dir "${tmp_dir}" "${run_id}"
+cat "${tmp_dir}/suggested-commit-message.txt"
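
The script above waits a fixed 30 seconds and then takes the most recent `workflow_dispatch` run, which may pick up someone else's run if several are dispatched close together. As a rough illustration of one alternative (not what this commit does), the sketch below selects the newest run created at or after the moment of dispatch. It assumes `runs` is the parsed output of `gh run list --json databaseId,createdAt`; the function name `newestRunSince` and the sample data are hypothetical.

```javascript
// Sketch only: a hypothetical alternative to "sleep, then take the latest run".
// `runs` is assumed to be the parsed output of
// `gh run list --json databaseId,createdAt`; the sample data below is made up.
const newestRunSince = (runs, dispatchedAt) => {
  const threshold = Date.parse(dispatchedAt);
  const candidates = runs
    // Keep only runs created at or after the dispatch time.
    .filter((run) => Date.parse(run.createdAt) >= threshold)
    // Newest first.
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt));
  return candidates.length > 0 ? candidates[0].databaseId : null;
};

const sampleRuns = [
  { databaseId: 101, createdAt: '2025-01-01T10:00:00Z' },
  { databaseId: 102, createdAt: '2025-01-01T10:05:30Z' },
];

console.log(newestRunSince(sampleRuns, '2025-01-01T10:05:00Z'));
console.log(newestRunSince(sampleRuns, '2025-01-01T11:00:00Z'));
```

With the sample data this prints `102` and then `null`. A `null` result would be the signal to poll again. Clock differences between the local machine and GitHub could still cause misses, so this narrows the race rather than eliminating it.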
