fix(workflows): minimise permissions + clarify ci-status first-run log by lml2468 · Pull Request #5 · Mininglamp-OSS/.github

lml2468 · 2026-05-13T14:59:01Z

Batch A — Security: GITHUB_TOKEN permission hardening

Changes

Commit 1 — permissions: {} on 3 reusable workflows

octo-issue-feed.yml: only uses OCTO_BOT_TOKEN; GITHUB_TOKEN needs zero permissions
octo-pr-feed.yml: same
pr-merged-done.yml: only uses PROJECT_TOKEN PAT; same

Commit 2 — octo-ci-status.yml two fixes

permissions was already actions: read (correct); no change needed there
Added explicit prev_conclusion is None branch: prints "First run detected (no previous history), skipping notification" instead of falling into the confusing "Unhandled transition None → success" message

Why

All four workflows are org-level reusable infrastructure called from 8+ repos. Leaving GITHUB_TOKEN permissions at default (or undeclared) violates least-privilege and creates unnecessary write surface.

Part of

Batch A of the org-level workflow hardening series. Batch B (urllib timeout + retry) follows next.

…ion message

… workflows

yujiawei

Code Review — PR #5 (.github)

Summary

The intent of this PR is good — minimising GITHUB_TOKEN permissions on four reusable workflows, and clarifying the confusing "Unhandled transition None → success" log on the first run of octo-ci-status. The four permissions: declarations land correctly. However, the octo-ci-status.yml Python change is structurally broken and silently disables every CI notification this workflow is supposed to send. As-is this PR cannot ship.

P0 — Blocker

`octo-ci-status.yml`: new `if prev_conclusion is None:` block breaks all notifications

.github/workflows/octo-ci-status.yml lines ~109–127 (head ec5510c):

# Determine message
if curr_conclusion == 'failure':
    msg = (
        f'❌ [{repo}] main CI 挂了\n\n'
        f'工作流：{wf_name}\n'
        f'🔗 {run_url}'
    )
elif curr_conclusion == 'success' and prev_conclusion == 'failure':
    msg = (
        f'✅ [{repo}] main CI 已恢复\n\n'
        f'工作流：{wf_name}\n'
        f'🔗 {run_url}'
    )
if prev_conclusion is None:                                       # ← NEW
    print(f'First run detected (no previous history), skipping notification.')
    sys.exit(0)
else:                                                              # ← now binds to the NEW if
    print(f'Unhandled transition {prev_conclusion!r} → {curr_conclusion!r}, skipping.')
    sys.exit(0)

# Send to Octo IM
...
send('4ade985d984e432eb7fbdd0ad4f8118a', msg)
send(proj_gid, msg)

The new if prev_conclusion is None: was inserted at the same indentation as the leading if curr_conclusion == 'failure':, so Python parses it as a separate top‑level statement rather than an elif on the existing chain. The trailing else: is therefore the else of the new if, not the else of the original if/elif/else chain. Every path now reaches a sys.exit(0) before send() is ever called.

Concretely:

Scenario	`prev_conclusion`	`curr_conclusion`	Old behaviour	New behaviour
Failure alert	`'success'`	`'failure'`	Sends ❌ alert	Logs "Unhandled transition 'success' → 'failure'", exits, NO alert
Recovery alert	`'failure'`	`'success'`	Sends ✅ recovery	Logs "Unhandled transition 'failure' → 'success'", exits, NO recovery
First run (any)	`None`	anything	Either alerts on first failure or hits old "Unhandled transition" log	Always exits silently with "First run detected"
Success → success	`'success'`	`'success'`	Already returned earlier on `curr == prev` check	Same

The two notification branches that this workflow exists to serve — failure alerts and recovery notifications — are both dead code now. This is an org-level reusable workflow consumed by 8+ repos; merging this branch silences the CI failure alerting pipeline everywhere it is wired up.

Suggested fix — make the first‑run check part of the same dispatch chain, e.g. as an early guard:

if prev_conclusion is None:
    print('First run detected (no previous history), skipping notification.')
    sys.exit(0)

if curr_conclusion == 'failure':
    msg = (...)
elif curr_conclusion == 'success' and prev_conclusion == 'failure':
    msg = (...)
else:
    print(f'Unhandled transition {prev_conclusion!r} → {curr_conclusion!r}, skipping.')
    sys.exit(0)

# Send to Octo IM
...

or as a sibling elif inside the existing chain:

if curr_conclusion == 'failure':
    msg = (...)
elif curr_conclusion == 'success' and prev_conclusion == 'failure':
    msg = (...)
elif prev_conclusion is None:
    print('First run detected (no previous history), skipping notification.')
    sys.exit(0)
else:
    print(f'Unhandled transition {prev_conclusion!r} → {curr_conclusion!r}, skipping.')
    sys.exit(0)

Either form preserves the original semantics for the two alert paths and only re-routes the previously confusing None → success (and None → failure, if that is also intended) case. Please add a quick unit test or a temporary print(curr_conclusion, prev_conclusion) smoke run on a test caller before re-pushing — the failure mode here is silent.

A brief test plan that would have caught this:

Trigger the caller workflow on a repo where there is no recent history → expect First run detected.
Force a failing run on main after a green run → expect ❌ alert in both groups.
Force a green run after a failure → expect ✅ recovery in both groups.

P2 — Nits

octo-ci-status.yml, new line: print(f'First run detected (no previous history), skipping notification.') — f'' has no interpolation; plain string is fine. Minor.

Permissions hardening — verified

The permissions: {} and permissions: { actions: read } declarations all match the actual token usage in the four workflows; this part is correct.

File	Declared	GITHUB_TOKEN usage in body	Verdict
`octo-issue-feed.yml`	`permissions: {}`	None — only `OCTO_BOT_TOKEN` is used.	✅
`octo-pr-feed.yml`	`permissions: {}`	None — only `OCTO_BOT_TOKEN` is used.	✅
`pr-merged-done.yml`	`permissions: {}`	`actions/github-script@v7` is invoked with `github-token: ${{ secrets.PROJECT_TOKEN }}`; no implicit `GITHUB_TOKEN` calls.	✅
`octo-ci-status.yml`	`permissions: { actions: read }`	Calls `GET /repos/{owner}/{repo}/actions/runs` with `secrets.GITHUB_TOKEN`; `actions: read` is the minimum scope.	✅

For org-level reusable workflows callable from many repos, declaring permissions: explicitly (instead of relying on the org/repo default) is the right move.

Verdict

Changes requested. The permissions hardening is good and ready to merge once split out, but the octo-ci-status.yml first-run-log change must be reworked: as written, the new if prev_conclusion is None block silently disables both CI failure alerts and recovery notifications across every consumer of this reusable workflow. Suggest hoisting the first‑run guard above the if/elif dispatch (or making it an elif) and adding a smoke test before re-pushing.

…re alert paths

lml2468 · 2026-05-13T15:51:51Z

Addressed P0 blocker (commit 3fb40fb):

Root cause confirmed: the if prev_conclusion is None: block was inserted as a separate top-level statement after the if/elif message-build chain, causing the else: to bind to the new if instead of the original chain. Every run — including failure alerts and recovery notifications — hit a sys.exit(0) before send() was ever reached.

Fix: hoisted the first-run guard to before the if/elif dispatch as an early-exit guard:

# Guard: first-ever run has no previous history — skip silently
if prev_conclusion is None:
    print('First run detected (no previous history), skipping notification.')
    sys.exit(0)

# Determine message
if curr_conclusion == 'failure':
    ...
elif curr_conclusion == 'success' and prev_conclusion == 'failure':
    ...
else:
    print(f'Unhandled transition ...')
    sys.exit(0)

Also addressed the P2 nit: dropped the redundant f-prefix from the plain string.

Ready for re-review.

yujiawei

Code Review — PR #5 (.github)

Summary

Small, focused, low-risk batch:

permissions: {} added to three reusable workflows that never use GITHUB_TOKEN.
permissions: actions: read added to octo-ci-status.yml (the only one that does call the GitHub Actions API).
A new explicit prev_conclusion is None guard in octo-ci-status.yml that replaces the confusing Unhandled transition None → success log line.

The security tightening is correct and the log fix is reasonable. Approving with two small notes the author should be aware of.

Verification

Item	Result	Evidence
`octo-issue-feed.yml` only consumes `OCTO_BOT_TOKEN` (PAT), not `GITHUB_TOKEN`	✅	env block at lines ~44 only wires `OCTO_BOT_TOKEN`; Python uses `os.environ['OCTO_BOT_TOKEN']`
`octo-pr-feed.yml` same	✅	same shape as issue-feed
`pr-merged-done.yml` only consumes `PROJECT_TOKEN` (PAT)	✅	step uses `github-token: ${{ secrets.PROJECT_TOKEN }}`
`octo-ci-status.yml` actually needs `actions: read`	✅	calls `GET /repos/{owner}/{repo}/actions/runs` with the default `GITHUB_TOKEN`; that endpoint requires `actions: read`
First-run guard correctness for the "skip silently" intent	✅	placed after the `curr == prev` short-circuit and before message construction; covers both first-success and first-failure cases

For a reusable workflow, the workflow-level permissions: block is the upper bound on the GITHUB_TOKEN granted inside it — the caller cannot grant more. Since these three workflows do not touch GITHUB_TOKEN at all, permissions: {} is the correct minimum.

Notes (non-blocking)

N1. PR description vs. actual diff for `octo-ci-status.yml` permissions

The PR body says:

permissions was already actions: read (correct); no change needed there

But the diff in fact adds the permissions: actions: read block — main had no top-level permissions declaration at all (verified by reading the file at main). The change is correct and necessary; only the description is misleading. Worth fixing in the commit message / PR body for the audit trail, but no code change needed.

N2. "First run detected" message can be inaccurate in busy repos

The previous-run lookup pulls the 10 most recent completed runs on main across all workflows, then filters by workflow name:

api_url = (
    f'https://api.github.com/repos/Mininglamp-OSS/{repo}/actions/runs'
    f'?branch=main&status=completed&per_page=10'
)
...
same_wf = [r for r in runs if r['name'] == wf_name]

(octo-ci-status.yml lines ~62-78, unchanged by this PR)

In a repo with several workflows running on every push, a low-frequency workflow can easily have no prior same-name run in the recent 10-run window even after running many times historically. In that case previous is None and this PR's new message will print "First run detected (no previous history)" — which is not actually true, and which now also silently swallows what would previously have been a failure alert (the old code would have hit the if curr_conclusion == 'failure': branch and notified).

So this is a small behavior change beyond what the PR body advertises:

Old behavior on prev=None, curr='failure': sends a failure alert (arguably noisy on a genuine first run, but also catches the "lost in window" case).
New behavior on prev=None, curr='failure': silent.

For the genuinely-first-ever-run case this is an improvement. For the "fell out of the 10-run window" case it's a regression in alerting fidelity.

Suggested follow-up (not required for this PR):

Bump per_page to 100, or
Pre-filter on the GitHub side with workflow_id (cheaper and exact), e.g.
GET /repos/{owner}/{repo}/actions/workflows/{workflow_id}/runs?branch=main&status=completed&per_page=10,
which removes the cross-workflow-window confusion entirely.

This is pre-existing behavior; calling it out only because the PR is touching exactly this code path and it would be a clean follow-up.

Verdict

Approving. Both notes are non-blocking — N1 is a description-only fix, N2 is a follow-up worth tracking but not a regression introduced by this PR's intent.

…group_id validation - octo-issue-feed.yml: sanitize each label via sanitize_text(l, max_len=64) to prevent newline injection in label names (Obs #2) - issue-welcome.yml: move bot-type check before paginate() to avoid unnecessary API round-trips for bot-opened issues (Obs #3) - octo-ci-status.yml: add require_group_id() + re import, replace require_env('PROJECT_GROUP_ID') with require_group_id() for consistency with issue-feed and pr-feed (Obs #5)

lml2468 added 2 commits May 13, 2026 22:55

fix(ci-status): clarify first-run skip log, replace Unhandled transit…

011be3e

…ion message

fix(workflows): minimise GITHUB_TOKEN permissions across org reusable…

ec5510c

… workflows

yujiawei requested changes May 13, 2026

View reviewed changes

lml2468 mentioned this pull request May 13, 2026

fix(workflows): add urllib timeout, try/except, job timeout-minutes to 3 notify workflows #6

Merged

fix(ci-status): hoist first-run guard above if/elif dispatch to resto…

3fb40fb

…re alert paths

yujiawei approved these changes May 13, 2026

View reviewed changes

lml2468 merged commit fd26bd7 into main May 14, 2026
1 check passed

lml2468 mentioned this pull request May 16, 2026

security: harden workflow security - API allowlist, input validation, idempotency #19

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(workflows): minimise permissions + clarify ci-status first-run log#5

fix(workflows): minimise permissions + clarify ci-status first-run log#5
lml2468 merged 3 commits into
mainfrom
fix/security-permissions

lml2468 commented May 13, 2026

Uh oh!

yujiawei left a comment

Uh oh!

lml2468 commented May 13, 2026

Uh oh!

yujiawei left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lml2468 commented May 13, 2026

Batch A — Security: GITHUB_TOKEN permission hardening

Changes

Why

Part of

Uh oh!

yujiawei left a comment

Choose a reason for hiding this comment

Code Review — PR #5 (.github)

Summary

P0 — Blocker

octo-ci-status.yml: new if prev_conclusion is None: block breaks all notifications

P2 — Nits

Permissions hardening — verified

Verdict

Uh oh!

lml2468 commented May 13, 2026

Uh oh!

yujiawei left a comment

Choose a reason for hiding this comment

Code Review — PR #5 (.github)

Summary

Verification

Notes (non-blocking)

N1. PR description vs. actual diff for octo-ci-status.yml permissions

N2. "First run detected" message can be inaccurate in busy repos

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`octo-ci-status.yml`: new `if prev_conclusion is None:` block breaks all notifications

N1. PR description vs. actual diff for `octo-ci-status.yml` permissions