Skip to content

refactor(workflows): extract Python notify scripts to scripts/ directory#8

Merged
lml2468 merged 2 commits into
mainfrom
refactor/scripts-extraction
May 14, 2026
Merged

refactor(workflows): extract Python notify scripts to scripts/ directory#8
lml2468 merged 2 commits into
mainfrom
refactor/scripts-extraction

Conversation

@lml2468
Copy link
Copy Markdown
Contributor

@lml2468 lml2468 commented May 14, 2026

Summary

Batch D: Extract Python inline scripts to scripts/ directory.

This is a pure refactor — no logic changes. The three Python scripts that
were embedded as heredocs inside workflow YAML files are moved to standalone
.py files under scripts/. Each affected workflow gains a
actions/checkout step before calling python3 scripts/xxx.py.

Changes

File Change
scripts/octo_ci_status_notify.py new — extracted from octo-ci-status.yml
scripts/octo_issue_feed_notify.py new — extracted from octo-issue-feed.yml
scripts/octo_pr_feed_notify.py new — extracted from octo-pr-feed.yml
.github/workflows/octo-ci-status.yml replace heredoc with checkout + run: python3 scripts/
.github/workflows/octo-issue-feed.yml same
.github/workflows/octo-pr-feed.yml same

Why

  • Scripts can now be linted, syntax-checked, and tested locally without running a workflow
  • Diffs are cleaner — YAML indentation no longer affects Python readability
  • Easier to spot logic changes vs. workflow trigger changes in PR reviews

Compatibility

  • pr-merged-done.yml uses actions/github-script (JS) — not affected
  • issue-welcome.yml uses actions/github-script (JS) — not affected
  • auto-add-to-project.yml — not affected

Verification

python3 -m py_compile scripts/octo_ci_status_notify.py   # OK
python3 -m py_compile scripts/octo_issue_feed_notify.py  # OK
python3 -m py_compile scripts/octo_pr_feed_notify.py     # OK

Move inline Python heredoc scripts out of workflow YAML into standalone
files under scripts/ for better maintainability, local testability, and
cleaner diffs. Add actions/checkout step to each of the three affected
reusable workflows.

- scripts/octo_ci_status_notify.py (extracted from octo-ci-status.yml)
- scripts/octo_issue_feed_notify.py (extracted from octo-issue-feed.yml)
- scripts/octo_pr_feed_notify.py (extracted from octo-pr-feed.yml)

No logic changes — byte-for-byte equivalent Python code.
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #8 (.github)

Verdict

Changes requested. The refactor is well-motivated and the Python extraction is byte-for-byte equivalent, but the new actions/checkout step is wired in a way that will break every caller repo at runtime. There are two compounding P0 issues plus a couple of P1 hygiene items worth fixing in the same pass.


P0-1 — actions/checkout checks out the caller repo, not this one. Scripts won't exist at runtime.

All three modified workflows are reusable workflows (on: workflow_call), as their own headers state ("Reusable workflow… Called from per-repo caller workflows."). They are consumed externally — for example:

  • Mininglamp-OSS/octo-admin/.github/workflows/octo-ci-status.ymluses: Mininglamp-OSS/.github/.github/workflows/octo-ci-status.yml@main
  • Mininglamp-OSS/octo-server/.github/workflows/octo-ci-status.yml → same

Inside a reusable workflow, ${{ github.repository }} resolves to the caller (e.g. Mininglamp-OSS/octo-admin), not to Mininglamp-OSS/.github. actions/checkout@v4.2.2 defaults repository to ${{ github.repository }}, so the checkout will pull the caller repo into GITHUB_WORKSPACE. The caller does not have scripts/octo_ci_status_notify.py — only this repo does. Therefore the very next step:

run: python3 scripts/octo_ci_status_notify.py

…will fail with python3: can't open file '…/scripts/octo_ci_status_notify.py': [Errno 2] No such file or directory on every caller. Same failure mode for octo-issue-feed.yml and octo-pr-feed.yml.

Fix: Explicitly check out this repo at the workflow's own ref, into a sub-path, and call the script by that path. For example:

- name: Checkout reusable-workflow scripts
  uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683  # v4.2.2
  with:
    repository: Mininglamp-OSS/.github
    ref: ${{ github.workflow_sha }}   # pin to the exact reusable-workflow commit
    path: _shared
    sparse-checkout: scripts
    sparse-checkout-cone-mode: false
    persist-credentials: false

- name: Notify Octo
  run: python3 _shared/scripts/octo_ci_status_notify.py
  env: 

github.workflow_sha is the SHA of the reusable workflow file actually being executed, so the script you ship and the workflow that calls it can never go out of sync, even if a caller pins @main and main moves between the workflow load and the checkout. path: _shared keeps GITHUB_WORKSPACE clean (the caller may rely on it being empty or be running other steps that expect their own files later — though here the notify job is single-step, so this is mostly future-proofing). sparse-checkout: scripts avoids cloning the entire .github repo for a one-line invocation. persist-credentials: false is good hygiene since no later step needs the token.

Please verify the fix by running each of the three reusables from a caller branch (octo-admin or octo-server) before re-requesting review — this class of bug only shows up in the cross-repo path, not in the source repo's own CI.

P0-2 — permissions: blocks are too tight for actions/checkout to function

Once a permissions: key is declared, every scope you don't list resolves to none (GitHub Actions: "If you specify the access for any of these scopes, all of those that are not specified are set to none"). After this PR:

  • .github/workflows/octo-ci-status.yml:32 declares permissions:\n actions: read. contents is therefore none.
  • .github/workflows/octo-issue-feed.yml:38 declares permissions: {}. Every scope is none.
  • .github/workflows/octo-pr-feed.yml:46 declares permissions: {}. Same.

actions/checkout requires at minimum contents: read on the repo it's cloning. With contents: none, the GITHUB_TOKEN passed to checkout has no rights to read the repo; this fails outright on private callers and is unreliable on public ones (anonymous fallback is not guaranteed and is being phased back). Both octo-admin and octo-server appear to be private internal repos, so in practice this will hard-fail there even before P0-1 bites.

Fix: Add contents: read to all three workflows:

permissions:
  actions: read       # octo-ci-status only — needed for the GitHub API call inside the script
  contents: read      # all three — needed by actions/checkout

Note that with the P0-1 fix above (repository: Mininglamp-OSS/.github), the relevant contents: read is on Mininglamp-OSS/.github, not on the caller. The caller's GITHUB_TOKEN is what authenticates that clone, and it will only succeed if the caller's repo (or org policy) has been granted read access to Mininglamp-OSS/.github. Since Mininglamp-OSS/.github is the org community-health repo and is typically readable by all org members' tokens, this should be fine — but worth confirming once with one of the callers.


P1-1 — No ref: pinning means refactor + caller can desync

Even after fixing P0-1 with repository: Mininglamp-OSS/.github, omitting ref: would make actions/checkout pull the default branch HEAD at the moment the step runs. Callers pin the workflow with @main (or a SHA), but a separate actions/checkout would pull a separate, possibly different, main HEAD. Result: a caller pinning @<sha> could still execute a script from a newer main commit. Use ref: ${{ github.workflow_sha }} (or ${{ github.sha }} of the reusable workflow context) so the workflow file and the script it loads are guaranteed to come from the same commit.

P1-2 — Wasteful full-repo checkout for a single script

Each of the three workflows clones the entire .github repo just to run one ~50-line Python file. Use sparse-checkout: scripts (with sparse-checkout-cone-mode: false) to fetch only scripts/. Tiny perf win, but more importantly it makes the scope of the step obvious to a future reader. Already shown in the snippet above.


What's good

  • The Python content moved out of the heredocs is byte-for-byte equivalent. I diffed each removed heredoc against the new file: identical logic, identical comments, identical message strings. No behavioral risk in the extraction itself.
  • actions/checkout is pinned to a full SHA with a # v4.2.2 trailing comment — correct supply-chain hygiene, please keep that pattern.
  • Splitting the scripts into standalone .py files genuinely is the right direction: lintable, py_compile-able, locally runnable with mocked env vars, and PR diffs read much better.
  • PR title and description are accurate; the "no logic changes" claim holds up.

Suggested next steps

  1. Apply the repository: Mininglamp-OSS/.github + ref: github.workflow_sha + sparse-checkout: scripts + path: _shared pattern to all three workflows; update the run: line to python3 _shared/scripts/<file>.py.
  2. Add contents: read to all three permissions: blocks.
  3. Trigger each of the three reusable workflows from a real caller branch (octo-admin / octo-server) at least once before re-requesting review; the source-repo CI on this PR alone cannot exercise the cross-repo path.

Once those are in, this is a clean win.

P0-1: checkout now explicitly targets Mininglamp-OSS/.github (not the
caller repo) via repository: Mininglamp-OSS/.github + path: _shared.
Updates run commands to python3 _shared/scripts/xxx.py.

P0-2: add contents: read to permissions on all three workflows so
actions/checkout can clone the scripts repo (contents: none broke
checkout on private callers).

P1-1: ref: ${{ github.workflow_sha }} ensures script version is always
in sync with the executing workflow commit.

P1-2: sparse-checkout: scripts fetches only the scripts/ subtree,
avoiding a full repo clone for a single Python file.
Copy link
Copy Markdown
Contributor Author

@lml2468 lml2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed all P0/P1 items:

P0-1actions/checkout now has repository: Mininglamp-OSS/.github so it always clones this repo, not the caller. Scripts live at _shared/scripts/. All three run: lines updated to python3 _shared/scripts/<file>.py.

P0-2 — Added contents: read to permissions on all three workflows (needed by actions/checkout to clone the repo on private callers).

P1-1ref: ${{ github.workflow_sha }} pins the checkout to the exact reusable-workflow commit so script and workflow are always in sync.

P1-2sparse-checkout: scripts + sparse-checkout-cone-mode: false so only the scripts/ subtree is cloned.

persist-credentials: false retained as suggested.

Re-requesting your review.

@lml2468 lml2468 requested a review from yujiawei May 14, 2026 03:00
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #8 (.github)

Summary

Pure refactor that extracts three Python heredoc scripts from reusable workflow YAML files into standalone files under scripts/. The diff is well-scoped, the new files are byte-equivalent to the heredocs they replace, and the supporting workflow changes (checkout step, permissions) are minimal and correct.

Verdict: APPROVED. No blocking issues. A few P2 suggestions inline below.

1. Verification — claims vs. diff

Claim Result Evidence
Pure refactor, no logic changes Line-by-line comparison of each heredoc against its new .py file shows identical statements, identical env var reads, identical group IDs (4ade985d…, 151a4597…, 1c303c14…), identical message templates, identical exit-code semantics.
python3 -m py_compile succeeds for all three Files are syntactically valid stdlib-only Python (no import of third-party packages).
actions/github-script workflows unaffected pr-merged-done.yml, issue-welcome.yml, auto-add-to-project.yml — none touched in the diff.

2. Findings

P0 / P1 — Blocking or significant

None.

P2 — Suggestions, not blocking

P2-1 — Boilerplate duplication across the three scripts
scripts/octo_ci_status_notify.py:104-117, scripts/octo_issue_feed_notify.py:25-38, scripts/octo_pr_feed_notify.py:30-43 each define a near-identical send(group_id, msg) helper, the same failed = [] accumulator, the same https://im.deepminer.com.cn/api/v1/bot/sendMessage constant, and the same sys.exit(1) if failed tail.

If a 4th notifier is ever added, this becomes worth factoring into a scripts/_octo.py helper (e.g. octo.send_to_groups([(gid, msg), ...]) -> int). For three callers it's defensible to keep it inline. Flagging only so it doesn't get copy-pasted indefinitely.

P2-2 — per_page=10 window in CI-status notify is fragile under high churn
scripts/octo_ci_status_notify.py:31-37 fetches the 10 most recent completed runs on main, then filters by workflow name. On a busy repo where >10 unrelated workflows complete between two runs of the same workflow, the previous run can fall out of the window and previous becomes None, which silently suppresses the recovery alert (prev_conclusion is None guard at line 65).

This is pre-existing behavior — the heredoc had the same per_page=10 — so it is not introduced by this PR. Worth filing a follow-up to either bump per_page or filter server-side by workflow file path.

P2-3 — Broad except Exception on label parsing
scripts/octo_issue_feed_notify.py:8-11:

try:
    labels = json.loads(os.environ['ISSUE_LABELS'])
    labels_part = ' · 🏷️ ' + ', '.join(labels) if labels else ''
except Exception:
    labels_part = ''

Catches KeyError (env var missing), json.JSONDecodeError, and TypeError from ', '.join(labels) when labels is a list of dicts rather than strings. The original heredoc had the same pattern, so this is unchanged behavior — but if/when this is touched again, narrowing to (KeyError, json.JSONDecodeError, TypeError) would be safer.

3. Workflow-side changes — review

ref: ${{ github.workflow_sha }} is the right choice (✅)

octo-ci-status.yml:38-46, octo-issue-feed.yml:42-50, octo-pr-feed.yml:51-59 all check out Mininglamp-OSS/.github at ${{ github.workflow_sha }}. For reusable workflows triggered via workflow_call, GitHub resolves github.workflow_sha to the commit SHA of the reusable workflow file's repo (not the caller's SHA). This guarantees the .py script that runs is the exact version paired with the YAML being executed — no version skew when callers pin to different refs of .github.

This is the intended idiom for the "extract scripts out of a reusable workflow" pattern and is materially safer than ref: main (which would couple every caller to whatever .github is at the moment the job starts, defeating ref-pinning by callers).

Other workflow choices — all correct

  • actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 — SHA-pinned to a real v4.2.2 commit. ✅
  • persist-credentials: false — minimizes token-on-disk exposure for the rest of the job. ✅
  • sparse-checkout: scripts + sparse-checkout-cone-mode: false — fetches only what's needed; cone-mode: false is correct because scripts is being used as a literal path pattern, not a cone root. ✅
  • path: _shared — underscore-prefixed to avoid colliding with the caller's workspace contents (the reusable workflow doesn't checkout the caller's code, but the convention is still good hygiene). ✅
  • Permissions: octo-ci-status.yml adds contents: read to the existing actions: read; octo-issue-feed.yml and octo-pr-feed.yml go from permissions: {} to permissions: { contents: read }. Both are the minimum needed for the new checkout step on a public repo. ✅
  • Mininglamp-OSS/.github repo visibility is public, so the default GITHUB_TOKEN from any caller (including external orgs) can clone it. ✅

4. Additional observations (not in scope but noticed)

  • The new .py files have no shebang and are not marked executable. This is fine because the workflow invokes them as python3 _shared/scripts/octo_*.py. No change needed unless someone wants to run them as ./scripts/foo.py locally; if so, add #!/usr/bin/env python3 later.
  • No if __name__ == "__main__": guard. Acceptable because none of these scripts are imported by anything; they run top-level. Adding the guard is only valuable if a future refactor wants to unit-test internal functions.
  • The PR body says verification is python3 -m py_compile. Compile-only does not catch runtime issues like missing env vars, malformed JSON, or HTTP errors. For a "not introducing bugs" claim, that's adequate; for ongoing maintenance, a small smoke test that wires fake env vars and a fake urllib.request.urlopen would catch regressions cheaply. Not requesting it for this PR.

5. Recommendation

Approve and merge. This PR materially improves maintainability (Python is no longer hidden inside YAML strings), preserves runtime behavior exactly, and uses the correct GitHub Actions primitives for a reusable-workflow + script-extraction pattern. The three P2 items are follow-up suggestions, not pre-merge requirements.

@lml2468 lml2468 merged commit 033462e into main May 14, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants