From e2c4b0d8d7d9e488d6392926ffc8f1e2481c9269 Mon Sep 17 00:00:00 2001
From: hyperpolymath <6759885+hyperpolymath@users.noreply.github.com>
Date: Tue, 26 May 2026 10:20:25 +0100
Subject: [PATCH] fix(governance): TS allowlist regex + introduce `.governance-allowlist` (Layer 2.5)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The `language-policy` job in `governance-reusable.yml` parses per-repo
TypeScript exemptions from a markdown table in `.claude/CLAUDE.md`, but
the original heading-match regex was the literal pattern
`TypeScript [Ee]xemptions` — accepting only the exact two-word heading.
Any repo with a different heading text silently failed to parse and the
workflow then false-failed on every PR.

Concrete fallout (root cause traced 2026-05-26):

* `affinescript`'s heading is
  `### TypeScript / JavaScript Exemptions (Approved)` — the slash and
  `JavaScript` are between the two keywords, so the literal regex never
  matched. Three legitimate exemptions
  (`packages/affine-js/types.d.ts`,
  `packages/affinescript-cli/mod.d.ts`, `affinescript-deno-test/*.ts`)
  were silently ignored, and the check has been red on every PR in that
  repo for as long as the heading text has carried the extra words.
  Documented as known baseline noise in affinescript's CLAUDE.md
  "Known-failing baseline checks" section; this fix resolves the actual
  root cause.

Two-layer fix:

1. **Regex relaxation + multi-table support.** The exemption-heading
   regex is now
   `(?:TypeScript|JavaScript|TS|JS|\.tsx?)\b[^#\n]*[Ee]xemption`
   so it matches the slash-form heading, the singular form, and
   `.ts`-mentioning variants. The loop is also no longer "break on
   first heading" — it now scans every heading and re-evaluates whether
   the next section is another exemption table, so a CLAUDE.md with
   multiple exemption tables (e.g. TypeScript + Runtime, or split
   per-platform) gets all of them parsed instead of just the first.

2. **`.governance-allowlist` mechanism (Layer 2.5).** New optional
   plain-text allowlist file at the repo root. One glob per line,
   `#`-comments supported. Decoupled from `.claude/CLAUDE.md` heading
   text — so a repo can carry a stable allowlist that survives prose
   rewrites and isn't sensitive to which exact words appear in the
   markdown. Both sources merge on every check; either alone is
   sufficient. Failure message now mentions the new file in the
   resolution menu, and the parsed-count line credits both sources.

Test plan covered:

* Regex matches 11/11 representative heading variants in a Python smoke
  test (affinescript's exact
  `### TypeScript / JavaScript Exemptions (Approved)` → matches;
  `### Runtime Exemptions` → correctly does NOT match;
  `### Closed exemptions:` prose → correctly does NOT match).
* End-to-end parse of affinescript's `.claude/CLAUDE.md` against the
  patched logic: yields 3 exemptions (was 0 with the old regex). All
  three paths line up with the documented table rows.

Docs: `docs/EXEMPTION-MECHANISMS.adoc` gains a "Layer 2.5" section
explaining when to reach for the new file vs the markdown table (the
Layer 2.5 number is kept clear of the existing 3-layer numbering to
avoid renumber-churn; the file already documents bot-denylist as
Layer 1 and Hypatia baseline as Layer 2).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/governance-reusable.yml | 43 +++++++++++++++++++---
 docs/EXEMPTION-MECHANISMS.adoc            | 45 +++++++++++++++++++++++
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/governance-reusable.yml b/.github/workflows/governance-reusable.yml
index 880db774..0f068243 100644
--- a/.github/workflows/governance-reusable.yml
+++ b/.github/workflows/governance-reusable.yml
@@ -172,15 +172,47 @@ jobs:
               return re.compile('^' + ''.join(out) + '$')
 
           exemption_patterns = []
+
+          # Source 1 — `.governance-allowlist` (plain glob lines, optional
+          # `# comment`). This is the foundational, format-stable allowlist
+          # mechanism: each repo can carry one file with one glob per line,
+          # decoupled from any prose-heading text in `.claude/CLAUDE.md`. The
+          # `.claude/CLAUDE.md` Exemptions tables remain authoritative for
+          # human-facing rationale; this file is the machine-readable index.
+          # Added 2026-05-26 (Refs affinescript false-positive sweep).
+          allowlist_file = pathlib.Path('.governance-allowlist')
+          if allowlist_file.exists():
+              for raw_line in allowlist_file.read_text(encoding='utf-8').splitlines():
+                  stripped = raw_line.split('#', 1)[0].strip()
+                  if stripped:
+                      exemption_patterns.append((stripped, glob_to_regex(stripped)))
+
+          # Source 2 — `.claude/CLAUDE.md` "Exemptions" markdown tables.
+          # Multiple tables per file are supported (TS-specific, TS+JS,
+          # Runtime+TS, etc.); the regex permits any heading whose text
+          # mentions TypeScript / JavaScript / TS / JS together with
+          # "Exemption(s)". Previously the regex was `TypeScript [Ee]xemptions`
+          # which is a literal pattern requiring exactly one space, so a
+          # heading like `### TypeScript / JavaScript Exemptions (Approved)`
+          # (as carried by affinescript) silently failed to match and the
+          # repo's exemptions were never loaded -- leading to a false-fail
+          # on every PR in that repo. Also: the previous loop broke on the
+          # *first* heading after entering the table, which misses any
+          # *additional* exemption tables further down the file.
+          heading_re = re.compile(r'^#{1,6}\s')
+          exemption_heading_re = re.compile(
+              r'(?:TypeScript|JavaScript|TS|JS|\.tsx?)\b[^#\n]*[Ee]xemption', re.IGNORECASE)
           claude_md = pathlib.Path('.claude/CLAUDE.md')
           if claude_md.exists():
               in_table = False
               for line in claude_md.read_text(encoding='utf-8').splitlines():
-                  if re.search(r'TypeScript [Ee]xemptions', line):
-                      in_table = True
+                  if heading_re.match(line):
+                      # Enter a new section: in_table iff this heading is an
+                      # exemption heading. Multiple exemption sections per
+                      # file are allowed; non-exemption headings close any
+                      # open table.
+                      in_table = bool(exemption_heading_re.search(line))
                       continue
-                  if in_table and line.startswith(('### ', '## ', '# ')):
-                      break
                   if in_table and line.startswith('|'):
                       m = re.match(r'\|\s*`([^`]+)`', line)
                       if m:
@@ -214,8 +246,9 @@ jobs:
               print(" (a) migrate the file to AffineScript")
               print(" (b) move to an allowlisted bridge path")
               print(" (c) add an entry to the 'TypeScript Exemptions' table in .claude/CLAUDE.md")
+              print(" (d) add a one-line glob to `.governance-allowlist` at the repo root")
               if exemption_patterns:
-                  print(f"\n(Currently {len(exemption_patterns)} exemption(s) parsed from .claude/CLAUDE.md.)")
+                  print(f"\n(Currently {len(exemption_patterns)} exemption(s) parsed from .claude/CLAUDE.md + .governance-allowlist.)")
               sys.exit(1)
           print(f"✅ No TypeScript files outside allowlist ({len(exemption_patterns)} per-repo exemption(s) parsed).")
           PYEOF
diff --git a/docs/EXEMPTION-MECHANISMS.adoc b/docs/EXEMPTION-MECHANISMS.adoc
index d06878d3..c5daa25f 100644
--- a/docs/EXEMPTION-MECHANISMS.adoc
+++ b/docs/EXEMPTION-MECHANISMS.adoc
@@ -78,6 +78,51 @@ Suppresses matching findings from the gate, downgrades severity if
 * One-off PR-scoped suppression. Baseline edits are merged to main; they
   affect all subsequent PRs. See Layer 3.
 
+== Layer 2.5: Per-repo banned-language file allowlist (`.governance-allowlist`)
+
+**File:** `.governance-allowlist` at the repo root (optional).
+
+**Scope:** One repo. Consumed by the `language-policy` job of
+`governance-reusable.yml` to allowlist specific paths from the
+TypeScript / JavaScript / banned-language file checks. Sits alongside
+the human-facing `### TypeScript Exemptions (Approved)` table in
+`.claude/CLAUDE.md` — the markdown table remains the rationale, the
+allowlist file is the machine-readable index that the workflow
+actually parses.
+
+**Format:** Plain text, one glob per line. `#` introduces a
+line-comment; blank lines are ignored.
+
+**Example:**
+
+[source]
+----
+# .governance-allowlist
+# Approved TypeScript carve-outs — see .claude/CLAUDE.md table for rationale.
+packages/affine-js/types.d.ts
+packages/affinescript-cli/mod.d.ts
+affinescript-deno-test/*.ts
+----
+
+**Why this exists.** The original mechanism scraped `.claude/CLAUDE.md`
+for a markdown table after a heading matching the literal regex
+`TypeScript [Ee]xemptions`. Any repo with a different heading text
+(e.g. affinescript's `### TypeScript / JavaScript Exemptions (Approved)`)
+silently failed to parse — leading to a false-fail on every PR.
+`.governance-allowlist` decouples the machine-readable allowlist from
+the prose. Both sources are merged on every check; either alone is
+sufficient.
+
+**When to use which.** Use the CLAUDE.md table when you want the
+rationale + unblock-condition columns to live with the human-facing
+policy. Use `.governance-allowlist` when you want a stable file
+format that survives `.claude/CLAUDE.md` rewrites, or when the repo
+has multiple exemption tables with awkward headings.
+
+**Don't** use this for ReScript, Python, V-lang, Go, or any other
+fully-banned language. Those have no allowlist mechanism — they must
+be removed (no exemptions, ever — per estate policy).
+
 == Layer 3: Per-PR exemptions (NOT YET IMPLEMENTED)
 
 There is currently no supported per-PR exemption mechanism. PR authors