
feat(agents,skills): add opensource-pipeline — 3-agent workflow for safe public releases #1036

Merged
affaan-m merged 3 commits into affaan-m:main from herakles-dev:add-opensource-pipeline on Mar 31, 2026


@herakles-dev herakles-dev commented Mar 31, 2026

What Changed

Adds a 3-agent pipeline + orchestrator skill for safely open-sourcing private projects. The pipeline chains three specialized agents through a staging directory before anything touches GitHub.

Files added:

  • agents/opensource-forker.md — Stage 1: copies project, strips secrets (20+ regex patterns), replaces internal references with placeholders, generates .env.example, initializes clean git history
  • agents/opensource-sanitizer.md — Stage 2: independent PASS/FAIL audit across 6 scan categories (secrets, PII, internal refs, dangerous files, config completeness, git history); read-only, never trusts the forker
  • agents/opensource-packager.md — Stage 3: generates CLAUDE.md, setup.sh, README.md, LICENSE, CONTRIBUTING.md, and GitHub issue templates
  • skills/opensource-pipeline/SKILL.md — Orchestrator skill that routes /opensource commands and chains the three agents

Why This Change

Preparing a private project for open-source is error-prone — a single leaked API key or internal domain reference can cause a security incident. This pipeline enforces a zero-trust, staged workflow:

  1. Fork strips secrets and internal references automatically
  2. Sanitize independently verifies the fork with 20+ detection patterns — FAIL blocks release
  3. Package generates professional, Claude Code-ready documentation

The sanitizer is intentionally paranoid: false positives are acceptable, false negatives are not. A single CRITICAL finding in any category blocks the release.
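The blocking rule described above can be sketched in a few lines. This is a minimal illustration only, not the sanitizer's actual logic: the `Finding` and `Severity` types and the `verdict` function are hypothetical names introduced here, and the three verdict strings are taken from the reports mentioned in this PR.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "CRITICAL"
    WARNING = "WARNING"


@dataclass(frozen=True)
class Finding:
    """One scan result (hypothetical shape, for illustration only)."""
    category: str       # e.g. "secrets", "pii", "internal-refs"
    severity: Severity
    detail: str


def verdict(findings: list[Finding]) -> str:
    """Collapse all findings into a single release verdict."""
    # One CRITICAL finding in any category blocks the release.
    if any(f.severity is Severity.CRITICAL for f in findings):
        return "FAIL"
    # Only warnings left: the release may proceed, but the report flags them.
    if findings:
        return "PASS WITH WARNINGS"
    return "PASS"
```

Under this sketch, the single hardcoded-port warning reported in testing yields `PASS WITH WARNINGS`, while adding any CRITICAL finding flips the result to `FAIL`.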

Source project: https://github.com/herakles-dev/opensource-pipeline (MIT licensed)

Testing Done

  • Tested end-to-end on a real private project (herakles-dev/opensource-pipeline itself)

  • Forker correctly detected and stripped database URLs, API tokens, and internal domain references

  • Sanitizer generated a PASS WITH WARNINGS report (one hardcoded port, non-blocking)

  • Packager generated accurate CLAUDE.md with correct commands verified against the actual project

  • All three agents invoke correctly via the skill's /opensource fork command

  • Manual testing completed

  • Automated tests pass locally (node tests/run-all.js) — agents/skills are Markdown, no JS tests needed

  • Edge cases considered: FAIL verdict stops pipeline, user prompted to fix or abort

Type of Change

  • fix: Bug fix
  • feat: New feature
  • refactor: Code refactoring
  • docs: Documentation
  • test: Tests
  • chore: Maintenance/tooling
  • ci: CI/CD changes

Security & Quality Checklist

  • No secrets or API keys committed (ghp_, sk-, AKIA, xoxb, xoxp patterns checked)
  • JSON files validate cleanly (no JSON files added)
  • Shell scripts pass shellcheck (no shell scripts added — setup.sh is generated by the agent, not committed here)
  • No sensitive data exposed in logs or output
  • Follows conventional commits format

Documentation

  • Agent files are self-documenting (role, workflow, examples, rules sections)
  • Skill file documents all commands and the full pipeline protocol
  • README updated (not needed — agents/skills directories are self-describing)

Summary by cubic

Adds a 3‑agent open‑source release pipeline with an orchestrator skill to safely prepare private projects for public GitHub. Strengthens secret detection, verification flow, and publish reliability; now flags *.map files as dangerous.

  • New Features

    • opensource-forker — stages a copy, strips secrets (20+ patterns), replaces internal refs, creates .env.example, fresh git init, and FORK_REPORT.md.
    • opensource-sanitizer — read-only audit with PASS/FAIL/PASS‑WITH‑WARNINGS across secrets, PII, internal refs, dangerous files, config completeness, and git history; CRITICAL findings block release; writes SANITIZATION_REPORT.md.
    • opensource-packager — generates CLAUDE.md, executable setup.sh, README.md, LICENSE, CONTRIBUTING.md, and GitHub issue templates.
    • opensource-pipeline skill — chains the agents and routes /opensource fork|verify|package|status|list; stops on FAIL and prompts for fixes before proceeding.
  • Bug Fixes

    • Safer publish flow: collect GitHub org/username; quote vars in gh commands.
    • Sanitizer retry loop capped at 3; final review shows dynamic verdict.
    • Broader detection: full JWT (3‑segment) match; case‑insensitive AWS; expand GitHub tokens (gho_, ghu_); tighter generic env heuristic; separate WARNINGs from CRITICAL.
    • Wider internal path checks and exclusions: detect macOS /Users/ and Windows C:\Users\; rsync now excludes .env*, .claude/, .secrets/, secrets/.
    • Dangerous files: flag *.map as CRITICAL in sanitizer and remove via forker.
    • Clarify sanitizer is source read‑only (only writes the report).
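The "quote vars in gh commands" fix can be illustrated with Python's `shlex.quote`. This is a sketch under assumptions: the helper name `publish_command` is hypothetical, and the skill itself is Markdown, so this only shows why quoting matters when user-supplied org and repo names end up in a shell command line.

```python
import shlex


def publish_command(org: str, repo: str) -> str:
    """Build a shell-safe `gh repo create` command line (illustrative helper)."""
    # Quote the combined target so shell metacharacters in user input
    # stay inside a single argument instead of starting a new command.
    target = shlex.quote(f"{org}/{repo}")
    return f"gh repo create {target} --public --source=. --push"


if __name__ == "__main__":
    print(publish_command("acme", "tool"))
    print(publish_command("acme", "tool; rm -rf ~"))
```

With quoting, a hostile repo name is passed to `gh` as one odd-looking argument rather than being interpreted by the shell as a second command.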

Written for commit 90ccb96. Summary will update on new commits.

Summary by CodeRabbit

  • New Features

    • Added a complete open-source pipeline with user commands to fork (create a sanitized staging copy), verify (read-only safety scans with PASS/FAIL/WARN), and package (generate docs and bootstrap).
    • Automated detection/removal of secrets, PII, internal references, and dangerous files; generates env examples, replaces sensitive refs with placeholders, and creates a clean single-commit staging history.
  • Documentation

    • Produces sanitization and fork reports, a concise package guide, an executable setup/bootstrap script, license/contributing templates, and a final review step before publishing.

…lease workflow

Adds a complete pipeline for safely preparing private projects for public
release: secret stripping (20+ patterns), independent sanitization audit,
and professional doc generation (CLAUDE.md, setup.sh, README, LICENSE).

Agents added:
- agents/opensource-forker.md    — copies project, strips secrets, generates .env.example
- agents/opensource-sanitizer.md — independent PASS/FAIL audit, read-only, 20+ patterns
- agents/opensource-packager.md  — generates CLAUDE.md, setup.sh, README, LICENSE, CONTRIBUTING

Skill added:
- skills/opensource-pipeline/SKILL.md — orchestrator: routes /opensource commands, chains agents

Source: https://github.com/herakles-dev/opensource-pipeline (MIT)

coderabbitai bot commented Mar 31, 2026

Walkthrough

Adds a new three-stage open‑source release pipeline: an Open-Source Forker that stages and sanitizes a repo, an Open-Source Sanitizer that performs deterministic multi-category security/PII checks and emits a report, and an Open-Source Packager that generates packaging docs/scripts; plus a skill wiring these agents into CLI commands and flow control.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Agents – forking, sanitizing, packaging: agents/opensource-forker.md, agents/opensource-sanitizer.md, agents/opensource-packager.md | Add three new agent specifications: forker (staging, secret removal, placeholder replacement, single-commit history, FORK_REPORT.md), sanitizer (deterministic multi-step CRITICAL/WARNING scans and SANITIZATION_REPORT.md), packager (generate CLAUDE.md, setup.sh, README/LICENSE/CONTRIBUTING, issue templates). Attention: secret regexes, file-exclusion rules, and report redaction rules. |
| Skill – pipeline orchestration: skills/opensource-pipeline/SKILL.md | Add a new skill that wires the three agents into commands (/opensource fork |

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
actor User
participant Skill as "opensource-pipeline (CLI)"
participant Forker as "opensource-forker (Agent)"
participant Staging as "Staging Dir / Filesystem"
participant Sanitizer as "opensource-sanitizer (Agent)"
participant Packager as "opensource-packager (Agent)"
participant Git as "git / gh"
participant Reports as "FORK_REPORT.md / SANITIZATION_REPORT.md"

User->>Skill: run /opensource fork PROJECT
Skill->>Forker: invoke forker with source path & options
Forker->>Staging: copy files (rsync rules), remove/strip secrets, replace refs
Forker->>Git: initialize repo, single "Initial open-source release" commit
Forker->>Reports: write FORK_REPORT.md
Skill->>Sanitizer: run sanitizer on Staging
Sanitizer->>Staging: scan files, git history, produce verdict
Sanitizer->>Reports: write SANITIZATION_REPORT.md
alt SANITIZER PASS/WARN
Skill->>Packager: run packager to generate docs and setup.sh
Packager->>Staging: create CLAUDE.md, setup.sh, README, LICENSE, templates
Packager->>Reports: append packaging summary
Skill->>User: prompt for approval to publish
User->>Skill: approve
Skill->>Git: gh repo create --public --source=. --push
else SANITIZER FAIL
Skill->>User: present findings, offer fix/retry or abort
end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • affaan-m

Poem

"I nibbled through the code today, so bright—
I plucked the secrets out by moonlit byte.
I patched the paths and hopped the legal fence,
Left tidy notes, and made a staging fence.
A rabbit's nod: ship clean, then take flight!" 🐇

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding a three-agent workflow (opensource-pipeline) for preparing projects for safe public release, which directly aligns with all four new files added. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



ecc-tools bot commented Mar 31, 2026

Analyzing 5000 commits...


ecc-tools bot commented Mar 31, 2026

Analysis Failed

Not Found - https://docs.github.com/rest/git/refs#get-a-reference

Troubleshooting

| Cause | Resolution |
|-------|------------|
| Large repository | Analysis may timeout on repos with extensive history |
| API rate limits | Wait 15 minutes before retrying |
| Network issues | Queue timeout is 15 minutes; retry may succeed |
| Permissions | Verify app has Contents: Read access |

Retry: /ecc-tools analyze



@cubic-dev-ai cubic-dev-ai bot left a comment


11 issues found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="skills/opensource-pipeline/SKILL.md">

<violation number="1" location="skills/opensource-pipeline/SKILL.md:176">
P1: Publish step uses `{org}/{repo}` without collecting/resolving `{org}` (and with variable mismatch vs `{github_repo}`), which can break final GitHub repo creation.</violation>

<violation number="2" location="skills/opensource-pipeline/SKILL.md:176">
P1: User-controlled placeholders are interpolated into shell command arguments without safe quoting/escaping, enabling command injection or malformed command execution.</violation>
</file>

<file name="agents/opensource-forker.md">

<violation number="1" location="agents/opensource-forker.md:40">
P1: Sensitive `.env` variants are not excluded during initial rsync copy, so secrets can enter staging before later cleanup.</violation>

<violation number="2" location="agents/opensource-forker.md:46">
P1: Step 3 wording can cause real secret values to be copied into `.env.example` and committed, creating a direct secret leakage risk.</violation>

<violation number="3" location="agents/opensource-forker.md:86">
P1: Blanket deletion of `*.pem/*.key/*.p12/*.pfx` is over-broad and can remove required non-secret cert artifacts, contradicting the ‘never remove functionality’ rule.</violation>

<violation number="4" location="agents/opensource-forker.md:104">
P2: Private IP sanitization guidance is incomplete and omits the common RFC1918 172.16.0.0/12 range.</violation>
</file>

<file name="agents/opensource-packager.md">

<violation number="1" location="agents/opensource-packager.md:49">
P2: Nested fenced code examples are escaped (`\`\`\``) inside triple-backtick blocks, which can be copied literally and break Markdown rendering in generated docs.</violation>
</file>

<file name="agents/opensource-sanitizer.md">

<violation number="1" location="agents/opensource-sanitizer.md:32">
P1: AWS secret-access-key regex only matches a narrow lowercase form and misses common env-var formats.</violation>

<violation number="2" location="agents/opensource-sanitizer.md:44">
P1: GitHub token regex is too narrow and can miss valid token prefixes, creating false negatives in a critical secret scan.</violation>

<violation number="3" location="agents/opensource-sanitizer.md:81">
P1: Internal path leak detection is too narrow for a CRITICAL sanitizer check, causing likely false negatives.</violation>

<violation number="4" location="agents/opensource-sanitizer.md:124">
P2: The sanitizer spec is internally contradictory: it is declared read-only but requires writing `SANITIZATION_REPORT.md` into the project directory.</violation>
</file>



**Files to always remove:**
- `.env` and variants (`.env.local`, `.env.production`, `.env.development`)
- `*.pem`, `*.key`, `*.p12`, `*.pfx` (private keys)

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P1: Blanket deletion of *.pem/*.key/*.p12/*.pfx is over-broad and can remove required non-secret cert artifacts, contradicting the ‘never remove functionality’ rule.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-forker.md, line 86:

<comment>Blanket deletion of `*.pem/*.key/*.p12/*.pfx` is over-broad and can remove required non-secret cert artifacts, contradicting the ‘never remove functionality’ rule.</comment>

<file context>
@@ -0,0 +1,196 @@
+
+**Files to always remove:**
+- `.env` and variants (`.env.local`, `.env.production`, `.env.development`)
+- `*.pem`, `*.key`, `*.p12`, `*.pfx` (private keys)
+- `credentials.json`, `service-account.json`
+- `.secrets/`, `secrets/`
</file context>


### Step 3: Secret Detection and Stripping

Scan ALL files for these patterns. Extract values to `.env.example` rather than deleting them:

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P1: Step 3 wording can cause real secret values to be copied into .env.example and committed, creating a direct secret leakage risk.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-forker.md, line 46:

<comment>Step 3 wording can cause real secret values to be copied into `.env.example` and committed, creating a direct secret leakage risk.</comment>

<file context>
@@ -0,0 +1,196 @@
+
+### Step 3: Secret Detection and Stripping
+
+Scan ALL files for these patterns. Extract values to `.env.example` rather than deleting them:
+
+```
</file context>
Suggested change
- Scan ALL files for these patterns. Extract values to `.env.example` rather than deleting them:
+ Scan ALL files for these patterns. Remove secret values from source files and add only variable names with safe placeholder values to `.env.example` (never copy real secrets).

| Custom internal domains | `your-domain.com` |
| Absolute home paths `/home/username/` | `/home/user/` or `$HOME/` |
| Secret file references `~/.secrets/` | `.env` |
| Private IPs `192.168.x.x`, `10.x.x.x` | `your-server-ip` |

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P2: Private IP sanitization guidance is incomplete and omits the common RFC1918 172.16.0.0/12 range.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-forker.md, line 104:

<comment>Private IP sanitization guidance is incomplete and omits the common RFC1918 172.16.0.0/12 range.</comment>

<file context>
@@ -0,0 +1,196 @@
+| Custom internal domains | `your-domain.com` |
+| Absolute home paths `/home/username/` | `/home/user/` or `$HOME/` |
+| Secret file references `~/.secrets/` | `.env` |
+| Private IPs `192.168.x.x`, `10.x.x.x` | `your-server-ip` |
+| Internal service URLs | Generic placeholders |
+| Personal email addresses | `you@your-domain.com` |
</file context>
Suggested change
- | Private IPs `192.168.x.x`, `10.x.x.x` | `your-server-ip` |
+ | Private IPs `10.x.x.x`, `172.16-31.x.x`, `192.168.x.x` | `your-server-ip` |
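
To make the three RFC 1918 ranges concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper names `is_rfc1918` and `redact_private_ips` are hypothetical, and the forker agent is a Markdown spec, so this is an illustration of the rule rather than its implementation. The `your-server-ip` placeholder comes from the table under review.

```python
import ipaddress
import re

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 through 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),
]

# Loose dotted-quad matcher; real validation is left to ipaddress.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_rfc1918(text: str) -> bool:
    """Return True if text is a valid IPv4 address inside an RFC 1918 range."""
    try:
        addr = ipaddress.IPv4Address(text)
    except ValueError:  # covers malformed strings such as "999.1.1.1"
        return False
    return any(addr in net for net in RFC1918)


def redact_private_ips(line: str, placeholder: str = "your-server-ip") -> str:
    """Replace private addresses in a line and leave public ones untouched."""
    return IPV4.sub(
        lambda m: placeholder if is_rfc1918(m.group()) else m.group(), line
    )
```

Checking membership against explicit networks avoids hand-written patterns like `172.16-31.x.x`, which are easy to get wrong at the range boundaries.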


## Quick Start

\`\`\`bash

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P2: Nested fenced code examples are escaped (\```) inside triple-backtick blocks, which can be copied literally and break Markdown rendering in generated docs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-packager.md, line 49:

<comment>Nested fenced code examples are escaped (`\`\`\``) inside triple-backtick blocks, which can be copied literally and break Markdown rendering in generated docs.</comment>

<file context>
@@ -0,0 +1,249 @@
+
+## Quick Start
+
+\`\`\`bash
+./setup.sh              # First-time setup
+{dev command}           # Start development server
</file context>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (1)
agents/opensource-sanitizer.md (1)

18-18: “Read-only” rule should explicitly allow report generation only.

Current wording can be read as “no writes at all,” which conflicts with generating SANITIZATION_REPORT.md.

Also applies to: 124-124

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-sanitizer.md` at line 18, Update the "Read-only" rule text
(the line containing "**Read-only** — you never modify files, only report" in
agents/opensource-sanitizer.md and the duplicate at the later occurrence) to
explicitly exempt generation of the SANITIZATION_REPORT.md file; e.g. clarify
that modifying source files is prohibited but producing a sanitization report
(SANITIZATION_REPORT.md) is allowed, so readers cannot interpret the rule as
forbidding report generation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/opensource-forker.md`:
- Around line 59-61: The current JWT regex under "JWT tokens" only matches two
segments and can miss real tokens; update the pattern to match full
three-segment JWTs (header.payload.signature) by replacing the existing regex
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+ with a three-segment pattern (e.g.,
[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+) so the tokenizer/stripper
reliably finds and removes complete JWTs.

In `@agents/opensource-sanitizer.md`:
- Around line 22-52: The "Step 1: Secrets Scan (CRITICAL — any match = FAIL)"
rule is inconsistent with the "High-entropy strings in config files (WARNING —
manual review)" pattern; update the spec so severities are consistent by either
(A) elevating the high-entropy pattern to CRITICAL (make its comment match
"CRITICAL — any match = FAIL") if you want it to auto-fail, or (B) move the
high-entropy pattern out of Step 1 into a new "Step 2: Heuristics (WARNING —
manual review)" section and keep Step 1 strictly CRITICAL patterns; specifically
edit the header text "Step 1: Secrets Scan (CRITICAL — any match = FAIL)" or the
comment on the pattern line "High-entropy strings in config files (WARNING —
manual review)" to make their intended behavior unambiguous.
- Around line 37-39: The current "pattern" value only matches two-part JWTs
(header.payload); update the pattern entry named "pattern" to require the
standard three JWT segments (header.payload.signature) by ensuring a third
dot-separated base64url segment (A-Za-z0-9_- with the same minimum length) is
matched so full 3-part tokens are detected; replace the existing two-part regex
with one that enforces three parts for the "pattern" symbol.

In `@skills/opensource-pipeline/SKILL.md`:
- Around line 116-117: The summary text currently hardcodes "Sanitization:
PASS"—replace that static string with the actual sanitization result variable
and conditional rendering so the final review can show "PASS", "PASS WITH
WARNINGS", or "FAIL" as appropriate; update every occurrence of the literal
"Sanitization: PASS" (including the block around lines 116 and the range
148-163) to use the sanitization status variable (e.g., sanitizationStatus or
equivalent) and format the message dynamically so it reflects the true outcome
instead of always showing PASS.
- Around line 11-28: The SKILL.md is missing the required "How It Works" and
"Examples" sections; update the document by adding a "How It Works" section
after "Commands" that explains the pipeline flow (fork → sanitize → package),
sanitizer behavior, and when each command (`/opensource fork`, `/opensource
verify`, `/opensource package`, `/opensource list`, `/opensource status`) is
used, and add an "Examples" section with at least two concrete usage examples
(e.g., invoking `/opensource fork PROJECT` to perform full pipeline and
`/opensource verify PROJECT` to run sanitizer) demonstrating expected inputs and
outputs; ensure headings match the required pattern ("How It Works", "Examples")
and integrate with existing "When to Activate", "Commands", and "Protocol"
sections.

---

Nitpick comments:
In `@agents/opensource-sanitizer.md`:
- Line 18: Update the "Read-only" rule text (the line containing "**Read-only**
— you never modify files, only report" in agents/opensource-sanitizer.md and the
duplicate at the later occurrence) to explicitly exempt generation of the
SANITIZATION_REPORT.md file; e.g. clarify that modifying source files is
prohibited but producing a sanitization report (SANITIZATION_REPORT.md) is
allowed, so readers cannot interpret the rule as forbidding report generation.
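
The two-segment versus three-segment JWT point raised above can be checked with a short sketch. Assumptions are labeled: `TWO_SEGMENT` is the original pattern quoted in the review, while `THREE_SEGMENT` is one possible fix that keeps the `eyJ` prefix on the header and payload, and it is not necessarily the pattern the PR adopted. The token is built locally from dummy data and is not a real credential. Note that the fully unprefixed pattern given as an example in the review would also match dotted version strings such as `1.2.3`.

```python
import base64
import json
import re

# Original pattern from the review: header and payload only.
TWO_SEGMENT = re.compile(r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+")
# Assumed fix: same prefix checks, plus the signature segment.
THREE_SEGMENT = re.compile(r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def fake_jwt() -> str:
    """Build a structurally valid but meaningless three-segment token."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": "demo"}).encode())
    signature = b64url(b"not-a-real-signature")
    return f"{header}.{payload}.{signature}"


def redact(pattern: re.Pattern, line: str) -> str:
    """Replace every match of pattern in line with a fixed marker."""
    return pattern.sub("<REDACTED>", line)


if __name__ == "__main__":
    line = f"TOKEN={fake_jwt()}"
    print(redact(TWO_SEGMENT, line))    # signature segment is left behind
    print(redact(THREE_SEGMENT, line))  # whole token is replaced
```

The practical difference shows up when stripping: the two-segment pattern still detects the token but leaves the signature in the file, whereas the three-segment pattern removes the token as a unit.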

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d2028beb-fa96-4767-8b84-3314c9406f1f

📥 Commits

Reviewing files that changed from the base of the PR and between e68233c and 99a09d9.

📒 Files selected for processing (4)
  • agents/opensource-forker.md
  • agents/opensource-packager.md
  • agents/opensource-sanitizer.md
  • skills/opensource-pipeline/SKILL.md

Comment on lines +11 to +28
## When to Activate

- User says "open source this project" or "make this public"
- User wants to prepare a private repo for public release
- User needs to strip secrets before pushing to GitHub
- User invokes `/opensource fork`, `/opensource verify`, or `/opensource package`

## Commands

| Command | Action |
|---------|--------|
| `/opensource fork PROJECT` | Full pipeline: fork + sanitize + package |
| `/opensource verify PROJECT` | Run sanitizer on existing repo |
| `/opensource package PROJECT` | Generate CLAUDE.md + setup.sh + README |
| `/opensource list` | Show all staged projects |
| `/opensource status PROJECT` | Show reports for a staged project |

## Protocol

⚠️ Potential issue | 🟠 Major

Add required skill sections: “How It Works” and “Examples”.

This skill doc currently lacks the required section structure for skills/**/*.md.

As per coding guidelines "Skill format must be Markdown with clear sections for When to Use, How It Works, and Examples."

Also applies to: 224-254

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/opensource-pipeline/SKILL.md` around lines 11 - 28, The SKILL.md is
missing the required "How It Works" and "Examples" sections; update the document
by adding a "How It Works" section after "Commands" that explains the pipeline
flow (fork → sanitize → package), sanitizer behavior, and when each command
(`/opensource fork`, `/opensource verify`, `/opensource package`, `/opensource
list`, `/opensource status`) is used, and add an "Examples" section with at
least two concrete usage examples (e.g., invoking `/opensource fork PROJECT` to
perform full pipeline and `/opensource verify PROJECT` to run sanitizer)
demonstrating expected inputs and outputs; ensure headings match the required
pattern ("How It Works", "Examples") and integrate with existing "When to
Activate", "Commands", and "Protocol" sections.


greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR adds a 3-agent open-source release pipeline (opensource-forker → opensource-sanitizer → opensource-packager) chained by an opensource-pipeline skill. The pipeline copies a private project to a staging directory, strips secrets using 20+ regex patterns, independently audits the fork for leaks, and then generates CLAUDE.md, setup.sh, README.md, LICENSE, and CONTRIBUTING.md. Several issues flagged in previous review rounds have been addressed (.claude/ rsync exclusion, GitHub org collection, retry cap on sanitizer failures, generic env pattern threshold raised to 16+).

Key findings:

  • P1 — Packager output is never published: The packager agent writes all documentation files to the staging directory but neither the packager agent nor the skill orchestrator includes a git add && git commit step after Step 5 completes. gh repo create --source=. --push in Step 7 pushes only the forker's initial commit, so CLAUDE.md, setup.sh, README.md, LICENSE, and CONTRIBUTING.md are silently omitted from the GitHub repo. This defeats the primary value of the packager stage.

  • P2 — Git history audit truncated to 20 lines: The sanitizer's Step 6 pipes git log -p through grep | head -20, examining only the first 20 matching lines. For projects with many files, secrets in files that sort late alphabetically are missed entirely.

  • The standalone /opensource package prompt stub and missing {github_org} in the packager's prompt context were flagged in prior review rounds and remain open.
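
The `head -20` truncation described above goes away if every line of the patch output is scanned. The sketch below is illustrative only: `scan_lines` and `scan_history` are hypothetical helpers, and the two regexes are common shapes for AWS access key IDs and GitHub personal access tokens, not the sanitizer's actual pattern list.

```python
import re
import subprocess
from typing import Iterable, Iterator

# Illustrative patterns only; the real sanitizer defines its own list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key ID shape
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access token shape
]


def scan_lines(lines: Iterable[str]) -> Iterator[tuple[int, str]]:
    """Yield (line_number, line) for every matching line, with no cap."""
    for number, line in enumerate(lines, start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            yield number, line.rstrip("\n")


def scan_history(repo_dir: str) -> list[tuple[int, str]]:
    """Scan the full patch history of a repository (requires git on PATH)."""
    result = subprocess.run(
        ["git", "log", "-p", "--all"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        errors="replace",  # tolerate binary or non-UTF-8 content in diffs
        check=True,
    )
    return list(scan_lines(result.stdout.splitlines()))
```

Because the scan consumes the whole output, a match on the fiftieth file is reported the same way as a match on the first.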

Confidence Score: 4/5

Safe to merge with the packager commit gap fixed; the pipeline is well-structured but currently publishes repos missing all generated documentation.

One P1 defect: the packager's output is never committed, so the GitHub publish step omits it entirely. This is a clear, present bug on the primary user path. The P2 git-history truncation is a minor audit gap. All prior P1/P0 concerns from previous rounds have been addressed.

skills/opensource-pipeline/SKILL.md (missing git commit between Step 5 and Step 7) and agents/opensource-sanitizer.md (head -20 truncation in git history audit)

Important Files Changed

| Filename | Overview |
|----------|----------|
| agents/opensource-forker.md | Copies project to staging, strips secrets via 20+ regex patterns, replaces internal refs, generates .env.example and FORK_REPORT.md. Previous concerns addressed. No new critical issues found. |
| agents/opensource-sanitizer.md | Read-only auditor across 6 scan categories. Git history audit uses head -20 which is ineffective for repos with many files. |
| agents/opensource-packager.md | Generates CLAUDE.md, setup.sh, README.md, LICENSE, CONTRIBUTING.md, and GitHub templates. No git commit step; relies on orchestrator which also omits this step (see SKILL.md P1). |
| skills/opensource-pipeline/SKILL.md | Orchestrator skill with one P1 bug: packager-generated files are never committed before GitHub publish, so the published repo is missing all documentation. |
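
One way to close the P1 gap is to stage and commit the packager's output before the publish command runs. The sketch below only shows the ordering. `publish_steps` and `publish` are hypothetical helpers, the commit message is made up, and whether the project prefers a second commit or amending the forker's initial commit (to keep a single-commit history) is a design choice this sketch leaves as a flag.

```python
import subprocess


def publish_steps(org: str, repo: str, amend: bool = False) -> list[list[str]]:
    """Return the commands to run, in order, inside the staging directory."""
    if amend:
        # Fold generated files into the existing initial commit.
        commit = ["git", "commit", "--amend", "--no-edit"]
    else:
        # Hypothetical message for a separate follow-up commit.
        commit = ["git", "commit", "-m", "Add generated docs and setup script"]
    return [
        ["git", "add", "--all"],
        commit,
        ["gh", "repo", "create", f"{org}/{repo}", "--public", "--source=.", "--push"],
    ]


def publish(staging_dir: str, org: str, repo: str, amend: bool = False) -> None:
    """Run each step without a shell and stop at the first failure."""
    for argv in publish_steps(org, repo, amend):
        subprocess.run(argv, cwd=staging_dir, check=True)
```

Passing argument lists to `subprocess.run` without `shell=True` also sidesteps the quoting concern raised earlier in this review, since no shell parses the org or repo name.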

Reviews (3): Last reviewed commit: "fix: flag *.map files as dangerous inste..."

- Collect GitHub org/username in Step 1, use quoted vars in publish command
- Add 3-attempt retry cap on sanitizer FAIL loop
- Use dynamic sanitization verdict in final review output
- Broaden rsync exclusions: .env*, .claude/, .secrets/, secrets/
- Fix JWT regex to match full 3-segment tokens (header.payload.signature)
- Broaden GitHub token regex to cover gho_, ghu_ prefixes
- Fix AWS regex to be case-insensitive, match env var formats
- Tighten generic env regex: increase min length to 16, add non-secret lookaheads
- Separate heuristic WARNING patterns from CRITICAL patterns in sanitizer
- Broaden internal path detection: macOS /Users/, Windows C:\Users\
- Clarify sanitizer is source-read-only (report writing is allowed)

ecc-tools bot commented Mar 31, 2026

Analyzing 5000 commits...


ecc-tools bot commented Mar 31, 2026

Analysis Failed

Not Found - https://docs.github.com/rest/git/refs#get-a-reference

Troubleshooting

| Cause | Resolution |
|-------|------------|
| Large repository | Analysis may timeout on repos with extensive history |
| API rate limits | Wait 15 minutes before retrying |
| Network issues | Queue timeout is 15 minutes; retry may succeed |
| Permissions | Verify app has Contents: Read access |

Retry: /ecc-tools analyze



@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="agents/opensource-sanitizer.md">

<violation number="1" location="agents/opensource-sanitizer.md:88">
P2: Windows home-directory regex only matches a single character username, missing most Windows paths.</violation>
</file>

<file name="agents/opensource-forker.md">

<violation number="1" location="agents/opensource-forker.md:82">
P1: Generic env secret regex was narrowed (length >=16 plus extra exclusions), creating false negatives that can leave real secrets undetected.</violation>
</file>

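
The Windows home-directory finding can be reproduced directly. In this sketch `ORIGINAL` is the pattern quoted in the review, while `WIDER` is a hypothetical replacement that accepts any drive letter and consumes the whole username segment. Used as a search, the original still fires on longer usernames, but its match covers only the first character, which matters if the match is reported or redacted. `WIDER` stops at whitespace, so a username containing spaces would still be captured only partially.

```python
import re

# Pattern quoted in the review: one character after "Users\".
ORIGINAL = re.compile(r"C:\\Users\\[A-Za-z]")
# Assumed alternative: any drive letter, full path segment up to a separator.
WIDER = re.compile(r"[A-Za-z]:\\Users\\[^\\/\s]+")


def first_match(pattern: re.Pattern, text: str) -> "str | None":
    """Return the first matched substring, or None if nothing matches."""
    found = pattern.search(text)
    return found.group() if found else None


if __name__ == "__main__":
    path = r"C:\Users\alice\project\config.json"
    print(first_match(ORIGINAL, path))
    print(first_match(WIDER, path))
```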

key-[A-Za-z0-9]{32}

# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P1: Generic env secret regex was narrowed (length >=16 plus extra exclusions), creating false negatives that can leave real secrets undetected.
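To make the reported false negatives concrete, the previous and current versions of the pattern (both quoted in the file context that follows) can be run against a few sample lines. This is a minimal sketch using Python's `re` module; the sample `.env` lines are invented for illustration and do not come from the PR, and the forker agent itself may apply the pattern with a different tool.

```python
import re

# Both patterns are copied from the diff quoted in this review comment.
OLD = re.compile(r"^[A-Z_]+=((?!true|false|yes|no|on|off|\d+$).{8,})$")
NEW = re.compile(
    r"^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging"
    r"|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$"
)

# Hypothetical .env lines (made up for this sketch).
samples = [
    "DB_PASSWORD=hunter2hunter",                # 13-character value
    "SESSION_SECRET=testing-0123456789abcdef",  # value begins with "test"
    "NODE_ENV=production",                      # ordinary config value
]

for line in samples:
    print(f"{line!r}: old={bool(OLD.match(line))} new={bool(NEW.match(line))}")
```

The first two lines match only the old pattern: one falls under the new 16-character minimum, and the other is skipped because the exclusion words are not anchored, so any value that merely starts with `test` is ignored. The third line shows the false positive that the narrowing was presumably meant to remove.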

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-forker.md, line 82:

<comment>Generic env secret regex was narrowed (length >=16 plus extra exclusions), creating false negatives that can leave real secrets undetected.</comment>

<file context>
@@ -77,8 +78,8 @@ https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+
-# Generic env file secrets
-^[A-Z_]+=((?!true|false|yes|no|on|off|\d+$).{8,})$
+# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
+^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$

</file context>


</details>

# Absolute paths to specific user home directories
pattern: /home/[a-z][a-z0-9_-]*/ (anything other than /home/user/)
pattern: /Users/[A-Za-z][A-Za-z0-9_-]*/ (macOS home directories)
pattern: C:\\Users\\[A-Za-z] (Windows home directories)

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P2: Windows home-directory regex only matches a single character username, missing most Windows paths.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At agents/opensource-sanitizer.md, line 88:

<comment>Windows home-directory regex only matches a single character username, missing most Windows paths.</comment>

<file context>
@@ -78,7 +83,9 @@ severity: CRITICAL
-pattern: /home/[a-z]+/  (anything other than /home/user/)
+pattern: /home/[a-z][a-z0-9_-]*/  (anything other than /home/user/)
+pattern: /Users/[A-Za-z][A-Za-z0-9_-]*/  (macOS home directories)
+pattern: C:\\Users\\[A-Za-z]  (Windows home directories)
 severity: CRITICAL
 
</file context>
Suggested change
pattern: C:\\Users\\[A-Za-z] (Windows home directories)
pattern: C:\\Users\\[A-Za-z][A-Za-z0-9_-]* (Windows home directories)
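The difference between the two patterns depends on how the scanner applies them. The sketch below assumes Python's `re` semantics and a made-up input line; it is not how the sanitizer agent is known to run its checks.

```python
import re

# Original and suggested patterns from this review comment.
ORIGINAL = re.compile(r"C:\\Users\\[A-Za-z]")
SUGGESTED = re.compile(r"C:\\Users\\[A-Za-z][A-Za-z0-9_-]*")

# Hypothetical line from a config file.
line = r"cache_dir = C:\Users\alice\AppData\Local\app"

# Unanchored search: both patterns flag the line, but the original
# captures only the first letter of the username.
print(ORIGINAL.search(line).group(0))   # C:\Users\a
print(SUGGESTED.search(line).group(0))  # C:\Users\alice

# Full-match semantics: the original rejects any multi-character username.
print(ORIGINAL.fullmatch(r"C:\Users\alice"))  # None
```

Under an unanchored search the original pattern still detects the path, so the practical gain from the suggestion is a complete match for reporting and replacement. Neither version covers usernames that contain spaces or dots, since the character class stops at the first such character.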

Agent(
description="Package {PROJECT} for open-source",
subagent_type="opensource-packager",
prompt="""

P1 {github_org} not forwarded to packager — README clone URL will be wrong

The packager agent prompt passes GitHub repo: {github_repo} (question 4, e.g. my-api) but omits {github_org} (question 3, e.g. acme-corp). The packager's README template uses both:

git clone https://github.com/{org}/{repo}.git
cd {repo}
./setup.sh

Without {github_org} in context, the packager agent has no way to fill in {org} and will either leave it as a literal placeholder or hallucinate a value — producing a broken clone URL in the generated README.

Add the org to the packager prompt:

Suggested change
prompt="""
GitHub repo: {github_repo}
GitHub org: {github_org}

Comment on lines +219 to +224

```bash
cat $HOME/opensource-staging/${PROJECT}/SANITIZATION_REPORT.md
cat $HOME/opensource-staging/${PROJECT}/FORK_REPORT.md
```


P1 Standalone /opensource package prompt is a truncated stub

The prompt for the independent package command is left as:

prompt="Package: {resolved_path} ..."

The ... is not a valid prompt — it omits every piece of context the packager agent needs to do its job: License, Description, GitHub org, and GitHub repo name. Running /opensource package my-project would invoke the packager with near-zero context, causing it to guess or produce docs full of unfilled template placeholders.

The command already asks for "License?" and "Description?" above, so those values are available. Expand the prompt to match the full-pipeline variant:

Agent(
  subagent_type="opensource-packager",
  prompt="""Generate open-source packaging for project.

Project: {resolved_path}
License: {license}
Project name: {PROJECT_NAME}
Description: {description}
GitHub org: {github_org}
GitHub repo: {github_repo}

Generate: CLAUDE.md, setup.sh, README.md (or enhance existing), LICENSE, CONTRIBUTING.md, .github/ISSUE_TEMPLATE/.
"""
)
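One way to keep both the standalone and full-pipeline invocations from drifting apart is to build the packager prompt from a single template and refuse to render it when a field is missing. The following is a hypothetical sketch, not part of the skill: `build_packager_prompt`, `REQUIRED_FIELDS`, and the lowercase field names are assumptions made for illustration, and the example values reuse `acme-corp` and `my-api` from the earlier comment.

```python
# Fields the packager prompt needs, per the review comment above.
REQUIRED_FIELDS = (
    "resolved_path", "license", "project_name",
    "description", "github_org", "github_repo",
)

PROMPT_TEMPLATE = """Generate open-source packaging for project.

Project: {resolved_path}
License: {license}
Project name: {project_name}
Description: {description}
GitHub org: {github_org}
GitHub repo: {github_repo}
"""


def build_packager_prompt(**fields: str) -> str:
    """Render the prompt, failing loudly if any required field is absent or empty."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing packager context: {', '.join(missing)}")
    return PROMPT_TEMPLATE.format(**fields)


prompt = build_packager_prompt(
    resolved_path="/tmp/opensource-staging/my-api",  # hypothetical path
    license="MIT",
    project_name="my-api",
    description="Example service",
    github_org="acme-corp",
    github_repo="my-api",
)
print(prompt)
```

Failing before the agent is spawned avoids the situation described above, where the packager receives a stub prompt and fills the gaps by guessing.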


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/opensource-forker.md`:
- Around line 81-83: The role description "Strip all secrets, credentials, and
tokens from source files" conflicts with the regex block containing the comment
"WARNING — manual review, do NOT auto-strip" and the generic env regex; update
agents/opensource-forker.md to remove ambiguity by either 1) deleting the
highlighted regex/pattern from the forker so only agents/opensource-sanitizer.md
flags it, or 2) explicitly changing the forker's behavior around that pattern to
"flag only" (do not automatically strip) and add a clear comment describing that
matches for the regex
^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
will be reported for manual review rather than removed; ensure the
human-readable role string "Strip all secrets, credentials, and tokens from
source files" and the in-file comment "WARNING — manual review, do NOT
auto-strip" are consistent after the change.

In `@agents/opensource-sanitizer.md`:
- Around line 22-64: Update the "Step 1: Secrets Scan (CRITICAL — any match =
FAIL)" section to include generic PASSWORD and TOKEN patterns that mirror the
forker's stripping regex so the sanitizer independently flags them; add patterns
that match keys like
(KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{8,}
(or separate explicit PASSWORD/TOKEN variants) into the CRITICAL list in
agents/opensource-sanitizer.md (the block under "Step 1: Secrets Scan (CRITICAL
— any match = FAIL)") so these generic credential forms are detected before the
heuristic patterns.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6911bac4-5ff8-49ca-860d-c7d0fddf37de

📥 Commits

Reviewing files that changed from the base of the PR and between 99a09d9 and 2173d52.

📒 Files selected for processing (3)
  • agents/opensource-forker.md
  • agents/opensource-sanitizer.md
  • skills/opensource-pipeline/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • skills/opensource-pipeline/SKILL.md

Comment on lines +81 to +83
```
# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
```

⚠️ Potential issue | 🟠 Major

Ambiguous instruction: "do NOT auto-strip" pattern in the forker agent.

The forker's role is to "Strip all secrets, credentials, and tokens from source files" (line 15), but line 82 includes a pattern marked "WARNING — manual review, do NOT auto-strip". This creates workflow ambiguity:

  • If matches should NOT be auto-stripped, why include the pattern in the forker's stripping logic?
  • If matches should only be flagged for review, that's the sanitizer's job (which has a similar WARNING pattern on line 62 in agents/opensource-sanitizer.md)

Impact: The forker might either skip valid secrets (false negative) or incorrectly strip config values (false positive) depending on how this instruction is interpreted.

♻️ Suggested clarification

Option 1: Remove this pattern from the forker entirely (let the sanitizer handle heuristic warnings):

-# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
-^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$

Option 2: If the forker needs to handle this, clarify the action (strip vs flag vs both):

-# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
+# Generic env file secrets (extract to .env.example, mark for manual review in FORK_REPORT.md)
 ^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
+# Generic env file secrets (extract to .env.example, mark for manual review in FORK_REPORT.md)
 ^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-forker.md` around lines 81 - 83, The role description
"Strip all secrets, credentials, and tokens from source files" conflicts with
the regex block containing the comment "WARNING — manual review, do NOT
auto-strip" and the generic env regex; update agents/opensource-forker.md to
remove ambiguity by either 1) deleting the highlighted regex/pattern from the
forker so only agents/opensource-sanitizer.md flags it, or 2) explicitly
changing the forker's behavior around that pattern to "flag only" (do not
automatically strip) and add a clear comment describing that matches for the
regex
^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
will be reported for manual review rather than removed; ensure the
human-readable role string "Strip all secrets, credentials, and tokens from
source files" and the in-file comment "WARNING — manual review, do NOT
auto-strip" are consistent after the change.
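The "flag only" option can be pictured as two tiers of patterns with different actions. The sketch below is an assumption about how such a split might look, not the forker's actual logic: `process_line` and the `<REDACTED>` placeholder are hypothetical, and the strip tier is represented by a single AWS access key pattern taken from the sanitizer's CRITICAL list elsewhere in this thread.

```python
import re

# Tier 1: high-confidence pattern that is rewritten automatically.
STRIP = re.compile(r"AKIA[0-9A-Z]{16}")

# Tier 2: the generic env heuristic quoted above; reported, never rewritten.
FLAG_ONLY = re.compile(
    r"^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging"
    r"|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$"
)


def process_line(line: str) -> tuple[str, bool]:
    """Return (possibly rewritten line, needs_manual_review)."""
    if STRIP.search(line):
        return STRIP.sub("<REDACTED>", line), False
    return line, bool(FLAG_ONLY.match(line))


for sample in (
    "AWS_KEY=AKIAABCDEFGHIJKLMNOP",      # stripped
    "SESSION_KEY=abcdefghijklmnop1234",  # left intact, flagged for review
    "NODE_ENV=production",               # left intact, not flagged
):
    print(process_line(sample))
```

With the action attached to the tier rather than to a comment inside the pattern list, the role description and the per-pattern behavior no longer contradict each other.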

Comment on lines +22 to +64
### Step 1: Secrets Scan (CRITICAL — any match = FAIL)

Scan every text file (excluding `node_modules`, `.git`, `__pycache__`, `*.min.js`, `*.map`, binaries):

```
# API keys
pattern: [A-Za-z0-9_]*(api[_-]?key|apikey|api[_-]?secret)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{16,}

# AWS
pattern: AKIA[0-9A-Z]{16}
pattern: (?i)(aws_secret_access_key|aws_secret)\s*[=:]\s*['"]?[A-Za-z0-9+/=]{20,}

# Database URLs with credentials
pattern: (postgres|mysql|mongodb|redis)://[^:]+:[^@]+@[^\s'"]+

# JWT tokens (3-segment: header.payload.signature)
pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]+

# Private keys
pattern: -----BEGIN\s+(RSA\s+|EC\s+|DSA\s+|OPENSSH\s+)?PRIVATE KEY-----

# GitHub tokens (personal, server, OAuth, user-to-server)
pattern: gh[pousr]_[A-Za-z0-9_]{36,}
pattern: github_pat_[A-Za-z0-9_]{22,}

# Google OAuth secrets
pattern: GOCSPX-[A-Za-z0-9_-]+

# Slack webhooks
pattern: https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+

# SendGrid / Mailgun
pattern: SG\.[A-Za-z0-9_-]{22}\.[A-Za-z0-9_-]{43}
pattern: key-[A-Za-z0-9]{32}
```

#### Heuristic Patterns (WARNING — manual review, does NOT auto-fail)

```
# High-entropy strings in config files
pattern: ^[A-Z_]+=[A-Za-z0-9+/=_-]{32,}$
severity: WARNING (manual review needed)
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify sanitizer can independently catch generic password/token patterns

echo "=== Checking if sanitizer patterns cover forker's broader credential patterns ==="
echo ""
echo "Forker strips: PASSWORD, PASS, TOKEN, SECRET, API_KEY, AUTH (8+ chars)"
echo "Sanitizer CRITICAL patterns:"
grep -A1 "# API keys" agents/opensource-sanitizer.md | head -2
echo ""
echo "Gap: Sanitizer lacks generic PASSWORD|TOKEN patterns in CRITICAL section"
echo "Recommendation: Add CRITICAL patterns for generic credentials"

Repository: affaan-m/everything-claude-code

Length of output: 511


🏁 Script executed:

# Get the forker pattern from line 51 and surrounding context
sed -n '40,60p' agents/opensource-forker.md

Repository: affaan-m/everything-claude-code

Length of output: 760


🏁 Script executed:

# Get full sanitizer patterns to check if PASSWORD/TOKEN are covered elsewhere
cat agents/opensource-sanitizer.md

Repository: affaan-m/everything-claude-code

Length of output: 6161


🏁 Script executed:

# Check if PASSWORD or TOKEN patterns appear anywhere in sanitizer
rg -i "password|token" agents/opensource-sanitizer.md

Repository: affaan-m/everything-claude-code

Length of output: 340


Add PASSWORD and TOKEN patterns to CRITICAL section to match forker's coverage.

The forker agent strips credentials with pattern (KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)[A-Za-z0-9_]*\s*[=:]\s*['\"]?[A-Za-z0-9+/=_-]{8,}, but the sanitizer's CRITICAL section only includes specific patterns (API keys, AWS, GitHub tokens, etc.) and lacks generic PASSWORD or TOKEN patterns. This violates the "never trust the forker" principle — the sanitizer should independently verify all credential types the forker strips.

Add these patterns to the CRITICAL section (before line 58):

# SendGrid / Mailgun
pattern: SG\.[A-Za-z0-9_-]{22}\.[A-Za-z0-9_-]{43}
pattern: key-[A-Za-z0-9]{32}
+
+# Generic passwords and tokens
+pattern: [A-Za-z0-9_]*(password|passwd|pass|token|auth_token)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{8,}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-sanitizer.md` around lines 22 - 64, Update the "Step 1:
Secrets Scan (CRITICAL — any match = FAIL)" section to include generic PASSWORD
and TOKEN patterns that mirror the forker's stripping regex so the sanitizer
independently flags them; add patterns that match keys like
(KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{8,}
(or separate explicit PASSWORD/TOKEN variants) into the CRITICAL list in
agents/opensource-sanitizer.md (the block under "Step 1: Secrets Scan (CRITICAL
— any match = FAIL)") so these generic credential forms are detected before the
heuristic patterns.
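Before adopting the proposed pattern it may be worth checking how it behaves on upper-case variable names, since the keyword alternatives are written in lower case. The sketch below assumes the pattern is applied with Python's default case-sensitive matching; the sanitizer may well run it differently, and the sample lines are invented.

```python
import re

# Pattern proposed in this review comment, copied verbatim.
PROPOSED = r"""[A-Za-z0-9_]*(password|passwd|pass|token|auth_token)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{8,}"""

case_sensitive = re.compile(PROPOSED)
case_insensitive = re.compile(PROPOSED, re.IGNORECASE)

# Hypothetical lines a scanner might encounter.
samples = [
    'db_password = "s3cretvalue99"',
    "DB_PASSWORD=s3cretvalue99",
    "AUTH_TOKEN: abcdefgh12345678",
    "timeout=30",
]

for line in samples:
    print(
        f"{line!r}: sensitive={bool(case_sensitive.search(line))} "
        f"insensitive={bool(case_insensitive.search(line))}"
    )
```

Under these assumptions only the lower-case `db_password` line is caught without the ignore-case flag, while the conventional upper-case `.env` names are missed. Adding an inline `(?i)`, as the existing AWS secret pattern does, would close that gap.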

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ecc-tools bot commented Mar 31, 2026

Analyzing 5000 commits...


ecc-tools bot commented Mar 31, 2026

Analysis Failed

Not Found - https://docs.github.com/rest/git/refs#get-a-reference


Comment on lines +119 to +145
#### Step 5: Run Packager Agent

Spawn the `opensource-packager` agent:

```
Agent(
description="Package {PROJECT} for open-source",
subagent_type="opensource-packager",
prompt="""
Generate open-source packaging for project.

Project: {STAGING_PATH}
License: {chosen_license}
Project name: {PROJECT_NAME}
Description: {description}
GitHub repo: {github_repo}

Generate:
1. CLAUDE.md (commands, architecture, key files)
2. setup.sh (one-command bootstrap, make executable)
3. README.md (or enhance existing)
4. LICENSE
5. CONTRIBUTING.md
6. .github/ISSUE_TEMPLATE/ (bug_report.md, feature_request.md)
"""
)
```

P1 Packager files never committed — publish pushes an empty documentation repo

The packager (Step 5) writes CLAUDE.md, setup.sh, README.md, LICENSE, CONTRIBUTING.md, and .github/ISSUE_TEMPLATE/ into the staging directory, but there is no git add / git commit step anywhere between the packager completing and Step 7's gh repo create --source=. --push.

The gh repo create --source=. --push command pushes only already-committed content. The forker's initial commit (Step 6 of opensource-forker.md) only includes the sanitized source files — it runs before the packager even starts. All of the packager's output is left as untracked files in the working tree and is silently omitted from the GitHub push.

Result: the published repo gets the raw sanitized source with no CLAUDE.md, no setup.sh, no LICENSE, no CONTRIBUTING.md — exactly the files that make the project "immediately usable with Claude Code."

A commit step is needed between Step 5 and Step 7, e.g. added to the end of Step 5 in the skill:

# After packager agent completes:
cd "{STAGING_PATH}"
git add -A
git commit -m "Add open-source documentation and setup files

Generated by opensource-packager: CLAUDE.md, setup.sh, README.md,
LICENSE, CONTRIBUTING.md, and GitHub issue templates."

Alternatively, add this git commit step to the end of the packager agent's own workflow.
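A guard of this kind could also be scripted so that the publish step refuses to run while the staging tree has pending files. The following is a minimal sketch that shells out to `git`; the helper names `uncommitted_paths` and `commit_packager_output` are hypothetical, and it assumes `git` is on the PATH with a committer identity configured.

```python
import subprocess


def uncommitted_paths(repo_dir: str) -> list[str]:
    """Return paths that `git status --porcelain` reports as changed or untracked."""
    out = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # Each porcelain line is two status characters, a space, then the path.
    return [line[3:] for line in out.splitlines() if line]


def commit_packager_output(repo_dir: str) -> None:
    """Stage and commit everything the packager wrote, if anything is pending."""
    if not uncommitted_paths(repo_dir):
        return
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    subprocess.run(
        ["git", "commit", "-m", "Add open-source documentation and setup files"],
        cwd=repo_dir, check=True,
    )
```

Calling `commit_packager_output(staging_path)` after the packager finishes, and asserting that `uncommitted_paths(staging_path)` is empty before publishing, would catch the omission described above regardless of which agent ends up owning the commit.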


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
agents/opensource-sanitizer.md (1)

22-56: ⚠️ Potential issue | 🟠 Major

Add PASSWORD and TOKEN patterns to CRITICAL section for independent verification.

The sanitizer lacks explicit PASSWORD and TOKEN credential patterns that the forker strips (forker line 51: (KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)). The sanitizer should independently verify all credential types without trusting the forker's work, per the "never trust the forker" principle (line 10).

Impact: The sanitizer may miss generic passwords or tokens that slipped through or were incorrectly handled by the forker, violating the defense-in-depth model.

🛡️ Proposed fix

Add these patterns before the Heuristic Patterns section:

 # SendGrid / Mailgun
 pattern: SG\.[A-Za-z0-9_-]{22}\.[A-Za-z0-9_-]{43}
 pattern: key-[A-Za-z0-9]{32}
+
+# Generic passwords and tokens
+pattern: [A-Za-z0-9_]*(password|passwd|pass|token|auth_token)[A-Za-z0-9_]*\s*[=:]\s*['"]?[A-Za-z0-9+/=_-]{8,}

Heuristic Patterns (WARNING — manual review, does NOT auto-fail)

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @agents/opensource-sanitizer.md around lines 22 - 56, The CRITICAL "Step 1:
Secrets Scan" section is missing explicit PASSWORD and TOKEN patterns; add regex
checks for generic password and token credentials (e.g., patterns matching
common substrings like PASSWORD, PASS, TOKEN combined with separators and long
alphanumerics) into the CRITICAL block (the same block that contains API keys,
AKIA, ghp/github_pat, etc.) so the sanitizer independently detects those items
regardless of the forker's (KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)
stripping logic; insert these new PASSWORD and TOKEN patterns before the
"Heuristic Patterns" section to enforce fail-on-match behavior.


</details>

</blockquote></details>
<details>
<summary>agents/opensource-forker.md (1)</summary><blockquote>

`81-83`: _⚠️ Potential issue_ | _🟠 Major_

**Resolve the "do NOT auto-strip" instruction ambiguity.**

The forker's role explicitly states "Strip all secrets, credentials, and tokens from source files" (line 15), yet this pattern includes a comment "WARNING — manual review, do NOT auto-strip". This creates operational ambiguity that could lead to inconsistent secret handling.

**Impact:** The forker agent may either skip legitimate secrets (false negative) or incorrectly strip configuration values (false positive) depending on how this instruction is interpreted during execution.

**Resolution options:**
1. **Remove from forker** (preferred): Delete this pattern entirely and let the sanitizer agent handle heuristic warnings in its separate WARNING category
2. **Clarify behavior**: Change the instruction to explicitly describe what action the forker should take (e.g., "flag in FORK_REPORT.md but do not strip")



<details>
<summary>♻️ Preferred solution: Remove from forker</summary>

```diff
-# Generic env file secrets (WARNING — manual review, do NOT auto-strip)
-^[A-Z_]+=((?!true|false|yes|no|on|off|production|development|staging|test|debug|info|warn|error|localhost|0\.0\.0\.0|127\.0\.0\.1|\d+$).{16,})$
```

Let the sanitizer agent (which has a similar heuristic pattern in its WARNING section) handle this detection independently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-forker.md` around lines 81 - 83, The "# Generic env file
secrets (WARNING — manual review, do NOT auto-strip)" pattern in
agents/opensource-forker.md creates ambiguity with the forker's "Strip all
secrets" role; remove this entire pattern block from the forker file so the
sanitizer agent handles heuristic WARNING detections instead, or if you prefer
not to remove it, change the comment to an explicit directive (e.g., "flag in
FORK_REPORT.md; do not auto-strip") so the forker's behavior is
unambiguous—update the text around the header and the regex accordingly.
🧹 Nitpick comments (1)
agents/opensource-forker.md (1)

60-61: Consider adding minimum segment lengths to the JWT pattern.

The current 3-segment JWT pattern uses + quantifiers that could match very short base64url strings. Adding minimum lengths (e.g., {10,}) would reduce false positives and better match real JWTs.

♻️ Suggested enhancement
-# JWT tokens (3-segment: header.payload.signature)
-eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
+# JWT tokens (3-segment: header.payload.signature)
+eyJ[A-Za-z0-9_-]{10,}\.eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-forker.md` around lines 60 - 61, The JWT regex pattern
"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+" is too permissive; update
it to require minimum segment lengths to reduce false positives (e.g., replace
the "+" quantifiers with bounded repeats like "{10,}" or another sensible
minimum) for the header, payload and signature segments so the pattern only
matches likely real JWTs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/opensource-sanitizer.md`:
- Around line 37-38: The JWT regex in agents/opensource-sanitizer.md uses
inconsistent segment length quantifiers: the first two segments use {20,} while
the signature uses +; update the pattern variable (pattern) so the third segment
uses the same minimum length quantifier (e.g., change \.[A-Za-z0-9_-]+ to
\.[A-Za-z0-9_-]{20,}) or choose a consistent, justified length for all three
segments to avoid matching malformed tokens.

---

Duplicate comments:
In `@agents/opensource-forker.md`:
- Around line 81-83: The "# Generic env file secrets (WARNING — manual review,
do NOT auto-strip)" pattern in agents/opensource-forker.md creates ambiguity
with the forker's "Strip all secrets" role; remove this entire pattern block
from the forker file so the sanitizer agent handles heuristic WARNING detections
instead, or if you prefer not to remove it, change the comment to an explicit
directive (e.g., "flag in FORK_REPORT.md; do not auto-strip") so the forker's
behavior is unambiguous—update the text around the header and the regex
accordingly.

In `@agents/opensource-sanitizer.md`:
- Around line 22-56: The CRITICAL "Step 1: Secrets Scan" section is missing
explicit PASSWORD and TOKEN patterns; add regex checks for generic password and
token credentials (e.g., patterns matching common substrings like PASSWORD,
PASS, TOKEN combined with separators and long alphanumerics) into the CRITICAL
block (the same block that contains API keys, AKIA, ghp/github_pat, etc.) so the
sanitizer independently detects those items regardless of the forker's
`(KEY|TOKEN|SECRET|PASSWORD|PASS|API_KEY|AUTH)` stripping logic; insert these
new PASSWORD and TOKEN patterns before the "Heuristic Patterns" section to
enforce fail-on-match behavior.

---

Nitpick comments:
In `@agents/opensource-forker.md`:
- Around line 60-61: The JWT regex pattern
"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+" is too permissive; update
it to require minimum segment lengths to reduce false positives (e.g., replace
the "+" quantifiers with bounded repeats like "{10,}" or another sensible
minimum) for the header, payload and signature segments so the pattern only
matches likely real JWTs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 65aa99d5-c9c1-4a75-bf19-3dea0dcba338

📥 Commits

Reviewing files that changed from the base of the PR and between 2173d52 and 90ccb96.

📒 Files selected for processing (2)
  • agents/opensource-forker.md
  • agents/opensource-sanitizer.md

Comment on lines +37 to +38
# JWT tokens (3-segment: header.payload.signature)
pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]+

⚠️ Potential issue | 🟡 Minor

Inconsistent JWT pattern segment length requirements.

The JWT pattern specifies {20,} for the first two segments but uses + (1 or more) for the third segment. This inconsistency could match malformed tokens.

🔧 Suggested fix
-# JWT tokens (3-segment: header.payload.signature)
-pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]+
+# JWT tokens (3-segment: header.payload.signature)
+pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# JWT tokens (3-segment: header.payload.signature)
pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]+
# JWT tokens (3-segment: header.payload.signature)
pattern: eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/opensource-sanitizer.md` around lines 37 - 38, The JWT regex in
agents/opensource-sanitizer.md uses inconsistent segment length quantifiers: the
first two segments use {20,} while the signature uses +; update the pattern
variable (pattern) so the third segment uses the same minimum length quantifier
(e.g., change \.[A-Za-z0-9_-]+ to \.[A-Za-z0-9_-]{20,}) or choose a consistent,
justified length for all three segments to avoid matching malformed tokens.
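The practical effect of the quantifier change can be checked on synthetic tokens. The sketch below uses Python's `re` and placeholder strings built only to satisfy the character classes; they are not valid JWTs. The 43-character signature length corresponds to an unpadded base64url encoding of a 32-byte HMAC-SHA256 digest.

```python
import re

# Current and suggested patterns from this review comment.
JWT_CURRENT = re.compile(
    r"eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]+"
)
JWT_SUGGESTED = re.compile(
    r"eyJ[A-Za-z0-9_-]{20,}\.eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}"
)

# Synthetic segments (placeholders, not real encoded JSON).
header = "eyJ" + "a" * 20
payload = "eyJ" + "b" * 20
short_sig = f"{header}.{payload}.x"            # one-character signature
long_sig = f"{header}.{payload}." + "c" * 43   # HS256-length signature

for name, token in (("short", short_sig), ("long", long_sig)):
    print(
        f"{name}: current={bool(JWT_CURRENT.search(token))} "
        f"suggested={bool(JWT_SUGGESTED.search(token))}"
    )
```

Both patterns match the full-length token; only the current one matches the token with a truncated signature. Whether that is a defect depends on the goal: a truncated token still exposes its header and payload, so a scanner that prefers false positives over false negatives might reasonably keep the looser third segment.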

@affaan-m affaan-m merged commit 477d23a into affaan-m:main Mar 31, 2026
4 checks passed
2 participants