160 changes: 160 additions & 0 deletions .github/workflows/grug-variant-diff.yaml
@@ -0,0 +1,160 @@
name: Grug Variant Diff

on:
  pull_request:
    branches: [main]
    paths:
      - experiments/grug/**
      - scripts/grug_dir_diff.py
      - scripts/grug_variant_diff_ci.py
      - .github/workflows/grug-variant-diff.yaml

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

permissions:
  contents: write
  pull-requests: write

jobs:
  grug-variant-diff:
    runs-on: ubuntu-latest
    timeout-minutes: 15

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Generate variant diff reports
        run: |
          python -m scripts.grug_variant_diff_ci \
            --base-sha "${{ github.event.pull_request.base.sha }}" \
            --head-sha "${{ github.event.pull_request.head.sha }}" \
            --output-dir grug-diff-report \
            --manifest-path grug-diff-report/manifest.json

      - name: Read manifest metadata
        id: manifest
        run: |
          python - <<'PY'
          import json
          import os
          from pathlib import Path

          manifest = json.loads(Path("grug-diff-report/manifest.json").read_text(encoding="utf-8"))
          report_count = int(manifest.get("report_count", 0))
          has_reports = "true" if report_count > 0 else "false"

          with Path(os.environ["GITHUB_OUTPUT"]).open("a", encoding="utf-8") as output:
              output.write(f"report_count={report_count}\n")
              output.write(f"has_reports={has_reports}\n")
          PY

      - name: Upload report artifact
        id: artifact
        if: steps.manifest.outputs.has_reports == 'true'
        uses: actions/upload-artifact@v4
        with:
          name: grug-variant-diff-pr-${{ github.event.pull_request.number }}
          path: grug-diff-report
          retention-days: 14

      - name: Publish rendered report to gh-pages
        id: publish
        if: steps.manifest.outputs.has_reports == 'true' && github.event.pull_request.head.repo.full_name == github.repository
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: grug-diff-report
          destination_dir: grug-diffs/pr-${{ github.event.pull_request.number }}
          keep_files: true

      - name: Post PR comment with report links
        if: github.event.pull_request.head.repo.full_name == github.repository
        uses: actions/github-script@v7
        env:
          ARTIFACT_URL: ${{ steps.artifact.outputs.artifact-url }}
          PAGES_URL: https://${{ github.repository_owner }}.github.io/${{ github.event.repository.name }}/grug-diffs/pr-${{ github.event.pull_request.number }}
          PAGES_PUBLISHED: ${{ steps.publish.outcome == 'success' }}
        with:
          script: |
            const fs = require('fs');
            const marker = '<!-- grug-variant-diff-report -->';
            const manifest = JSON.parse(fs.readFileSync('grug-diff-report/manifest.json', 'utf8'));
            const reportCount = Number(manifest.report_count || 0);
            const pagesUrl = process.env.PAGES_URL;
            const pagesPublished = process.env.PAGES_PUBLISHED === 'true';
            const artifactUrl = process.env.ARTIFACT_URL || '';

            const { owner, repo } = context.repo;
            const issue_number = context.issue.number;
            const comments = await github.paginate(github.rest.issues.listComments, {
              owner,
              repo,
              issue_number,
              per_page: 100,
            });

            const existing = comments.find((comment) => comment.body && comment.body.includes(marker));

            if (reportCount === 0) {
              if (existing) {
                await github.rest.issues.deleteComment({
                  owner,
                  repo,
                  comment_id: existing.id,
                });
              }
              return;
            }

            let body = '🤖 Grug variant diff report\n';
            body += `${marker}\n\n`;
            if (!pagesPublished) {
              body += 'Rendered GitHub Pages links were not published for this run.\n\n';
            }

            body += '| New Variant | Closest Existing Variant | Distance Score | Diff |\n';
            body += '|---|---|---:|---|\n';

            for (const report of manifest.reports) {
              const variant = report.variant;
              const closest = report.closest_variant;
              const score = report.distance_score;
              const link = pagesPublished
                ? `[Open](${pagesUrl}/${variant}/index.html)`
                : `artifact:\`${report.report_relpath}\``;
              body += `| \`${variant}\` | \`${closest}\` | ${score} | ${link} |\n`;
            }

            if (artifactUrl) {
              body += `\nArtifact fallback: [Download report bundle](${artifactUrl})\n`;
            }

            if (!pagesPublished) {
              body += '\nFor fork PRs, this workflow usually cannot publish to `gh-pages`; artifact link remains available.';
            }

            if (existing) {
              await github.rest.issues.updateComment({
                owner,
                repo,
                comment_id: existing.id,
                body,
              });
            } else {
              await github.rest.issues.createComment({
                owner,
                repo,
                issue_number,
                body,
              });
            }
17 changes: 17 additions & 0 deletions docs/recipes/change_grug.md
@@ -41,9 +41,26 @@ Keep each pass scoped to one bucket:
Update `docs/reports/grug-archive.md` with:

- path
- origin (`base`, `moe`, or another source variant)
- commit SHA (when known)
- purpose
- status (`active`, `superseded`, `deleted`)
- diff link (prefer the CI-posted PR comment link; fall back to the local report path)

For PRs that add a new `experiments/grug/<variant>/`, CI automatically posts a
comment linking to a visual diff. Copy that link into the archive entry.

If you need a local fallback, generate the diff report manually:

```bash
uv run python scripts/grug_dir_diff.py \
experiments/grug/base \
experiments/grug/<variant> \
--out /tmp/grug-diff
```

Then include a link to the report in the archive entry so reviewers can inspect
template-copy changes quickly.
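
As a rough illustration of the fields listed above, an archive entry could be modeled as below. This is a minimal sketch, not a documented format: the `ArchiveEntry` class, the bullet layout it emits, and the example values (`my_variant`, the purpose text) are all hypothetical, since the actual layout of `docs/reports/grug-archive.md` is not shown here.

```python
from dataclasses import dataclass
from typing import Optional

# Status values taken from the field list above.
VALID_STATUSES = ("active", "superseded", "deleted")


@dataclass
class ArchiveEntry:
    """Hypothetical container for one archive entry (assumed layout)."""

    path: str
    origin: str  # `base`, `moe`, or another source variant
    purpose: str
    status: str
    diff_link: str  # CI comment link preferred, local report path as fallback
    commit_sha: Optional[str] = None  # only recorded when known

    def __post_init__(self) -> None:
        if self.status not in VALID_STATUSES:
            raise ValueError(f"status must be one of {VALID_STATUSES}, got {self.status!r}")

    def to_markdown(self) -> str:
        """Render the entry as a bullet list (assumed layout, adjust to the real file)."""
        sha = self.commit_sha or "unknown"
        return "\n".join(
            [
                f"- path: `{self.path}`",
                f"- origin: {self.origin}",
                f"- commit SHA: {sha}",
                f"- purpose: {self.purpose}",
                f"- status: {self.status}",
                f"- diff link: {self.diff_link}",
            ]
        )


if __name__ == "__main__":
    entry = ArchiveEntry(
        path="experiments/grug/my_variant",  # hypothetical variant name
        origin="base",
        purpose="example purpose",  # placeholder text
        status="active",
        diff_link="/tmp/grug-diff/index.html",  # local fallback path
    )
    print(entry.to_markdown())
```

Rejecting unknown status values up front keeps the archive consistent with the three statuses named above.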

### 4) Upstream to base if it wins

Expand Down
26 changes: 26 additions & 0 deletions experiments/grug/README.md
@@ -40,6 +40,32 @@ uv run lib/marin/src/marin/run/ray_run.py \
-- python experiments/grug/base/launch.py
```

## Visual diff for template variants

When template-copying `experiments/grug/base/` to a new variant, use the HTML diff tool to review changes:

```bash
uv run python scripts/grug_dir_diff.py \
experiments/grug/base \
experiments/grug/<variant> \
--out /tmp/grug-diff
```

What it does:

- Recursively compares both directories (default extensions: `.py`, `.md`)
- Builds one `/tmp/grug-diff/index.html` report with top-level summary + file table
- Renders inline side-by-side diffs on that same page (changed, added, and removed files by default)

PRs that add a new `experiments/grug/<variant>/` also get this diff posted automatically by CI.
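
The comparison described above can be approximated with the standard library. The following is a minimal sketch under stated assumptions, not the implementation of `scripts/grug_dir_diff.py`: the function names `classify_files` and `render_side_by_side` are hypothetical, and the real tool may classify or render files differently.

```python
import difflib
from pathlib import Path


def classify_files(base: Path, variant: Path, extensions=(".py", ".md")) -> dict:
    """Bucket relative file paths into added/removed/changed/unchanged."""

    def collect(root: Path) -> dict:
        # Map each matching file to its path relative to the directory root.
        return {
            path.relative_to(root).as_posix(): path
            for path in root.rglob("*")
            if path.is_file() and path.suffix in extensions
        }

    base_files = collect(base)
    variant_files = collect(variant)
    result = {"added": [], "removed": [], "changed": [], "unchanged": []}

    for rel in sorted(base_files.keys() | variant_files.keys()):
        if rel not in base_files:
            result["added"].append(rel)
        elif rel not in variant_files:
            result["removed"].append(rel)
        elif base_files[rel].read_bytes() != variant_files[rel].read_bytes():
            result["changed"].append(rel)
        else:
            result["unchanged"].append(rel)
    return result


def render_side_by_side(base_file: Path, variant_file: Path, context_lines: int = 5) -> str:
    """Return a standalone HTML page with a side-by-side diff of two files."""
    return difflib.HtmlDiff().make_file(
        base_file.read_text(encoding="utf-8").splitlines(),
        variant_file.read_text(encoding="utf-8").splitlines(),
        fromdesc=str(base_file),
        todesc=str(variant_file),
        context=True,  # show only regions around edits
        numlines=context_lines,
    )
```

Here `extensions` mirrors the default `.py`/`.md` filter and `context_lines` plays the role of `--context-lines`; both mappings are assumptions made for illustration.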

Useful flags:

- `--extensions .py,.md`: restrict file types
- `--all-files`: include everything instead of extension filtering
- `--show-unchanged`: include unchanged files in the report too
- `--context-lines 5`: tune context around edits
- `--no-open`: generate the report without launching a browser

## Common edit points

- Architecture changes: `experiments/grug/base/model.py`