Download GitHub pull request data and convert it to a markdown document that mirrors what a user sees on the GitHub pull request page, making it available both as a command-line tool and a library. The output should be fully available offline in a self-contained folder with all referenced assets.
The application must be available as a command-line tool that can be invoked with:
gh-load-pull-request <pr-url> [options]
Required Options:
- `-o, --output <dir>` - Output directory for PR data (creates `pr-<number>/` subfolder)
- `-t, --token <token>` - GitHub personal access token (optional for public PRs)
- `--format <format>` - Output format: `markdown` (default) or `json`
- `--download-images` - Download embedded images (default: true)
- `--include-reviews` - Include PR reviews (default: true)
- `-v, --verbose` - Enable verbose logging
- `-h, --help` - Show help
- `--version` - Show version number
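The option list above could be wired up roughly as follows. This is a minimal sketch that assumes the built-in `parseArgs` helper from `node:util` is used; the function name `parseCliArgs` and the shape of the returned object are hypothetical, and the source does not say how a default-true flag such as `--download-images` is switched off, so that case is left out.

```javascript
import { parseArgs } from 'node:util';

// Option table mirroring the flags listed above (defaults taken from the list).
const cliOptions = {
  output: { type: 'string', short: 'o' },
  token: { type: 'string', short: 't' },
  format: { type: 'string', default: 'markdown' },
  'download-images': { type: 'boolean', default: true },
  'include-reviews': { type: 'boolean', default: true },
  verbose: { type: 'boolean', short: 'v', default: false },
  help: { type: 'boolean', short: 'h', default: false },
  version: { type: 'boolean', default: false },
};

// Hypothetical helper: turns an argv slice into a flat config object.
function parseCliArgs(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: cliOptions,
    allowPositionals: true, // the <pr-url> argument is positional
  });
  if (!['markdown', 'json'].includes(values.format)) {
    throw new Error(`Unsupported format: ${values.format}`);
  }
  return { prUrl: positionals[0], ...values };
}
```

In a real entry point this would typically be called as `parseCliArgs(process.argv.slice(2))`.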
The application must be importable as an ES module:
import {
  loadPullRequest,
  parsePrUrl,
  convertToMarkdown,
} from 'gh-load-pull-request';

// Fetch PR data
const data = await loadPullRequest({
  owner: 'facebook',
  repo: 'react',
  prNumber: 28000,
  token: 'ghp_xxx', // optional for public PRs
  includeReviews: true,
});

// Convert to markdown
const { markdown, downloadedImages } = await convertToMarkdown(data, {
  downloadImagesFlag: true,
  imagesDir: './pr-28000-images',
});

The markdown output must include all data visible on a GitHub PR page, in the following order (matching the GitHub UI):
- Title - PR title as heading
- Metadata Block
  - Author with link
  - State (open/closed/merged)
  - Created/Updated/Merged/Closed dates
  - Base and head branch information
  - Labels (if any)
  - Assignees (if any)
  - Reviewers with approval status
  - Milestone (if any)
  - Linked issues (if any)
  - Stats (+additions/-deletions, changed files count)
- Description - Full PR body/description
- Conversation Timeline - Chronological list of:
  - Comments (issue comments)
  - Reviews with their comments
  - Review comments on specific lines
  - Commit events
  - Label changes
  - Milestone changes
  - Assignment changes
  - Branch events (merge, delete)
  - Cross-references from other PRs/issues
- Commits - List of all commits with SHA, message, author, and link
- Files Changed - List of changed files with status icon and stats
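The chronological Conversation Timeline can be illustrated with a small merge step. The sketch below assumes the fetched data keeps the GitHub REST API field names (`created_at` on issue comments, `submitted_at` on reviews, `commit.author.date` on commits); the function name `buildTimeline` and the `{ kind, at, item }` entry shape are hypothetical and only cover three of the event types listed above.

```javascript
// Hypothetical helper: merges several event lists into one list sorted by time.
function buildTimeline({ comments = [], reviews = [], commits = [] } = {}) {
  const entries = [
    ...comments.map((c) => ({ kind: 'comment', at: c.created_at, item: c })),
    ...reviews.map((r) => ({ kind: 'review', at: r.submitted_at, item: r })),
    ...commits.map((c) => ({ kind: 'commit', at: c.commit?.author?.date, item: c })),
  ];
  return entries
    .filter((e) => e.at) // drop entries without a timestamp (e.g. pending reviews)
    .sort((a, b) => Date.parse(a.at) - Date.parse(b.at)); // oldest first
}
```

Other event types (label, milestone, assignment, and branch events) could be added to the same list in the same way, as long as each entry carries a timestamp.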
When saving to a directory, the output must be fully self-contained:
pr-<number>/
  pr-<number>.md      # Main markdown file
  pr-<number>.json    # JSON metadata file
  images/             # Downloaded images
    image-1.png
    image-2.jpg
    ...
  diffs/              # File diffs (optional, for full offline view)
    file-1.diff
    file-2.diff
    ...
All image URLs in markdown must be rewritten to use relative paths pointing to the images/ folder.
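One way to picture the URL rewriting is a pass over the markdown that swaps each known remote URL for its local path. This is a rough sketch, not the tool's actual implementation: the function name `rewriteImageUrls` and the `Map` from remote URL to relative path are assumptions, and the regular expressions only cover the common `![alt](url)` and `<img src="...">` forms.

```javascript
// Hypothetical helper: replaces remote image URLs with relative local paths.
// `urlToLocalPath` is a Map such as 'https://…/a.png' -> 'images/image-1.png'.
function rewriteImageUrls(markdown, urlToLocalPath) {
  // Markdown image syntax: ![alt](url) or ![alt](url "title")
  const mdImage = /(!\[[^\]]*\]\()([^)\s]+)((?:\s+"[^"]*")?\))/g;
  // HTML image syntax: <img ... src="url" ...>
  const htmlImage = /(<img\b[^>]*?\bsrc=["'])([^"']+)(["'])/gi;

  // URLs that were not downloaded are left untouched.
  const swap = (match, before, url, after) =>
    urlToLocalPath.has(url) ? `${before}${urlToLocalPath.get(url)}${after}` : match;

  return markdown.replace(mdImage, swap).replace(htmlImage, swap);
}
```

Leaving unknown URLs unchanged means a failed download degrades to a remote link instead of a broken local path.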
- Automatically detect and download images from:
  - PR description
  - Comments
  - Reviews
  - Review comments
- Validate downloaded images using magic bytes
- Skip invalid/corrupted downloads with warning
- Support common formats: PNG, JPG, GIF, WebP, SVG, BMP, ICO
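Magic-byte validation means checking the first bytes of a download against known file signatures instead of trusting the URL or extension. The following is a minimal sketch under that assumption; the function name `detectImageFormat` is hypothetical, and SVG, being text, is detected by looking for an `<svg` tag near the start rather than by a binary signature.

```javascript
// Hypothetical helper: returns a format name for a Uint8Array, or null if unrecognized.
function detectImageFormat(bytes) {
  const startsWith = (sig, offset = 0) => sig.every((b, i) => bytes[offset + i] === b);
  const ascii = (s) => [...s].map((ch) => ch.charCodeAt(0));

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpg';
  if (startsWith(ascii('GIF87a')) || startsWith(ascii('GIF89a'))) return 'gif';
  // WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
  if (startsWith(ascii('RIFF')) && startsWith(ascii('WEBP'), 8)) return 'webp';
  if (startsWith(ascii('BM'))) return 'bmp';
  if (startsWith([0x00, 0x00, 0x01, 0x00])) return 'ico';

  // SVG has no binary signature; look for an <svg> tag in the first kilobyte.
  const head = new TextDecoder().decode(bytes.slice(0, 1024));
  if (/<svg[\s>]/i.test(head)) return 'svg';

  return null; // e.g. an HTML error page saved under an image URL
}
```

A `null` result would correspond to the "skip with warning" case in the list above.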
Support multiple authentication methods (in priority order):
- `--token` command line argument
- `GITHUB_TOKEN` environment variable
- GitHub CLI (`gh auth token`) if installed
- No authentication (public PRs only)
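The priority order above can be expressed as a short fallback chain. This is a sketch only: the function name `resolveToken`, its parameters, and the returned `{ token, source }` shape are hypothetical. The GitHub CLI lookup is injected as a callback so the example stays free of side effects.

```javascript
// Hypothetical helper: picks the first available token following the priority list.
// `runGhAuthToken` is an injected function that returns the output of `gh auth token`
// (or throws if the GitHub CLI is missing or not logged in).
function resolveToken({ cliToken, env = {}, runGhAuthToken = () => null } = {}) {
  if (cliToken) return { token: cliToken, source: '--token' };
  if (env.GITHUB_TOKEN) return { token: env.GITHUB_TOKEN, source: 'GITHUB_TOKEN' };
  try {
    const fromGh = runGhAuthToken()?.trim();
    if (fromGh) return { token: fromGh, source: 'gh auth token' };
  } catch {
    // gh is not installed or not authenticated: fall through to anonymous access
  }
  return { token: null, source: 'none' }; // public PRs only
}
```

In a real CLI, `env` would be `process.env` and `runGhAuthToken` could wrap `execFileSync('gh', ['auth', 'token'], { encoding: 'utf8' })` from `node:child_process`.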
- Unit Tests - Test core functions:
  - URL parsing for all supported formats
  - Markdown conversion
  - Image extraction and validation
  - JSON output format
- Integration Tests - Test with mock GitHub API responses
- E2E Tests - Test against real GitHub PRs:
  - Simple PR with minimal content
  - Complex PR with images, reviews, and many comments
  - PR with code review comments on specific lines
  - Merged PR with full history
  - Cross-platform testing (Linux, macOS, Windows)
The following PRs should be used for comprehensive testing:
- Simple PR: `link-foundation/gh-load-pull-request#2` - Minimal content
- Complex PR with reviews: `facebook/react#28000` - Multiple reviewers, comments
- Large PR with many files: Find a suitable example with 10+ changed files
- PR with images: Find an example with embedded screenshots
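To make the URL-parsing unit tests concrete, here is a small sketch of a parser for two reference styles. The source does not enumerate the supported formats, so the two forms handled here (a full `github.com/<owner>/<repo>/pull/<number>` URL and the `owner/repo#number` shorthand used in the list above) are assumptions, and `parsePrUrlSketch` is a hypothetical stand-in, not the library's `parsePrUrl`.

```javascript
// Hypothetical sketch: returns { owner, repo, prNumber } or null if the input is not recognized.
function parsePrUrlSketch(input) {
  const text = String(input).trim();
  // Full URL, optionally followed by a sub-page, query, or fragment (e.g. /files).
  const urlMatch = text.match(
    /^(?:https?:\/\/)?github\.com\/([^/\s]+)\/([^/\s]+)\/pull\/(\d+)(?:[/?#].*)?$/
  );
  // Shorthand: owner/repo#123
  const shortMatch = text.match(/^([^/\s]+)\/([^/\s#]+)#(\d+)$/);
  const m = urlMatch ?? shortMatch;
  if (!m) return null;
  return { owner: m[1], repo: m[2], prNumber: Number(m[3]) };
}
```

The returned field names follow the `loadPullRequest` call shown earlier (`owner`, `repo`, `prNumber`), so the result could be spread directly into that call.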
- Run linting (ESLint)
- Run formatting check (Prettier)
- Run all tests on Ubuntu, macOS, and Windows
- Enforce changeset for version tracking
- Performance - Handle PRs with hundreds of comments efficiently
- Error Handling - Graceful degradation when:
  - Rate limits are hit
  - Images fail to download
  - Authentication fails
- Compatibility - Work with Bun >= 1.2.0 runtime
- Output Quality - Markdown should be:
  - Valid GitHub-flavored markdown
  - Readable without rendering (plain text)
  - Properly formatted with consistent spacing
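For the rate-limit case under Error Handling, one possible approach is to derive a wait time from the response headers GitHub sends (`retry-after`, `x-ratelimit-remaining`, `x-ratelimit-reset`). The sketch below is an assumption about how graceful degradation might be done, not a description of the tool's behavior; the function name `rateLimitDelayMs` is hypothetical.

```javascript
// Hypothetical helper: returns how long to wait (in ms) before retrying,
// or null if the response does not look like a rate-limit response.
// `headers` is any object with a get(name) method, such as the Headers of a fetch Response.
function rateLimitDelayMs(status, headers, nowMs = Date.now()) {
  if (status !== 403 && status !== 429) return null;

  // An explicit Retry-After (in seconds) takes precedence.
  const retryAfter = Number(headers.get('retry-after'));
  if (Number.isFinite(retryAfter) && retryAfter > 0) return retryAfter * 1000;

  // Otherwise wait until the primary rate limit resets (UTC epoch seconds).
  if (headers.get('x-ratelimit-remaining') === '0') {
    const resetMs = Number(headers.get('x-ratelimit-reset')) * 1000;
    if (Number.isFinite(resetMs)) return Math.max(0, resetMs - nowMs);
  }

  return null; // e.g. a 403 caused by failed authentication, not by rate limiting
}
```

A caller could sleep for the returned duration and retry, or report the reset time and exit cleanly when the wait would be too long.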