This repo is intended to be runnable locally and easy for coding agents to work in.
Always make sure you are working on top of latest main from remote. Especially in detached worktrees, fetch origin/main and rebase or branch from it before editing.
Style: Telegraph. Drop filler/grammar. Min tokens (global AGENTS + replies).
Critical Thinking: Fix root cause (not band-aid). Unsure: read more code; if still stuck, ask w/ short options. Unrecognized changes: assume other agent; keep going; focus your changes. If it causes issues, stop + ask user. Leave breadcrumb notes in thread.
Attribution: NEVER add links to Claude sessions in PR body or commits. Never attribute commits or merge commits to coding agents; always use the real user.
Before committing, configure git user from environment variables:
git config user.name "$GIT_USER_NAME"
git config user.email "$GIT_USER_EMAIL"
GIT_USER_NAME and GIT_USER_EMAIL must be set in the session.
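Because both variables must be set, a guard that fails early can prevent commits with an empty identity. The following is a minimal sketch; `configure_git_identity` is a hypothetical helper name, not something this repo defines.

```shell
# Hypothetical helper (not part of this repo): configure the git identity
# from the environment, refusing to continue if either variable is missing.
configure_git_identity() {
    if [ -z "${GIT_USER_NAME:-}" ] || [ -z "${GIT_USER_EMAIL:-}" ]; then
        echo "GIT_USER_NAME and GIT_USER_EMAIL must be set" >&2
        return 1
    fi
    # Same two commands as above, run only when both values are present.
    git config user.name "$GIT_USER_NAME" &&
    git config user.email "$GIT_USER_EMAIL"
}
```

Call `configure_git_identity` once before the first commit in a session; a non-zero exit status means the environment is incomplete.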
- Keep decisions as comments on top of the file. Only important decisions that could not be inferred from code.
- Code should be easily testable, smoke testable, runnable in local dev env.
- Prefer small, incremental PR-sized changes with a runnable state at each step.
- Avoid adding dependencies with non-permissive licenses. If a dependency is non-permissive or unclear, stop and ask the repo owner.
AI-friendly web content fetching tool designed for LLM consumption. Rust library with CLI, MCP server, and Python bindings.
Key capabilities:
- HTTP fetching (GET/HEAD) with streaming support
- HTML-to-Markdown and HTML-to-Text conversion optimized for LLMs
- Binary content detection (returns metadata only)
- Timeout handling with partial content on timeout
- URL filtering via allow/block lists
- MCP server for AI tool integration
specs/ folder contains feature specifications outlining requirements for specific features and components. New code should comply with these specifications or propose changes to them.
Available specs:
- specs/initial.md - WebFetch tool specification (types, behavior, conversions, error handling)
- specs/fetchers.md - Pluggable fetcher system for URL-specific handling
- specs/release-process.md - Agent-driven release and publish workflow
- specs/maintenance.md - Periodic maintenance checklist (deps, docs, spec-code alignment)
- specs/threat-model.md - Security threat model (SSRF, network, input validation, DoS)
- specs/bot-auth.md - Web Bot Authentication (draft-meunier-web-bot-auth-architecture)
Specification format: Abstract and Requirements sections.
Implement → test → /ship. The /ship command (.claude/commands/ship.md) runs a 10-phase workflow: pre-flight, test coverage, code simplification, security review, artifact updates, smoke testing, quality gates, push+PR, CI wait+merge, post-merge report.
Phases 2–6 (tests, simplification, security, artifacts, smoke) are the quality core — never skip.
When asked to "fix and ship": implement fix first, then run /ship.
.claude/skills/ contains development skills following the Agent Skills Specification.
Available skills:
- /ship - 10-phase shipping workflow (.claude/commands/ship.md)
- /processing-issues - Batch-process GitHub issues: triage, implement, ship via individual PRs (.claude/commands/processing-issues.md)
- /process-issues - Resolve all open GitHub issues e2e; one issue = one shipped PR (.claude/skills/process-issues/SKILL.md)
.agents/ mirrors .claude/ via symlinks for agent-agnostic access:
- .agents/commands/ → .claude/commands/
- .agents/skills/ → .claude/skills/
docs/ contains public-facing user documentation. This documentation is intended for end users and operators of the system, not for internal development reference.
When making changes that affect user-facing behavior or operations, update the relevant docs in this folder.
Requirements:
- Rust stable toolchain (rustup recommended)
- cargo for building and testing
Quick start:
cargo build --workspace --exclude fetchkit-python # Build default Rust artifacts
cargo test --workspace # Run all tests
cargo run -p fetchkit-cli -- --help # Run CLI
Note: fetchkit-python currently requires a separate Python link environment and is not part of the default release-build smoke path.
crates/
├── fetchkit/ # Core library - types, fetch logic, HTML conversion
├── fetchkit-cli/ # CLI binary and MCP server
└── fetchkit-python/ # Python bindings (PyO3)
specs/ # Feature specifications
- Crate names: fetchkit, fetchkit-cli, fetchkit-python
- Types: PascalCase (WebFetchRequest, WebFetchResponse)
- Functions: snake_case (fetch, html_to_markdown)
- Constants: SCREAMING_SNAKE_CASE
- CI is implemented using GitHub Actions.
- Jobs: lint, test, build, doc, examples, check
- The check job is the branch-protection gate and must stay green
- All jobs must pass before merging
- Clippy runs with -D warnings (warnings are errors)
- Doc builds must not have warnings
See specs/release-process.md for the release contract.
Quick summary:
- Human asks agent: "Create release v0.2.0"
- Agent updates CHANGELOG.md (with Highlights + What's Changed), Cargo.toml version, creates PR
- Human reviews and merges PR to main
- CI creates GitHub Release via softprops/action-gh-release (release.yml)
- GitHub Release publication triggers publish.yml
- CI publishes fetchkit, then fetchkit-cli, to crates.io
Workflows:
- .github/workflows/release.yml - Creates GitHub Release on merge or manual dispatch
- .github/workflows/publish.yml - Publishes to crates.io on GitHub Release or manual dispatch
Requirements:
- CARGO_REGISTRY_TOKEN secret must be configured in repo settings
Note: fetchkit-python is not published to crates.io (publish = false). Python release automation is not configured in this repo yet.
Use Doppler for all secret-backed commands in cloud agents.
./scripts/init-cloud-env.sh
Disable incremental compilation in cloud (saves ~3 GB, useless for single builds):
export CARGO_INCREMENTAL=0
All cloud secrets are in Doppler (GITHUB_TOKEN). Project: everruns-dev, config: dev.
For GitHub CLI, map token explicitly:
doppler run -- bash -lc 'GH_TOKEN="$GITHUB_TOKEN" gh auth status'
Quick check:
doppler run -- env | rg 'GITHUB_TOKEN'
doppler run -- bash -lc 'GH_TOKEN="$GITHUB_TOKEN" gh auth status'
Before creating a pull request, ensure:
- Branch rebased: Rebase on latest main to avoid merge conflicts. In detached worktrees, first create or switch to a topic branch that is based on origin/main.
  git fetch origin main && git rebase origin/main
- Formatting: Run formatter and fix any issues.
  cargo fmt --all
- Linting: Run clippy and fix all warnings.
  cargo clippy --workspace --all-targets -- -D warnings
- Tests: Run all tests and ensure they pass.
  cargo test --workspace
- Documentation: Ensure docs build without warnings.
  RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
- Release build smoke: Ensure the publishable Rust crates build in release mode.
  cargo build --workspace --exclude fetchkit-python --release
- CI green: All CI checks must pass before merging.
- PR comments resolved: No unaddressed review comments in PR.
- Specs: If changes affect system behavior, update specs in specs/.
- Docs: If changes affect usage or configuration, update public docs in docs/.
CI will fail if formatting, linting, tests, release build smoke, or doc build fail. Always run these locally before pushing.
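The local checks can be chained so that the first failure stops the run. This is a sketch only: `run_step`, `pre_pr_checks`, and the `DRY_RUN` variable are hypothetical names introduced here, and the rebase step is left out because it rewrites history. The cargo commands are the ones listed in the checklist.

```shell
# Hypothetical helpers (not part of this repo): run the local pre-PR checks
# in order, stopping at the first failure.
# Set DRY_RUN=1 to print the commands without executing them.

run_step() {
    # Log the command, then execute it unless DRY_RUN=1.
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        return 0
    fi
    "$@"
}

pre_pr_checks() {
    run_step cargo fmt --all &&
    run_step cargo clippy --workspace --all-targets -- -D warnings &&
    run_step cargo test --workspace &&
    run_step env RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps &&
    run_step cargo build --workspace --exclude fetchkit-python --release
}
```

Running `DRY_RUN=1 pre_pr_checks` lists the five commands. Running `pre_pr_checks` executes them, including `cargo fmt --all`, which rewrites files in place.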
Follow Conventional Commits for all commit messages:
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code style (formatting, semicolons, etc.)
- refactor: Code refactoring without feature/fix
- perf: Performance improvements
- test: Adding or updating tests
- chore: Build process, dependencies, tooling
- ci: CI configuration changes
Examples:
feat(api): add agent versioning endpoint
fix(workflow): handle timeout in run execution
docs: update API documentation
refactor(db): simplify connection pooling
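For a quick check without Node tooling, the header line can be matched against a regular expression. This is a rough sketch: `is_conventional_header` is a hypothetical helper, it accepts only the types listed above, and it checks the first line only. It does not replace commitlint.

```shell
# Hypothetical helper (not part of this repo): succeed when the first line of
# a commit message looks like "<type>[(scope)][!]: <description>".
# The type list mirrors the types documented above; adjust it if that list changes.
is_conventional_header() {
    printf '%s\n' "$1" | head -n 1 |
        grep -Eq '^(feat|fix|docs|style|refactor|perf|test|chore|ci)(\([^()]+\))?!?: .+$'
}
```

For example, `is_conventional_header "fix(workflow): handle timeout in run execution"` succeeds, while `is_conventional_header "fixed stuff"` fails.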
Validation (optional):
# Validate a commit message
echo "feat: add new feature" | npx commitlint
# Validate last commit
npx commitlint --from HEAD~1 --to HEAD
PR titles should follow Conventional Commits format. Use the PR template (.github/pull_request_template.md) for descriptions.
PR Body Template:
## What
Clear description of the change.
## Why
Problem or motivation.
## How
High-level approach.
## Risk
- Low / Medium / High
- What can break
### Checklist
- [ ] Unit tests pass
- [ ] Smoke tests pass
- [ ] Documentation is updated
- [ ] Specs are up to date and not in conflict
- [ ] ... other check list items
# Run all tests
cargo test --workspace
# Run tests with output
cargo test --workspace -- --nocapture
# Run specific test
cargo test --workspace test_name
# Test CLI directly
cargo run -p fetchkit-cli -- --url https://example.com --as-markdown
# Test MCP server
cargo run -p fetchkit-cli -- mcp
Tests use wiremock for HTTP mocking (no real external network calls). See specs/initial.md for test requirements.