Highlights
- Pluggable fetchers for GitHub, Wikipedia, YouTube, ArXiv, StackOverflow, HackerNews, RSS, package registries, docs sites, and Twitter
- Batch fetching for concurrent multi-URL requests
- Content-focused extraction with boilerplate stripping and structured metadata
- Conditional fetching with ETag and If-Modified-Since support
- Improved HTML-to-Markdown conversion quality
- Content quality signals: word count, redirect chain, paywall detection
- Optional Web Bot Authentication support
- Hardened outbound fetch policy with proxy isolation and SSRF mitigations
- Live integration test suite behind feature flag
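
The conditional fetching highlight builds on standard HTTP validators: a stored `ETag` is sent back as `If-None-Match`, a stored `Last-Modified` value is sent back as `If-Modified-Since`, and a `304 Not Modified` response means the cached copy can be reused. The sketch below illustrates that flow with the standard library only. `CachedValidators`, `conditional_headers`, `Outcome`, and `classify` are hypothetical names chosen for illustration and are not fetchkit's API.

```rust
/// Validators remembered from a previous response.
/// Hypothetical type for illustration, not part of fetchkit.
pub struct CachedValidators {
    pub etag: Option<String>,
    pub last_modified: Option<String>,
}

/// Build the conditional request headers for the stored validators.
pub fn conditional_headers(v: &CachedValidators) -> Vec<(&'static str, String)> {
    let mut headers = Vec::new();
    if let Some(etag) = &v.etag {
        // The ETag value is echoed back verbatim, quotes included.
        headers.push(("If-None-Match", etag.clone()));
    }
    if let Some(last_modified) = &v.last_modified {
        // The Last-Modified value is echoed back as an HTTP date string.
        headers.push(("If-Modified-Since", last_modified.clone()));
    }
    headers
}

/// How a caller might interpret the response status.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// 304: the cached body is still valid and no new body was sent.
    NotModified,
    /// 2xx: a fresh body was returned and the validators should be updated.
    Fresh,
    /// Anything else is left to the caller.
    Other(u16),
}

pub fn classify(status: u16) -> Outcome {
    match status {
        304 => Outcome::NotModified,
        200..=299 => Outcome::Fresh,
        other => Outcome::Other(other),
    }
}
```

A caller would attach the returned pairs to the outgoing request and refetch unconditionally when no validators are stored.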
Breaking Changes
- Ambient proxy environment variables are now ignored by default; configure any required proxy explicitly
What's Changed
- test(fetchers): add live integration tests behind feature flag (#84)
- chore: periodic maintenance — deps update and spec sync (#83)
- feat(fetch): add content quality signals (word_count, redirect_chain, is_paywall) (#82)
- feat(client): add batch_fetch for concurrent multi-URL fetching (#81)
- feat(fetch): add conditional fetching with ETag and If-Modified-Since (#80)
- feat(convert): improve HTML-to-Markdown conversion quality (#79)
- feat(convert): add content-focused extraction with boilerplate stripping (#78)
- feat(convert): add structured metadata extraction from HTML pages (#77)
- feat(fetchers): add RSSFeedFetcher for structured feed parsing (#70)
- feat(fetchers): add HackerNewsFetcher for structured thread extraction (#69)
- feat(fetchers): add ArXivFetcher for paper metadata and abstract (#68)
- feat(fetchers): add YouTubeFetcher for video metadata extraction (#67)
- feat(fetchers): add WikipediaFetcher for article extraction (#66)
- feat(fetchers): add PackageRegistryFetcher for PyPI, crates.io, npm (#65)
- feat(fetchers): add StackOverflowFetcher for clean Q&A extraction (#64)
- feat(fetchers): add DocsSiteFetcher with llms.txt support (#63)
- feat(fetchers): add GitHubCodeFetcher for source file fetching (#62)
- feat(fetchers): add GitHubIssueFetcher for structured issue/PR fetching (#61)
- feat: add process-issues skill for e2e GitHub issue resolution (#60)
- feat: add optional Web Bot Authentication support (#49)
- feat(fetchers): add TwitterFetcher for tweet URL handling (#47)
- feat: skip HTML conversion for non-HTML responses (#48)
- chore(deps): update workspace dependencies and fix flaky proxy tests (#46)
- feat(toolkit): align fetchkit with toolkit library contract (#45)
- fix(security): harden outbound fetch policy (#43)
- docs: clarify latest-main requirement for worktrees (#44)
- fix(security): isolate proxy env in shared runtimes (#42)
- fix(security): block IPv4-compatible and 6to4 IPv6 addresses in SSRF protection (#41)
- fix(security): sanitize reqwest error messages to prevent hostname leakage (#40)
- fix: resolve threat model issues (#37)
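
The SSRF fix in #41 concerns IPv6 addresses that carry an IPv4 address inside them. An IPv4-compatible address (`::a.b.c.d`) or a 6to4 address (`2002:AABB:CCDD::/48`) can point at a private or loopback IPv4 host while passing a check that only inspects plain IPv4 addresses. The following is a minimal sketch of that kind of check using only `std::net`. It is not fetchkit's implementation: the function names are hypothetical and the blocked ranges are an assumed, non-exhaustive policy.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Extract an IPv4 address embedded in an IPv6 address, if any.
/// Hypothetical helper for illustration.
pub fn embedded_ipv4(addr: Ipv6Addr) -> Option<Ipv4Addr> {
    let s = addr.segments();
    let o = addr.octets();
    let tail = Ipv4Addr::new(o[12], o[13], o[14], o[15]);

    // IPv4-mapped: ::ffff:a.b.c.d
    if s[..5].iter().all(|&x| x == 0) && s[5] == 0xffff {
        return Some(tail);
    }
    // IPv4-compatible (deprecated): ::a.b.c.d, excluding :: and ::1
    if s[..6].iter().all(|&x| x == 0) && !addr.is_unspecified() && !addr.is_loopback() {
        return Some(tail);
    }
    // 6to4: 2002:AABB:CCDD::/48 embeds a.b.c.d in bits 16..48
    if s[0] == 0x2002 {
        return Some(Ipv4Addr::new(o[2], o[3], o[4], o[5]));
    }
    None
}

/// Assumed policy: reject IPv4 ranges that should never be fetched.
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback()
        || ip.is_private()
        || ip.is_link_local()
        || ip.is_unspecified()
        || ip.is_broadcast()
}

/// Reject internal targets, including those hidden inside IPv6 forms.
pub fn is_blocked(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
                || embedded_ipv4(v6).map_or(false, is_blocked_v4)
        }
    }
}
```

Such a check is only meaningful when applied to the address the connection actually uses, after DNS resolution, rather than to the hostname string alone.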
Full Changelog: v0.1.3...v0.2.0