diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 0000000..dbbceed --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,101 @@ +name: Bug Report +description: Report a bug or unexpected behavior +labels: ["bug", "triage"] +body: + - type: markdown + attributes: + value: | + Thank you for taking the time to report a bug! + Please fill out the sections below to help us understand and reproduce the issue. + + - type: textarea + id: description + attributes: + label: Bug Description + description: A clear and concise description of the bug + placeholder: Describe what went wrong... + validations: + required: true + + - type: textarea + id: reproduction + attributes: + label: Steps to Reproduce + description: Steps to reproduce the behavior + placeholder: | + 1. Parse this feed: `...` + 2. Access this field: `...` + 3. See error + validations: + required: true + + - type: textarea + id: expected + attributes: + label: Expected Behavior + description: What did you expect to happen? + placeholder: Describe what you expected... + validations: + required: true + + - type: textarea + id: feed-sample + attributes: + label: Feed Sample + description: | + If applicable, provide a minimal feed sample that demonstrates the issue. + Please remove any sensitive information. + render: xml + placeholder: | + + + + Example + + + + validations: + required: false + + - type: dropdown + id: platform + attributes: + label: Platform + description: Which platform/binding are you using? + options: + - Rust (feedparser-rs-core) + - Node.js (feedparser-rs npm) + - Python (feedparser-rs-py) + validations: + required: true + + - type: input + id: version + attributes: + label: Version + description: Which version of feedparser-rs are you using? 
+ placeholder: "0.1.0" + validations: + required: true + + - type: textarea + id: environment + attributes: + label: Environment + description: | + Please provide relevant environment information: + placeholder: | + - OS: macOS 14.0 + - Rust: 1.88.0 + - Node.js: 20.10.0 (if applicable) + - Python: 3.12 (if applicable) + validations: + required: false + + - type: textarea + id: additional + attributes: + label: Additional Context + description: Any other context about the problem (logs, screenshots, related issues) + validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000..feef4ed --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,8 @@ +blank_issues_enabled: false +contact_links: + - name: Question or Discussion + url: https://github.com/bug-ops/feedparser-rs/discussions + about: Ask questions or start a discussion about feedparser-rs + - name: Security Vulnerability + url: https://github.com/bug-ops/feedparser-rs/security/advisories/new + about: Report a security vulnerability (please do not use public issues) diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 0000000..81a0b5d --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,87 @@ +name: Feature Request +description: Suggest a new feature or enhancement +labels: ["enhancement"] +body: + - type: markdown + attributes: + value: | + Thank you for suggesting a feature! + Please describe your idea clearly so we can evaluate and prioritize it. + + - type: textarea + id: problem + attributes: + label: Problem Statement + description: | + Is your feature request related to a problem? Please describe. + A clear description of what the problem is. + placeholder: I'm always frustrated when... 
+ validations: + required: true + + - type: textarea + id: solution + attributes: + label: Proposed Solution + description: A clear and concise description of what you want to happen + placeholder: It would be great if feedparser-rs could... + validations: + required: true + + - type: textarea + id: alternatives + attributes: + label: Alternatives Considered + description: Any alternative solutions or workarounds you've considered + placeholder: I've tried... but it doesn't work because... + validations: + required: false + + - type: dropdown + id: platform + attributes: + label: Affected Platform + description: Which platform(s) would benefit from this feature? + multiple: true + options: + - Rust (feedparser-rs-core) + - Node.js (feedparser-rs npm) + - Python (feedparser-rs-py) + - All platforms + validations: + required: true + + - type: dropdown + id: feedparser-compat + attributes: + label: feedparser Compatibility + description: Is this feature present in Python feedparser? + options: + - Yes, feedparser has this feature + - No, this is a new feature + - Not sure + validations: + required: true + + - type: textarea + id: api + attributes: + label: Proposed API + description: | + If applicable, describe or show what the API might look like + render: rust + placeholder: | + // Example API usage + let feed = parse_with_options(xml, Options { + new_feature: true, + })?; + validations: + required: false + + - type: textarea + id: additional + attributes: + label: Additional Context + description: Any other context, links, or screenshots about the feature request + validations: + required: false diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..cc2b00f --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,50 @@ +## Summary + + + +## Motivation + + + +Fixes # + +## Changes + + + +- + +## Type of Change + + + +- [ ] Bug fix (non-breaking change that fixes an issue) +- [ ] New feature 
(non-breaking change that adds functionality) +- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) +- [ ] Documentation update +- [ ] Refactoring (no functional changes) +- [ ] Performance improvement +- [ ] Test addition or update + +## Test Plan + + + +- [ ] Ran `cargo make test` (all tests pass) +- [ ] Ran `cargo make lint` (no warnings) +- [ ] Added new tests for the changes +- [ ] Tested manually with: + +## Checklist + +- [ ] My code follows the project's style guidelines +- [ ] I have performed a self-review of my code +- [ ] I have commented my code where necessary +- [ ] I have updated the documentation accordingly +- [ ] My changes generate no new warnings +- [ ] I have added tests that prove my fix/feature works +- [ ] New and existing tests pass locally + +## Additional Notes + + diff --git a/CHANGELOG.md b/CHANGELOG.md index dfb240b..dd21ce1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,30 +8,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added -- Node.js bindings via napi-rs -- npm package `feedparser-rs` -- Criterion benchmarks for Rust -- CI/CD pipeline with GitHub Actions -- Cross-platform builds (Linux, macOS, Windows) -- TypeScript definitions -- Comprehensive Node.js test suite -- Benchmark comparison infrastructure -- Python feedparser benchmark baseline +- HTTP bindings for URL fetching with `http` feature +- `parse_url` and `parse_url_with_limits` functions +- Conditional GET support (ETag, Last-Modified) for bandwidth-efficient caching +- Automatic compression handling (gzip, deflate, brotli) +- Node.js `fetchAndParse` async function for URL fetching +- Podcast namespace support (iTunes and Podcast 2.0) +- CONTRIBUTING.md guide +- Improved README with GitHub callouts and better structure ### Changed -- N/A - -### Deprecated -- N/A - -### Removed -- N/A - -### Fixed -- N/A - -### Security -- N/A +- Default features now include 
`http` for URL fetching support +- Migrated to cargo-make for task automation ## [0.1.0] - 2025-12-14 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..6a6662a --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,223 @@ +# Contributing to feedparser-rs + +Thank you for your interest in contributing to feedparser-rs! This document provides guidelines and instructions for contributing. + +## Code of Conduct + +This project follows the [Rust Code of Conduct](https://www.rust-lang.org/policies/code-of-conduct). Please be respectful and constructive in all interactions. + +## Getting Started + +### Prerequisites + +- Rust 1.88.0 or later (edition 2024) +- [cargo-make](https://github.com/sagiegurari/cargo-make) for task automation +- Node.js 18+ (for Node.js bindings development) +- Python 3.9+ (for Python bindings development) + +### Setup + +1. Fork and clone the repository: + + ```bash + git clone https://github.com/YOUR_USERNAME/feedparser-rs.git + cd feedparser-rs + ``` + +2. Install cargo-make: + + ```bash + cargo install cargo-make + ``` + +3. Verify the setup: + + ```bash + cargo make ci-all + ``` + +## Development Workflow + +### Branch Naming + +Use descriptive branch names: + +- `feat/feature-name` — New features +- `fix/issue-description` — Bug fixes +- `docs/what-changed` — Documentation updates +- `refactor/what-changed` — Code refactoring +- `test/what-tested` — Test additions + +### Making Changes + +1. Create a new branch from `main`: + + ```bash + git checkout -b feat/your-feature + ``` + +2. Make your changes, following the [code style guidelines](#code-style). + +3. Run checks before committing: + + ```bash + cargo make pre-commit + ``` + +4. 
Commit your changes with a clear message: + + ```bash + git commit -m "feat: add support for XYZ" + ``` + +### Commit Message Format + +Follow [Conventional Commits](https://www.conventionalcommits.org/): + +- `feat:` — New feature +- `fix:` — Bug fix +- `docs:` — Documentation only +- `style:` — Formatting, no code change +- `refactor:` — Code change that neither fixes a bug nor adds a feature +- `test:` — Adding or updating tests +- `chore:` — Maintenance tasks + +Examples: + +``` +feat: add JSON Feed 1.1 support +fix: handle malformed RSS dates correctly +docs: update installation instructions +test: add tests for Atom parsing edge cases +``` + +## Code Style + +### Rust + +- Run `cargo make fmt` before committing +- All code must pass `cargo make clippy` without warnings +- Follow [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/) +- Use `thiserror` for error types +- Prefer `&str` over `String` in function parameters where possible +- Document all public APIs with doc comments + +### Documentation + +- All public items must have documentation +- Include examples in doc comments where helpful +- Keep documentation in English + +### Testing + +- Write tests for new functionality +- Ensure all tests pass: `cargo make test` +- Add test fixtures to `tests/fixtures/` for new feed formats + +## Pull Request Process + +1. **Update documentation** — If your change affects the API, update relevant docs. + +2. **Add tests** — All new features and bug fixes should have tests. + +3. **Run all checks**: + + ```bash + cargo make pre-push + ``` + +4. **Create the pull request** with: + - Clear title following commit message format + - Description of what changed and why + - Reference to related issues (e.g., "Fixes #123") + +5. **Address review feedback** — Respond to comments and make requested changes. 
+ +### PR Checklist + +- [ ] Code follows the project style guidelines +- [ ] All tests pass (`cargo make test`) +- [ ] Linting passes (`cargo make lint`) +- [ ] Documentation updated if needed +- [ ] CHANGELOG.md updated for notable changes + +## Testing + +### Running Tests + +```bash +# All tests +cargo make test + +# Rust tests only +cargo make test-rust + +# With coverage +cargo make coverage +``` + +### Test Fixtures + +Test feeds are located in `tests/fixtures/`. When adding support for new feed quirks: + +1. Add a minimal test fixture demonstrating the issue +2. Add a test that uses the fixture +3. Implement the fix +4. Verify the test passes + +## Reporting Issues + +### Bug Reports + +Include: + +- feedparser-rs version +- Rust/Python/Node.js version +- Minimal reproduction case +- Expected vs actual behavior +- Sample feed (if applicable, sanitized of sensitive data) + +### Feature Requests + +- Check existing issues first +- Describe the use case +- Explain why existing features don't solve the problem + +## Architecture Overview + +``` +feedparser-rs/ +├── crates/ +│ ├── feedparser-rs-core/ # Core Rust parser +│ ├── feedparser-rs-node/ # Node.js bindings (napi-rs) +│ └── feedparser-rs-py/ # Python bindings (PyO3) +├── tests/ +│ └── fixtures/ # Test feed files +└── benchmarks/ # Performance benchmarks +``` + +### Core Crate Structure + +- `src/lib.rs` — Public API +- `src/parser/` — Format-specific parsers (RSS, Atom, JSON Feed) +- `src/feed.rs` — Data structures +- `src/date.rs` — Date parsing +- `src/sanitize.rs` — HTML sanitization +- `src/encoding.rs` — Character encoding detection + +## Release Process + +Releases are automated via GitHub Actions. 
Maintainers tag releases following [Semantic Versioning](https://semver.org/): + +- MAJOR: Breaking API changes +- MINOR: New features, MSRV increases +- PATCH: Bug fixes, documentation + +## Getting Help + +- Open an issue for bugs or feature requests +- Start a discussion for questions + +## License + +By contributing, you agree that your contributions will be licensed under the same terms as the project (MIT OR Apache-2.0). diff --git a/README.md b/README.md index 651c54b..fe0fe6e 100644 --- a/README.md +++ b/README.md @@ -1,45 +1,38 @@ # feedparser-rs -High-performance RSS/Atom/JSON Feed parser for Rust, with Python and Node.js bindings. +[![Crates.io](https://img.shields.io/crates/v/feedparser-rs-core)](https://crates.io/crates/feedparser-rs-core) +[![docs.rs](https://img.shields.io/docsrs/feedparser-rs-core)](https://docs.rs/feedparser-rs-core) +[![CI](https://img.shields.io/github/actions/workflow/status/bug-ops/feedparser-rs/ci.yml?branch=main)](https://github.com/bug-ops/feedparser-rs/actions) +[![npm](https://img.shields.io/npm/v/feedparser-rs)](https://www.npmjs.com/package/feedparser-rs) +[![License](https://img.shields.io/crates/l/feedparser-rs-core)](LICENSE-MIT) -## Overview +High-performance RSS/Atom/JSON Feed parser for Rust, with Python and Node.js bindings. A drop-in replacement for Python's `feedparser` library with 10-100x performance improvement. -**feedparser-rs** is a drop-in replacement for Python's `feedparser` library, written in Rust for 10-100x performance improvement. 
+## Features -### Features - -- Parse RSS 0.9x, 1.0, 2.0 -- Parse Atom 0.3, 1.0 -- Parse JSON Feed 1.0, 1.1 -- Tolerant parsing with bozo flag pattern -- 100% API compatibility with Python feedparser -- Python bindings via PyO3 -- Node.js bindings via napi-rs - -## Status - -🚧 **Work in Progress** - Phase 4 (Node.js bindings + CI/CD) complete - -[![CI](https://github.com/bug-ops/feedparser-rs/workflows/CI/badge.svg)](https://github.com/bug-ops/feedparser-rs/actions) -[![Crates.io](https://img.shields.io/crates/v/feedparser-rs-core.svg)](https://crates.io/crates/feedparser-rs-core) -[![npm](https://img.shields.io/npm/v/feedparser-rs.svg)](https://www.npmjs.com/package/feedparser-rs) -[![Documentation](https://docs.rs/feedparser-rs-core/badge.svg)](https://docs.rs/feedparser-rs-core) -[![License](https://img.shields.io/badge/license-MIT%20OR%20Apache--2.0-blue.svg)](LICENSE-MIT) +- **Multi-format support** — RSS 0.9x, 1.0, 2.0 / Atom 0.3, 1.0 / JSON Feed 1.0, 1.1 +- **Tolerant parsing** — Handles malformed feeds with `bozo` flag pattern (like Python feedparser) +- **HTTP fetching** — Built-in support for fetching feeds from URLs with compression +- **Multi-language bindings** — Native Python (PyO3) and Node.js (napi-rs) bindings +- **feedparser-compatible API** — 100% API compatibility with Python feedparser ## Installation ### Rust +```bash +cargo add feedparser-rs-core +``` + +Or add to your `Cargo.toml`: + ```toml [dependencies] feedparser-rs-core = "0.1" ``` -### Python (Coming in Phase 4) - -```bash -pip install feedparser-rs -``` +> [!IMPORTANT] +> Requires Rust 1.88.0 or later (edition 2024). 
### Node.js @@ -51,181 +44,168 @@ yarn add feedparser-rs pnpm add feedparser-rs ``` +### Python + +```bash +pip install feedparser-rs +``` + ## Usage -### Rust +### Rust Usage ```rust use feedparser_rs_core::parse; -let xml = r#" -<?xml version="1.0"?> -<rss version="2.0"> -<channel> -<title>Example Feed</title> -</channel> -</rss> -"#; - -let feed = parse(xml.as_bytes())?; -println!("Version: {}", feed.version.as_str()); -println!("Title: {}", feed.feed.title.unwrap()); +fn main() -> Result<(), Box<dyn std::error::Error>> { + let xml = r#" + <?xml version="1.0"?> + <rss version="2.0"> + <channel> + <title>Example Feed</title> + <link>https://example.com</link> + <item> + <title>First Post</title> + <link>https://example.com/post/1</link> + </item> + </channel> + </rss> + "#; + + let feed = parse(xml.as_bytes())?; + + println!("Version: {}", feed.version.as_str()); // "rss20" + println!("Title: {:?}", feed.feed.title); + println!("Entries: {}", feed.entries.len()); + + for entry in &feed.entries { + println!(" - {:?}", entry.title); + } + + Ok(()) +} ``` -### Python +#### Fetching from URL -```python -import feedparser_rs +```rust +use feedparser_rs_core::fetch_and_parse; -d = feedparser_rs.parse(b'...') -print(d.version) # 'rss20' -print(d.feed.title) +fn main() -> Result<(), Box<dyn std::error::Error>> { + let feed = fetch_and_parse("https://example.com/feed.xml")?; + println!("Fetched {} entries", feed.entries.len()); + Ok(()) +} ``` -### Node.js +> [!TIP] +> Use `fetch_and_parse` for URL fetching with automatic compression handling (gzip, deflate, brotli). + +### Node.js Usage ```javascript -import { parse } from 'feedparser-rs'; +import { parse, fetchAndParse } from 'feedparser-rs'; +// Parse from string const feed = parse('...'); console.log(feed.version); // 'rss20' console.log(feed.feed.title); console.log(feed.entries.length); + +// Fetch from URL +const remoteFeed = await fetchAndParse('https://example.com/feed.xml'); ``` -See [crates/feedparser-rs-node/README.md](crates/feedparser-rs-node/README.md) for full Node.js API documentation. +See [Node.js API documentation](crates/feedparser-rs-node/README.md) for complete reference. 
-## Development +### Python Usage -This project uses [cargo-make](https://github.com/sagiegurari/cargo-make) for task automation. All development tasks are defined in `Makefile.toml`. +```python +import feedparser_rs -### Setup +# Parse from bytes or string +d = feedparser_rs.parse(b'...') +print(d.version) # 'rss20' +print(d.feed.title) +print(d.bozo) # True if parsing had issues +print(d.entries[0].published_parsed) # time.struct_time +``` -Install cargo-make: +> [!NOTE] +> Python bindings provide `time.struct_time` for date fields, matching the original feedparser API. -```bash -cargo install cargo-make -``` +## Cargo Features -### Available Tasks +| Feature | Description | Default | +|---------|-------------|---------| +| `http` | Enable URL fetching with reqwest (gzip/deflate/brotli support) | Yes | -View all available tasks: +To disable HTTP support: -```bash -cargo make --list-all-steps +```toml +[dependencies] +feedparser-rs-core = { version = "0.1", default-features = false } ``` -### Common Development Tasks +## Workspace Structure -#### Formatting +This repository contains multiple crates: -```bash -# Format code with nightly rustfmt -cargo make fmt +| Crate | Description | Package | +|-------|-------------|---------| +| [`feedparser-rs-core`](crates/feedparser-rs-core) | Core Rust parser | [crates.io](https://crates.io/crates/feedparser-rs-core) | +| [`feedparser-rs-node`](crates/feedparser-rs-node) | Node.js bindings | [npm](https://www.npmjs.com/package/feedparser-rs) | +| [`feedparser-rs-py`](crates/feedparser-rs-py) | Python bindings | [PyPI](https://pypi.org/project/feedparser-rs) | -# Check formatting without modifying files -cargo make fmt-check -``` +## Development -#### Linting +This project uses [cargo-make](https://github.com/sagiegurari/cargo-make) for task automation. 
```bash -# Run clippy -cargo make clippy - -# Run all linting checks (format + clippy + doc) -cargo make lint -``` - -#### Testing +# Install cargo-make +cargo install cargo-make -```bash -# Run Rust tests -cargo make test-rust +# Run all checks (format, lint, test) +cargo make ci-all -# Run all tests (Rust + Python + Node.js) +# Run tests cargo make test -# Run doctests -cargo make doctest +# Run benchmarks +cargo make bench ``` -#### Security +See all available tasks: ```bash -# Run all security checks -cargo make deny - -# Run specific security checks -cargo make deny-advisories -cargo make deny-licenses +cargo make --list-all-steps ``` -#### Coverage - -```bash -# Generate Rust coverage -cargo make coverage-rust - -# Generate all coverage reports -cargo make coverage -``` +## Benchmarks -#### Benchmarks +Run benchmark comparison against Python feedparser: ```bash -# Run Rust benchmarks -cargo make bench - -# Compare Rust vs Python performance cargo make bench-compare ``` -#### Utilities - -```bash -# Check for outdated dependencies -cargo make check-versions -``` - -#### Pre-commit/Pre-push - -```bash -# Run checks before committing -cargo make pre-commit - -# Run comprehensive checks before pushing -cargo make pre-push -``` - -#### CI Simulation - -```bash -# Run all CI checks locally -cargo make ci-all -``` +## MSRV Policy -### Build +Minimum Supported Rust Version: **1.88.0** (edition 2024). -```bash -cargo build --workspace -# or -cargo make build -``` +MSRV increases are considered breaking changes and will result in a minor version bump. ## License Licensed under either of: -- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE)) -- MIT license ([LICENSE-MIT](LICENSE-MIT)) +- [Apache License, Version 2.0](LICENSE-APACHE) +- [MIT License](LICENSE-MIT) at your option. ## Contributing -Contributions are welcome! Please read our contributing guidelines. - -### Code of Conduct +Contributions are welcome! 
Please read our [Contributing Guide](CONTRIBUTING.md) before submitting a pull request. -This project follows the Rust Code of Conduct. +This project follows the [Rust Code of Conduct](https://www.rust-lang.org/policies/code-of-conduct). diff --git a/crates/feedparser-rs-core/README.md b/crates/feedparser-rs-core/README.md index 1a5c482..6f30e8c 100644 --- a/crates/feedparser-rs-core/README.md +++ b/crates/feedparser-rs-core/README.md @@ -10,6 +10,8 @@ This is the core parsing library that powers the Python and Node.js bindings. - **Tolerant**: Bozo flag for malformed feeds (like Python feedparser) - **Fast**: Written in Rust, 10-100x faster than Python feedparser - **Safe**: No unsafe code, comprehensive error handling +- **HTTP support**: Fetch feeds from URLs with compression and conditional GET +- **Podcast support**: iTunes and Podcast 2.0 namespace extensions - **Well-tested**: Extensive test coverage with real-world feed fixtures ## Installation @@ -43,6 +45,38 @@ assert_eq!(feed.entries.len(), 1); # Ok::<(), feedparser_rs_core::FeedError>(()) ``` +## HTTP Fetching + +Fetch feeds directly from URLs with automatic compression handling: + +```rust +use feedparser_rs_core::parse_url; + +let feed = parse_url("https://example.com/feed.xml", None, None, None)?; +println!("Title: {:?}", feed.feed.title); +println!("Entries: {}", feed.entries.len()); + +// Subsequent fetch with caching (uses ETag/Last-Modified) +let feed2 = parse_url( + "https://example.com/feed.xml", + feed.etag.as_deref(), + feed.modified.as_deref(), + None +)?; + +if feed2.status == Some(304) { + println!("Not modified, use cached version"); +} +# Ok::<(), feedparser_rs_core::FeedError>(()) +``` + +To disable HTTP support and reduce dependencies: + +```toml +[dependencies] +feedparser-rs-core = { version = "0.1", default-features = false } +``` + ## Platform Bindings - **Node.js**: [`feedparser-rs`](https://www.npmjs.com/package/feedparser-rs) on npm