- # feedparser-rs-py
+ # feedparser-rs

- High-performance RSS/Atom/JSON Feed parser for Python — drop-in replacement for `feedparser`.
+ High-performance RSS/Atom/JSON Feed parser for Python with feedparser-compatible API.

  ## Features

- - 🚀 **10-100x faster** than feedparser (Rust core)
- - 🔄 **100% API compatible** with feedparser 6.x
- - ✅ **Tolerant parsing** with bozo flag for malformed feeds
- - 📦 **Zero dependencies** (pure Rust + PyO3)
- - 🎯 **Supports all formats**: RSS 0.9x/1.0/2.0, Atom 0.3/1.0, JSON Feed 1.0/1.1
- - 🎙️ **Podcast metadata**: iTunes tags, Podcast 2.0 namespace
- - 🛡️ **DoS protection**: Built-in resource limits
+ - **Fast**: Native Rust implementation via PyO3
+ - **Tolerant parsing**: Bozo flag for graceful handling of malformed feeds
+ - **Multi-format**: RSS 0.9x/1.0/2.0, Atom 0.3/1.0, JSON Feed 1.0/1.1
+ - **Podcast support**: iTunes and Podcast 2.0 namespace extensions
+ - **Familiar API**: Inspired by feedparser, easy migration path
+ - **DoS protection**: Built-in resource limits

  ## Installation

@@ -20,22 +19,14 @@ pip install feedparser-rs

  ## Usage

- **Same API as feedparser:**
-
  ```python
  import feedparser_rs

- # From string
+ # Parse from string or bytes
  d = feedparser_rs.parse('<rss>...</rss>')
-
- # From bytes
  d = feedparser_rs.parse(b'<rss>...</rss>')

- # From file
- with open('feed.xml', 'rb') as f:
-     d = feedparser_rs.parse(f.read())
-
- # Access data (feedparser-compatible)
+ # Access data
  print(d.feed.title)
  print(d.version)  # "rss20", "atom10", etc.
  print(d.bozo)  # True if parsing errors occurred
@@ -47,57 +38,37 @@ for entry in d.entries:

  ## Migration from feedparser

- **No code changes needed:**
-
  ```python
- # Before
- import feedparser
- d = feedparser.parse(feed_url_or_content)
-
- # After - just change the import!
+ # Option 1: alias import
  import feedparser_rs as feedparser
- d = feedparser.parse(feed_url_or_content)
- ```
-
- Or use it directly:
+ d = feedparser.parse(feed_content)

- ```python
+ # Option 2: direct import
  import feedparser_rs
  d = feedparser_rs.parse(feed_content)
  ```

- ## Performance
-
- Benchmark parsing 1000-entry RSS feed (10 iterations):
-
- | Library | Time | Speedup |
- |---------|------|---------|
- | feedparser 6.0.11 | 2.45s | 1x |
- | feedparser-rs 0.1.0 | 0.12s | **20x** |
+ > **Note**: URL fetching is not yet implemented. Use `requests.get(url).content` to fetch feeds.

  ## Advanced Usage

  ### Custom Resource Limits

- Protect against DoS attacks from malicious feeds:
-
  ```python
  import feedparser_rs

  limits = feedparser_rs.ParserLimits(
      max_feed_size_bytes=50_000_000,  # 50 MB
      max_entries=5_000,
-     max_authors=20,  # Max authors per feed/entry
-     max_links_per_entry=50,  # Max links per entry
+     max_authors=20,
+     max_links_per_entry=50,
  )

  d = feedparser_rs.parse_with_limits(feed_data, limits)
  ```

  ### Format Detection

- Quickly detect feed format without full parsing:
-
  ```python
  import feedparser_rs

@@ -107,8 +78,6 @@ print(version) # "rss20", "atom10", "json11", etc.

  ### Podcast Support

- Access iTunes and Podcast 2.0 metadata:
-
  ```python
  import feedparser_rs

@@ -118,98 +87,51 @@ d = feedparser_rs.parse(podcast_feed)
  if d.feed.itunes:
      print(d.feed.itunes.author)
      print(d.feed.itunes.categories)
-     print(d.feed.itunes.explicit)

  # Episode metadata
  for entry in d.entries:
      if entry.itunes:
-         print(f"S{entry.itunes.season}E{entry.itunes.episode}")
          print(f"Duration: {entry.itunes.duration}s")
-
- # Podcast 2.0
- if d.feed.podcast:
-     for person in d.feed.podcast.persons:
-         print(f"{person.name} ({person.role})")
  ```

  ## API Reference

- ### Main Functions
+ ### Functions

- - `parse(source)` - Parse feed from bytes, str, or file
- - `parse_with_limits(source, limits)` - Parse with custom resource limits
- - `detect_format(source)` - Detect feed format
+ - `parse(source)` — Parse feed from bytes or str
+ - `parse_with_limits(source, limits)` — Parse with custom resource limits
+ - `detect_format(source)` — Detect feed format without full parsing
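
Because `parse` accepts only bytes or str and URL fetching is not yet implemented, the download has to happen before parsing. The following is a minimal sketch of that step, assuming the standard-library `urllib` as a stand-in for the `requests` call the README note suggests. `fetch_feed_bytes`, its `opener` parameter, and the example URL are hypothetical and not part of feedparser-rs.

```python
import io
from urllib.request import urlopen


def fetch_feed_bytes(url, timeout=10.0, opener=urlopen):
    """Download a feed and return the raw response body as bytes.

    `opener` is injectable so the helper can be exercised without network access.
    """
    with opener(url, timeout=timeout) as response:
        return response.read()


# Offline demonstration: a stand-in opener that serves a canned document.
def fake_opener(url, timeout):
    return io.BytesIO(b"<rss version='2.0'><channel/></rss>")


raw = fetch_feed_bytes("https://example.com/feed.xml", opener=fake_opener)
print(len(raw), "bytes fetched")

# With feedparser-rs installed, the bytes would go straight to parse():
#   import feedparser_rs
#   d = feedparser_rs.parse(fetch_feed_bytes("https://example.com/feed.xml"))
```

Passing bytes rather than decoded text leaves character-encoding detection to the parser, which is presumably why the README note points at `.content`.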

  ### Classes

- - `FeedParserDict` - Parsed feed result
-   - `.feed` - Feed metadata
-   - `.entries` - List of entries
-   - `.bozo` - True if parsing errors occurred
-   - `.bozo_exception` - Error description
-   - `.version` - Feed version string
-   - `.encoding` - Character encoding
-   - `.namespaces` - XML namespaces
-
- - `ParserLimits` - Resource limits configuration
-
- ### Feed Metadata
-
- - `title`, `subtitle`, `link` - Basic metadata
- - `updated_parsed` - Update date as `time.struct_time`
- - `authors`, `contributors` - Person lists
- - `image`, `icon`, `logo` - Feed images
- - `itunes` - iTunes podcast metadata
- - `podcast` - Podcast 2.0 metadata
-
- ### Entry Metadata
-
- - `title`, `summary`, `content` - Entry text
- - `link`, `links` - Entry URLs
- - `published_parsed`, `updated_parsed` - Dates as `time.struct_time`
- - `authors`, `contributors` - Person lists
- - `enclosures` - Media attachments
- - `itunes` - Episode metadata
-
- ## Compatibility
+ - `FeedParserDict` — Parsed feed result
+   - `.feed` — Feed metadata
+   - `.entries` — List of entries
+   - `.bozo` — True if parsing errors occurred
+   - `.version` — Feed version string
+   - `.encoding` — Character encoding

- This library aims for 100% API compatibility with `feedparser` 6.x. All field names, data structures, and behaviors match feedparser.
-
- Key differences:
- - **URL fetching not implemented yet** - Use `requests.get(url).content`
- - **Performance** - 10-100x faster
- - **Error handling** - Same tolerant parsing with bozo flag
+ - `ParserLimits` — Resource limits configuration
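
The result attributes listed above can be read off any parsed feed. The following is a minimal sketch, assuming only the documented `.bozo`, `.version`, `.encoding`, and `.entries` attributes. `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real parse result so the snippet runs without the extension installed.

```python
from types import SimpleNamespace


def summarize(result):
    """Build a one-line status string from a parsed-feed-like object."""
    status = "parse errors" if result.bozo else "ok"
    return f"{result.version} ({result.encoding}): {len(result.entries)} entries, {status}"


# Stand-in carrying the same attribute names as the documented result object.
fake = SimpleNamespace(version="rss20", encoding="utf-8", entries=[1, 2, 3], bozo=False)
print(summarize(fake))
```

With the extension installed, the equivalent call would presumably be `summarize(feedparser_rs.parse(data))`.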

  ## Requirements

  - Python >= 3.9
- - No runtime dependencies (Rust extension module)

  ## Development

- Build from source:
-
  ```bash
- git clone https://github.com/rabax/feedparser-rs
+ git clone https://github.com/bug-ops/feedparser-rs
  cd feedparser-rs/crates/feedparser-rs-py
  pip install maturin
  maturin develop
  ```

- Run tests:
-
- ```bash
- pip install pytest
- pytest tests/
- ```
-
  ## License

  MIT OR Apache-2.0

  ## Links

- - **GitHub**: https://github.com/rabax/feedparser-rs
- - **PyPI**: https://pypi.org/project/feedparser-rs/
- - **Documentation**: https://github.com/rabax/feedparser-rs#readme
- - **Bug Reports**: https://github.com/rabax/feedparser-rs/issues
+ - [GitHub](https://github.com/bug-ops/feedparser-rs)
+ - [PyPI](https://pypi.org/project/feedparser-rs/)
+ - [Issues](https://github.com/bug-ops/feedparser-rs/issues)