A simple and robust Python client for consuming real-time data from the Webz.io Firehose API with automatic pagination, intelligent error handling, and rate limiting.
- Real-time data consumption with automatic pagination
- Intelligent rate limiting - handles HTTP 429 with an increasing backoff (10s, then +5s per repeat, up to 60s)
- Comprehensive error handling - network errors, HTTP errors, and timeouts
- Command-line interface - no code editing required
- Detailed logging - timestamps, response times, status codes, and URLs
- Zero posts handling - automatic retry when no new data is available
- Install dependencies:
pip install requests
- Run the consumer:
python3 simple_consumer.py --token YOUR_TOKEN --firehose YOUR_FIREHOSE_NAME
- Stop the consumer: press Ctrl+C to stop gracefully
python3 simple_consumer.py --token abc123xyz --firehose news_feed
python3 simple_consumer.py --token abc123xyz --firehose news_feed --start-minutes 10
python3 simple_consumer.py --help
Parameter | Required | Default | Description |
---|---|---|---|
--token | Yes | - | Your Webz.io API token (provided by Webz.io team) |
--firehose | Yes | - | Your firehose name (provided by Webz.io team) |
--start-minutes | No | 5 | Start consuming from X minutes ago |
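The parameters in the table above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `simple_consumer.py`: the `build_parser` name, the help text wording, and the integer type for `--start-minutes` are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented command-line parameters."""
    parser = argparse.ArgumentParser(
        description="Consume real-time data from a Webz.io firehose."
    )
    parser.add_argument("--token", required=True, help="Your Webz.io API token")
    parser.add_argument("--firehose", required=True, help="Your firehose name")
    # Assumption: the value is a whole number of minutes.
    parser.add_argument(
        "--start-minutes",
        type=int,
        default=5,
        help="Start consuming from X minutes ago (default: 5)",
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration only.
args = build_parser().parse_args(["--token", "abc123xyz", "--firehose", "news_feed"])
print(args.token, args.firehose, args.start_minutes)
```

Because `--token` and `--firehose` are marked as required, `argparse` itself prints a usage error when either one is missing.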
The script provides detailed real-time logs for each API request:
[2025-07-20 11:10:46] Request #250: 100 posts | Status: 200 | Response time: 0.09s | API since: 07-20 11:08:37 | Total posts: 20256
URL: https://api.webz.io/firehose?token=...&since=1752998917000&nid=...
→ Posts found, continuing to next page...
[2025-07-20 11:10:49] Request #251: 0 posts | Status: 200 | Response time: 0.03s | API since: 07-20 11:08:40 | Total posts: 20256
URL: https://api.webz.io/firehose?token=...&since=1752998920000&nid=...
→ No posts found, sleeping 2 seconds...
[2025-07-20 11:10:52] Request #252: RATE LIMIT ERROR | Status: 429 | Response time: 0.02s
→ Waiting 10 seconds due to rate limit...
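The `since` values in the URLs above look like Unix timestamps in milliseconds: `1752998917000` decodes to 2025-07-20 08:08:37 UTC, which matches the logged `API since: 07-20 11:08:37` if the log was written on a machine running at UTC+3. Under that assumption, the starting point for `--start-minutes` and the decoding of a `since` value can be sketched as below. Both helper names are hypothetical.

```python
import time
from datetime import datetime, timezone


def since_from_minutes_ago(minutes, now=None):
    """Return a millisecond timestamp `minutes` before `now` (epoch seconds)."""
    if now is None:
        now = time.time()
    return int((now - minutes * 60) * 1000)


def since_to_utc(since_ms):
    """Decode a millisecond timestamp into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(since_ms / 1000, tz=timezone.utc)


# Decode the value from the first example URL (shown in UTC, not local time).
print(since_to_utc(1752998917000).strftime("%m-%d %H:%M:%S"))

# A possible starting point when running with --start-minutes 5.
print(since_from_minutes_ago(5))
```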
The script automatically handles rate limits with intelligent backoff:
- First rate limit: waits 10 seconds
- Subsequent rate limits: adds 5 seconds each time (10s → 15s → 20s → 25s... up to 60s max)
- Reset: back to 10 seconds after successful requests
- No request counting: rate-limited requests don't increment the counter
Note: If you frequently receive 429 errors, contact the Webz.io team to increase your API rate limits.
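The wait schedule described above can be sketched as a small helper. `RateLimitBackoff` and its method names are hypothetical; only the numbers (10-second start, 5-second step, 60-second cap, reset after success) come from the description above.

```python
class RateLimitBackoff:
    """Hypothetical helper tracking how long to wait after an HTTP 429."""

    def __init__(self, initial=10, step=5, maximum=60):
        self.initial = initial
        self.step = step
        self.maximum = maximum
        self._next = initial

    def next_wait(self):
        """Return the wait for this rate limit and lengthen the next one."""
        wait = self._next
        self._next = min(self._next + self.step, self.maximum)
        return wait

    def reset(self):
        """Call after a successful request to go back to the initial wait."""
        self._next = self.initial


backoff = RateLimitBackoff()
print([backoff.next_wait() for _ in range(4)])  # four consecutive 429 responses
backoff.reset()                                 # a request succeeded
print(backoff.next_wait())
```

A consumer loop would call `next_wait()` each time it receives a 429 and `reset()` after any successful request.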
HTTP errors:
- Logs the error with status code
- Waits 2 seconds between retries
- Continues indefinitely until resolved
Network errors:
- Handles connection timeouts, DNS issues, etc.
- Logs detailed error information
- Automatic retry with 2-second delay
Zero posts:
- When API returns 0 posts, waits 2 seconds before retry
- Maintains the same URL until new data arrives
- Prevents excessive API calls during quiet periods
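The retry behavior described above (log the error, wait 2 seconds, try again indefinitely) can be sketched as a generic wrapper. This is an illustration, not the script's actual code: `fetch_with_retry` and its parameters are hypothetical, and `fetch` stands in for whatever function performs a single request.

```python
import time


def fetch_with_retry(fetch, delay=2, sleep=time.sleep, log=print):
    """Call `fetch()` until it returns without raising, pausing between attempts.

    `fetch` is a hypothetical zero-argument callable that performs one request
    and raises an exception on an HTTP or network error.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return fetch()
        except Exception as exc:  # a real client would catch narrower types
            log(f"Attempt {attempt} failed: {exc!r}; retrying in {delay}s")
            sleep(delay)
```

Catching `Exception` leaves `KeyboardInterrupt` untouched, so Ctrl+C still stops the loop. With the `requests` library, a narrower choice would be `requests.exceptions.RequestException`.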
The script automatically follows pagination based on API response:
- Posts found: immediately continues to next page
- Zero posts returned: waits 2 seconds, then retries same URL
- Consistent timing: When no posts are returned, you've caught up to real-time
- Efficient polling: Short enough to get new data quickly, long enough to avoid excessive API calls
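The polling logic above can be sketched as a loop. The response shape is an assumption: the sketch expects `fetch_page(url)` to return a dict with a `"posts"` list and a `"next"` URL, and the `consume` function with its `max_requests` parameter exists only for illustration.

```python
import time


def consume(fetch_page, start_url, handle_posts, sleep=time.sleep, max_requests=None):
    """Follow pagination, waiting 2 seconds whenever a page has no posts.

    Assumptions: `fetch_page(url)` returns a dict with a "posts" list and a
    "next" URL; `max_requests` exists only so the sketch can terminate.
    """
    url = start_url
    requests_made = 0
    total_posts = 0
    while max_requests is None or requests_made < max_requests:
        page = fetch_page(url)
        requests_made += 1
        posts = page.get("posts", [])
        if posts:
            handle_posts(posts)
            total_posts += len(posts)
            url = page["next"]  # posts found: continue to the next page at once
        else:
            sleep(2)  # caught up to real time: wait, then retry the same URL
    return requests_made, total_posts
```

Keeping `url` unchanged on an empty page is what makes the loop resume from the same position once new data arrives.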
- Press Ctrl+C to stop gracefully
- Shows final statistics (total requests and posts processed)
- Properly closes all connections
Example:
^C
Stopped by user after 1,247 requests
Total posts processed: 125,890
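One way to get this behavior is to catch `KeyboardInterrupt` around the main loop and print the totals afterwards. The sketch below assumes a hypothetical `step` callable that performs one request and returns the number of posts received; the function names are illustrative, not taken from the script.

```python
def format_summary(total_requests, total_posts):
    """Build the final statistics text with thousands separators."""
    return (
        f"Stopped by user after {total_requests:,} requests\n"
        f"Total posts processed: {total_posts:,}"
    )


def run_until_interrupted(step, cleanup=lambda: None):
    """Call `step()` repeatedly until Ctrl+C, then clean up and summarize.

    `step` is a hypothetical callable that performs one request and returns
    the number of posts it received.
    """
    total_requests = 0
    total_posts = 0
    try:
        while True:
            posts = step()
            total_requests += 1
            total_posts += posts
    except KeyboardInterrupt:
        pass  # Ctrl+C: fall through to the summary instead of a traceback
    finally:
        cleanup()  # e.g. close the HTTP session
    return format_summary(total_requests, total_posts)
```

Passing the session's `close` method as `cleanup` would release the connections whether the loop ends by Ctrl+C or by an unexpected error.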
- HTTP Timeout: 30 seconds per request
- Session reuse: Maintains persistent connections for better performance
- Memory efficient: Processes data in real-time without accumulation
- Timestamp format: Human-readable date/time in logs
- URL extraction: Shows actual API URLs for debugging
- Python 3.6+
- requests library
Use python3 instead of python:
python3 simple_consumer.py --help
Make sure you provide both required parameters:
python3 simple_consumer.py --token YOUR_TOKEN --firehose YOUR_FIREHOSE
- Verify that your API token is correct
- Contact the Webz.io team to confirm that the token is active
- Rate limit (HTTP 429) errors are handled automatically by the script
- If they persist, contact the Webz.io team about rate limits
- Solution: Request higher rate limits from the Webz.io team
- Get credentials: Contact the Webz.io team for your API token and firehose name
- Issues: Check the detailed logs for specific error messages
- Performance: The script is optimized for continuous long-running consumption
This consumer is designed for production use with robust error handling and automatic recovery.