Webz.io Firehose Consumer

A simple and robust Python client for consuming real-time data from the Webz.io Firehose API with automatic pagination, intelligent error handling, and rate limiting.

Features

Real-time data consumption with automatic pagination
Rate limiting - handles HTTP 429 with incremental backoff (10 seconds, plus 5 seconds per repeat, capped at 60 seconds)
Comprehensive error handling - network errors, HTTP errors, and timeouts
Command-line interface - no code editing required
Detailed logging - timestamps, response times, status codes, and URLs
Zero posts handling - automatic retry when no new data is available

Quick Start

Install dependencies:

pip install requests

Run the consumer:

python3 simple_consumer.py --token YOUR_TOKEN --firehose YOUR_FIREHOSE_NAME

Stop the consumer: Press Ctrl+C to stop gracefully

Usage Examples

Basic consumption (last 5 minutes)

python3 simple_consumer.py --token abc123xyz --firehose news_feed

Start from 10 minutes ago

python3 simple_consumer.py --token abc123xyz --firehose news_feed --start-minutes 10

Show all available options

python3 simple_consumer.py --help

Command Line Parameters

| Parameter | Required | Default | Description |
|---|---|---|---|
| `--token` | Yes | - | Your Webz.io API token (provided by Webz.io team) |
| `--firehose` | Yes | - | Your firehose name (provided by Webz.io team) |
| `--start-minutes` | No | 5 | Start consuming from X minutes ago |
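
The parameters above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `simple_consumer.py`: the `build_parser` name and the help strings are assumptions, and the real script may validate the required arguments differently (the error message quoted under Troubleshooting suggests a custom check).

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Webz.io Firehose consumer (sketch)")
    parser.add_argument("--token", required=True, help="Your Webz.io API token")
    parser.add_argument("--firehose", required=True, help="Your firehose name")
    parser.add_argument(
        "--start-minutes",
        type=int,
        default=5,  # documented default: start from 5 minutes ago
        help="Start consuming from X minutes ago",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--token", "abc123xyz", "--firehose", "news_feed"])
print(args.token, args.firehose, args.start_minutes)
```

`argparse` exposes `--start-minutes` as `args.start_minutes`, converting the dash to an underscore.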

Output Format

The script provides detailed real-time logs for each API request:

Successful requests

[2025-07-20 11:10:46] Request #250: 100 posts | Status: 200 | Response time: 0.09s | API since: 07-20 11:08:37 | Total posts: 20256
  URL: https://api.webz.io/firehose?token=...&since=1752998917000&nid=...
  → Posts found, continuing to next page...

Zero posts (waiting for new data)

[2025-07-20 11:10:49] Request #251: 0 posts | Status: 200 | Response time: 0.03s | API since: 07-20 11:08:40 | Total posts: 20256
  URL: https://api.webz.io/firehose?token=...&since=1752998920000&nid=...
  → No posts found, sleeping 2 seconds...

Rate limiting (HTTP 429)

[2025-07-20 11:10:52] Request #252: RATE LIMIT ERROR | Status: 429 | Response time: 0.02s
  → Waiting 10 seconds due to rate limit...
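
The `since` value in the logged URLs appears to be a Unix timestamp in milliseconds, and the `API since` field looks like the same instant rendered as a date. The sketch below shows that conversion. `format_since` is a hypothetical helper, not a function from the script, and the UTC+3 offset is only inferred from the sample log lines above.

```python
from datetime import datetime, timedelta, timezone


def format_since(since_ms, tz=timezone.utc):
    """Render a millisecond Unix timestamp in the MM-DD HH:MM:SS style used by the logs."""
    return datetime.fromtimestamp(since_ms / 1000, tz=tz).strftime("%m-%d %H:%M:%S")


# Assumption: the sample logs were produced on a machine running at UTC+3.
log_tz = timezone(timedelta(hours=3))

print(format_since(1752998917000))          # 07-20 08:08:37 (UTC)
print(format_since(1752998917000, log_tz))  # 07-20 11:08:37
```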

Error Handling

Rate Limiting (HTTP 429)

The script handles rate limits automatically, increasing the wait by a fixed step on each consecutive 429:

First rate limit: waits 10 seconds
Subsequent rate limits: adds 5 seconds each time (10s → 15s → 20s → 25s... up to 60s max)
Reset: back to 10 seconds after successful requests
No request counting: rate-limited requests don't increment the counter
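
The schedule above can be sketched as a small state holder. This is an illustration under assumptions, not the script's own code: the `RateLimitBackoff` class and its method names are hypothetical, and only the 10/5/60 second values come from the description.

```python
class RateLimitBackoff:
    """Linear backoff: start at `base`, add `step` per consecutive 429, never exceed `cap`."""

    def __init__(self, base=10, step=5, cap=60):
        self.base = base
        self.step = step
        self.cap = cap
        self.current = base

    def on_rate_limited(self):
        """Return how long to wait now, then lengthen the next wait."""
        wait = self.current
        self.current = min(self.current + self.step, self.cap)
        return wait

    def on_success(self):
        """A successful request resets the wait to the base value."""
        self.current = self.base


backoff = RateLimitBackoff()
waits = [backoff.on_rate_limited() for _ in range(12)]
print(waits)  # [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 60]

backoff.on_success()
print(backoff.on_rate_limited())  # 10
```

Because the wait grows by a constant step rather than doubling, this is a linear backoff, which reaches the 60-second cap on the eleventh consecutive 429.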

Note: If you frequently receive 429 errors, contact the Webz.io team to increase your API rate limits.

HTTP Errors (401, 403, 404, 500, etc.)

Logs the error with status code
Waits 2 seconds between retries
Continues indefinitely until resolved

Network Errors

Handles connection timeouts, DNS issues, etc.
Logs detailed error information
Automatic retry with 2-second delay
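
The fixed-delay retry described for HTTP and network errors might look like the loop below. It is a sketch under stated assumptions: `fetch_with_retry` and the injected `fetch` callable (returning a `(status_code, body)` pair) are hypothetical, and 429 handling is left out because it follows the separate backoff schedule. With `requests`, network failures surface as `requests.exceptions.RequestException`, which is a subclass of `OSError`, so the `except` clause here would also catch them.

```python
import time

RETRY_DELAY = 2  # seconds between retries, as described above


def fetch_with_retry(fetch, url, sleep=time.sleep, log=print):
    """Call fetch(url) until it returns HTTP 200, retrying indefinitely on errors."""
    while True:
        try:
            status, body = fetch(url)
        except OSError as exc:  # timeouts, DNS failures, refused connections
            log(f"Network error: {exc}")
        else:
            if status == 200:
                return body
            log(f"HTTP error | Status: {status}")
        sleep(RETRY_DELAY)


# Demo with a scripted fake: one network error, one HTTP 500, then success.
outcomes = iter([ConnectionError("DNS lookup failed"), (500, None), (200, {"posts": []})])


def fake_fetch(url):
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


delays = []  # record requested sleeps instead of actually sleeping
body = fetch_with_retry(fake_fetch, "https://api.webz.io/firehose", sleep=delays.append)
print(body, delays)
```

Injecting `sleep` and `fetch` keeps the loop testable without real network calls or real delays.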

Zero Posts Handling

When the API returns 0 posts, the script waits 2 seconds before retrying
Maintains the same URL until new data arrives
Prevents excessive API calls during quiet periods

Pagination

The script automatically follows pagination based on API response:

Posts found: immediately continues to next page
Zero posts returned: waits 2 seconds, then retries same URL
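
The two cases above amount to a small decision function. In this sketch, `next_step` is a hypothetical helper, and the `posts` and `next` response field names are assumptions about the API payload rather than details confirmed by this README.

```python
from urllib.parse import urljoin

ZERO_POSTS_DELAY = 2  # seconds to wait when the API has no new data


def next_step(current_url, payload):
    """Return (url_to_request_next, seconds_to_wait_first) for one API response."""
    posts = payload.get("posts") or []
    next_ref = payload.get("next")  # assumed field; may be relative or absolute
    if posts and next_ref:
        # Posts found: move to the next page immediately.
        return urljoin(current_url, next_ref), 0
    # Zero posts (or no next link): retry the same URL after a short pause.
    return current_url, ZERO_POSTS_DELAY


url = "https://api.webz.io/firehose?token=T&since=1"
print(next_step(url, {"posts": [{"id": 1}], "next": "/firehose?token=T&since=2"}))
print(next_step(url, {"posts": [], "next": "/firehose?token=T&since=2"}))
```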

Why 2 seconds?

Consistent timing: a zero-post response means you have caught up to real time, so a fixed short wait is all that is needed
Efficient polling: Short enough to get new data quickly, long enough to avoid excessive API calls

Stopping the Consumer

Press Ctrl+C to stop gracefully
Shows final statistics (total requests and posts processed)
Properly closes all connections

Example:

^C
Stopped by user after 1,247 requests
Total posts processed: 125,890
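
A graceful stop of this kind is usually built around `KeyboardInterrupt`. The loop below is a minimal sketch, assuming a hypothetical `poll_once` callable that performs one request and returns the number of posts it received. The `run` function and its parameters are illustrative, not taken from the script.

```python
def run(poll_once, close=lambda: None, log=print):
    """Poll until Ctrl+C, then report totals and release resources."""
    requests_made = 0
    posts_total = 0
    try:
        while True:
            posts_total += poll_once()
            requests_made += 1
    except KeyboardInterrupt:
        log(f"Stopped by user after {requests_made:,} requests")
        log(f"Total posts processed: {posts_total:,}")
    finally:
        close()  # e.g. close the HTTP session
    return requests_made, posts_total


# Demo: three successful polls, then a simulated Ctrl+C.
batches = iter([100, 100, 56])


def fake_poll():
    try:
        return next(batches)
    except StopIteration:
        raise KeyboardInterrupt


lines = []
stats = run(fake_poll, log=lines.append)
print(stats, lines)
```

The `:,` format specifier produces the thousands separators seen in the example output.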

Technical Details

HTTP Timeout: 30 seconds per request
Session reuse: Maintains persistent connections for better performance
Memory efficient: Processes data in real-time without accumulation
Timestamp format: Human-readable date/time in logs
URL extraction: Shows actual API URLs for debugging

Requirements

Python 3.6+
requests library

Troubleshooting

"Command 'python' not found"

Use python3 instead of python:

python3 simple_consumer.py --help

"Error: API token and firehose name are required!"

Make sure you provide both required parameters:

python3 simple_consumer.py --token YOUR_TOKEN --firehose YOUR_FIREHOSE

Continuous 401 errors

Verify that your API token is correct
Contact the Webz.io team to confirm the token is active

Continuous 429 errors

The script handles this automatically by waiting longer between retries
If the errors persist, ask the Webz.io team to raise your rate limits

Support

Get credentials: Contact the Webz.io team for your API token and firehose name
Issues: Check the detailed logs for specific error messages
Performance: The script is optimized for continuous long-running consumption

This consumer is designed for production use with robust error handling and automatic recovery.
