This release improves the robustness of the Web Scraper task, specifically addressing issues with sites like Wikipedia.
- User-Agent Header: Added a browser-like `User-Agent` header to `scrape_website` to prevent 403 Forbidden errors from sites with strict bot protection (e.g., Wikipedia).
- Error Propagation: Enhanced exception handling in the worker to explicitly update the task status to `FAILURE` in the database, ensuring errors are correctly reported to the UI instead of getting stuck in `PROGRESS`.
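The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the project's worker code: `BROWSER_HEADERS`, `TASK_STATUS`, `run_scrape`, and the injected `fetch` callable are hypothetical names, the header string is only an example of a browser-like value, and an in-memory dict stands in for the database.

```python
from typing import Callable, Dict

# Example of a browser-like header set (assumption: the exact string the
# project sends is not given in these notes).
BROWSER_HEADERS: Dict[str, str] = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}

# In-memory stand-in for the task table (hypothetical).
TASK_STATUS: Dict[str, str] = {}


def run_scrape(
    task_id: str,
    url: str,
    fetch: Callable[[str, Dict[str, str]], str],
) -> str:
    """Fetch a URL and record the outcome so the UI never stays in PROGRESS."""
    TASK_STATUS[task_id] = "PROGRESS"
    try:
        # The HTTP client is injected; it receives the browser-like headers.
        html = fetch(url, BROWSER_HEADERS)
    except Exception:
        # Persist the failure explicitly before re-raising to the caller.
        TASK_STATUS[task_id] = "FAILURE"
        raise
    TASK_STATUS[task_id] = "SUCCESS"
    return html
```

With the `requests` library, `fetch` could be something like `lambda url, headers: requests.get(url, headers=headers, timeout=10).text`; injecting it keeps the sketch runnable without network access.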
This release adds a real-world "Web Scraper" task to demonstrate practical long-running operations and external HTTP tracing.
- Web Scraper Task: A new background task `scrape_website(url)` that fetches a URL, parses HTML with `BeautifulSoup`, and extracts metadata (Title, H1 count, Link count).
- Observability: Added `opentelemetry-instrumentation-requests` to automatically trace external HTTP calls made by the worker. Use Jaeger to see the `GET <url>` span!
- UI Updates: Added a Task Type selector (Vector vs. Scraper) and URL input to the frontend.
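The metadata extraction step could look roughly like the sketch below. Note the swap: the release uses `BeautifulSoup`, while this example uses the standard library's `html.parser` so that it runs without third-party packages. `MetadataParser` and `extract_metadata` are hypothetical names, and counting only `<a>` tags that carry an `href` is an assumption about what "Link count" means.

```python
from html.parser import HTMLParser
from typing import Dict, Union


class MetadataParser(HTMLParser):
    """Collects the page title, the number of <h1> tags, and the number of links."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.link_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            # Assumption: only anchors with an href count as links.
            self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_metadata(html: str) -> Dict[str, Union[str, int]]:
    """Return the three metadata fields named in the release notes."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "h1_count": parser.h1_count,
        "link_count": parser.link_count,
    }
```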
This hotfix disables SQLAlchemyInstrumentor to resolve a critical greenlet context error when used with the asyncpg driver in FastAPI.
- Fix 500 Internal Server Error: Resolved `sqlalchemy.exc.MissingGreenlet` errors in the API by disabling automatic SQLAlchemy tracing. Request tracing (FastAPI) and Worker tracing (Celery) remain active and fully functional.
This release introduces full-stack observability using OpenTelemetry and Jaeger, allowing developers to visualize and trace requests as they flow through the system (API -> Redis -> Worker -> Database).
- Distributed Tracing: Integrated Jaeger (v1.53) to visualize request lifecycles.
- Instrumentation: Added OpenTelemetry instrumentation for:
  - FastAPI: Trace HTTP requests and latency.
  - Celery: Trace background task execution and queuing time.
  - SQLAlchemy: Trace database queries.
  - Redis: Trace broker interactions.
- Jaeger Service: Added `jaeger` container to `docker-compose.yml`, exposing the UI on port 16686.
- Telemetry Utility: Added `app/core/telemetry.py` to simplify OTLP configuration.
This major release transitions the application's persistence layer from SQLite to PostgreSQL, addressing critical scalability and concurrency limitations. It also includes significant stability improvements for Server-Sent Events (SSE) and schema validation.
- Production-Grade Database: Switched from SQLite to PostgreSQL 15 running in Docker. This enables:
  - High Concurrency: Multiple workers can now process tasks simultaneously without `database is locked` errors.
  - Row-Level Locking: Improved data integrity during high-throughput operations.
  - Scalability: The database is now decoupled from the application file system, allowing for independent scaling.
- SSE Real-Time Updates:
  - Fixed: Resolved a critical issue where the UI progress bar remained stuck at 0%.
  - Root Cause: Mismatch between Celery's auto-generated Task ID and the application's Database ID.
  - Resolution: Enforced ID synchronization (`task_id=job.id`) and updated endpoints to use `celery_app.AsyncResult` for correct Redis backend lookups.
- API 500 Errors:
  - Fixed: Resolved `ResponseValidationError` when fetching task status.
  - Root Cause: Pydantic schema field `task_id` did not match SQLAlchemy model field `id`.
  - Resolution: Added `serialization_alias` to `TaskStatusResponse` and correctly mapped eager-loaded relationships.
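The ID mismatch behind the stuck progress bar can be illustrated with a small simulation. This is a sketch under stated assumptions, not the application's code: `RESULT_BACKEND`, `enqueue`, and `lookup_progress` are hypothetical stand-ins for the Redis result backend, task submission, and the status endpoint. In Celery itself, the synchronization corresponds to passing `task_id=` to `apply_async`.

```python
import uuid
from typing import Dict, Optional

# Stand-in for the Redis result backend: maps task IDs to progress state.
RESULT_BACKEND: Dict[str, dict] = {}


def enqueue(payload: dict, task_id: Optional[str] = None) -> str:
    """Mimic a queue that auto-generates an ID unless one is supplied."""
    tid = task_id or str(uuid.uuid4())
    RESULT_BACKEND[tid] = {"state": "PROGRESS", "percent": 40}
    return tid


def lookup_progress(job_id: str) -> dict:
    """Status endpoint stand-in: looks up progress by the database job ID."""
    # An unknown ID looks like a task that has not started yet.
    return RESULT_BACKEND.get(job_id, {"state": "PENDING", "percent": 0})


job_id = str(uuid.uuid4())  # primary key of the job row in the database

# Before the fix: the queue picks its own ID, so the lookup by job ID misses.
enqueue({"n": 1})
before = lookup_progress(job_id)

# After the fix: the job ID is reused as the task ID, so the lookup hits.
enqueue({"n": 1}, task_id=job_id)
after = lookup_progress(job_id)
```

Here `before` reports 0% even though work is in progress, which matches the symptom described above, while `after` reflects the real progress.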
- Unified Docker Stack: Updated `docker-compose.yml` to orchestrate the entire platform (API, Worker, Redis, Postgres) with a single command: `docker-compose up -d`.
- Port Reconfiguration:
  - API: Moved to port 8001 (default) to prevent conflicts with other local services.
  - PostgreSQL: Exposed on port 5433 to avoid clashes with local Postgres instances.
- Verification Scripts: Added `verify_sse.py` and updated `verify_persistence.py` to support the new port configuration and end-to-end testing flows.
- Dependency Updates: Added `asyncpg` (for high-performance async DB access) and `psycopg2-binary` (for robust synchronous Celery worker access).
- Configuration Management: Centralized database connection strings in `app/core/config.py` to support both Sync and Async drivers easily.
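One way to centralize connection strings for both drivers is sketched below. The class name, field names, and default credentials are hypothetical and do not reflect the actual contents of `app/core/config.py`; only port 5433 and the two driver names come from these notes. The `postgresql+asyncpg` and `postgresql+psycopg2` prefixes are the SQLAlchemy dialect names for those drivers.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class DatabaseSettings:
    """Single source of truth for connection details (all defaults are placeholders)."""

    user: str = "postgres"
    password: str = "postgres"
    host: str = "localhost"
    port: int = 5433  # host port exposed by the Docker stack
    name: str = "tasks"

    def _url(self, driver: str) -> str:
        # Percent-encode the password so special characters do not break the URL.
        password = quote(self.password, safe="")
        return (
            f"postgresql+{driver}://{self.user}:{password}"
            f"@{self.host}:{self.port}/{self.name}"
        )

    @property
    def async_url(self) -> str:
        """Connection string for the async API (asyncpg driver)."""
        return self._url("asyncpg")

    @property
    def sync_url(self) -> str:
        """Connection string for the synchronous worker (psycopg2 driver)."""
        return self._url("psycopg2")


settings = DatabaseSettings()
```

Deriving both URLs from the same fields means a change of host, port, or credentials only has to be made in one place.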
This release focuses on providing comprehensive technical guidance for architects, DevOps engineers, and developers implementing long-running operations.
- Infrastructure Sizing Guide: Detailed calculations for sizing Azure AKS clusters, including Pod resources, Node Pools, and HPA/KEDA configurations.
- Best Practices & Anti-Patterns: A guide on the "Claim Check" pattern for large payloads, idempotency, and common pitfalls to avoid.
- Streaming Guide: Technical deep dive into Server-Sent Events (SSE) implementation.
- README.md: Enhanced with a clear "Problem Statement" explaining why long-running HTTP requests are an anti-pattern and how this solution addresses it.
- README.md: Added a "Technical Guides" table for easier navigation.