feat(api): add observability baseline with structured logging and telemetry #6
Merged
raychrisgdp merged 19 commits into main on Jan 3, 2026
- Set DEBUG to false in .env.example for production readiness.
- Removed unnecessary database and LLM configuration options from .env.example.
- Updated the Typer dependency in pyproject.toml and uv.lock to remove the 'all' extra, simplifying the installation process.
- Improved developer quickstart instructions for installing dependencies and running the application.
- Enhanced PR-002 task CRUD API documentation with additional details on response shapes and pagination.

These changes aim to streamline configuration, clarify setup instructions, and improve API documentation.

- Added a new API v1 for task management, including endpoints for creating, retrieving, updating, and deleting tasks.
- Introduced task schemas for request validation and response formatting.
- Implemented error handling for task not found scenarios with a standardized error response.
- Updated the Makefile to include precommit checks in the test coverage command.
- Removed linting step from CI workflow to streamline the testing process.

These changes enhance the API functionality for task management and improve error handling, contributing to a more robust application.
- Add structured JSON logging with redaction filter
- Implement request correlation IDs via middleware
- Add telemetry endpoint with DB health and migration version
- Add comprehensive test coverage (27 tests)
- Update PR-016 spec with implementation details

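A minimal sketch of how a redaction filter like the one named in this commit could work with the standard `logging` module. The names `SENSITIVE_KEYS`, `EMAIL_RE`, `redact`, and `RedactionFilter`, the exact-match key comparison, and the email regex are assumptions made for illustration, not taken from `backend/logging.py`. The key list and the `[redacted]` / `[redacted-email]` placeholders follow the PR description.

```python
import logging
import re

# Assumed key list and placeholders, mirroring the PR description.
SENSITIVE_KEYS = {"authorization", "token", "password", "secret", "cookie", "email"}
# Deliberately simple email pattern; the real filter may use a different one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(value):
    """Recursively mask sensitive dict keys and email-like substrings."""
    if isinstance(value, dict):
        return {
            k: "[redacted]" if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[redacted-email]", value)
    return value


class RedactionFilter(logging.Filter):
    """Rewrites the record in place before any formatter sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.msg)
        if isinstance(record.args, dict):
            record.args = redact(record.args)
        elif record.args:
            record.args = tuple(redact(a) for a in record.args)
        return True  # never drop the record, only sanitize it
```

Attaching the filter to each handler (rather than to one logger) would be one way to make sure both the stdout and file outputs see sanitized records.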
- Replace magic values with constants (HTTP_OK, UUID_LENGTH, etc.)
- Move imports to top level
- Remove unused imports
- Fix PLR2004 and PLC0415 violations

- Add model_validator to TaskUpdate to reject title: null (prevents DB integrity errors)
- Fix async generator return type annotation in test fixture
- Add noqa comment for magic number in pagination test
- Add test for null title rejection

Fixes CI/CD issues: mypy errors and ruff warnings

feat(api): implement task CRUD API endpoints

- Add structured JSON logging with redaction filter
- Implement request correlation IDs via middleware
- Add telemetry endpoint with DB health and migration version
- Add comprehensive test coverage (27 tests)
- Update PR-016 spec with implementation details

- Replace magic values with constants (HTTP_OK, UUID_LENGTH, etc.)
- Move imports to top level
- Remove unused imports
- Fix PLR2004 and PLC0415 violations

- Set logger level explicitly for backend.middleware logger
- Use caplog.at_level() with specific logger name to capture logs
- Fixes test failures where logs weren't being captured

- Set logger.propagate = True to ensure logs reach root logger
- Set root logger level explicitly for caplog capture
- Fixes test failures after rebase onto main

- Configure logger at module level to ensure logs are captured
- Set logger levels to DEBUG in test functions for better isolation
- Ensures tests pass when run individually or with PR-016 test suite
- Note: test isolation issue persists when running full test suite in parallel

- Merge tasks router and telemetry router in main.py
- Keep logger configuration in test_middleware.py
- Keep type annotation in api/v1/__init__.py
…dler

- Use custom LogCaptureHandler instead of caplog for better isolation
- Set propagate=False to avoid interference from setup_logging()
- Ensure logger is enabled and configured right before request
- Add verification checks to ensure logger is properly configured

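The capture approach described in this commit can be sketched with the standard library alone. This is an illustration under assumptions: the handler body and the `capture_logs` helper are hypothetical and are not the code in `tests/test_middleware.py`. Only the `LogCaptureHandler` name, `propagate=False`, and the `backend.middleware` logger name come from the commit messages.

```python
import logging


class LogCaptureHandler(logging.Handler):
    """Collects emitted records in a list so a test can assert on them."""

    def __init__(self) -> None:
        super().__init__(level=logging.DEBUG)
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def capture_logs(logger_name: str) -> tuple[logging.Logger, LogCaptureHandler]:
    """Hypothetical helper: configure the named logger right before a request."""
    logger = logging.getLogger(logger_name)
    handler = LogCaptureHandler()
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep root-logger configuration out of the test
    logger.disabled = False   # a prior logging config call may have disabled it
    return logger, handler
```

A test would call `capture_logs("backend.middleware")`, issue the request, and inspect `handler.records`. A real fixture would also remove the handler and restore `propagate` afterwards so state does not leak between tests.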
- Added details on structured logging and telemetry configuration.
- Updated logging section with environment variables and examples.
- Included information about the telemetry endpoint and its usage.
- Clarified request_id handling in logging format.

- Added 'make lock' target to update the uv.lock file after modifying pyproject.toml.
- Enhanced 'make dev' and 'make install-all' to use 'uv sync' if uv.lock exists, ensuring consistent dependency installation.
- Updated CI workflow to reflect changes in dependency installation logic.
Summary
Issues & Goals:
Implementation Highlights:
- `backend/logging.py`: New JSON formatter with context-aware `request_id` field, redaction filter for sensitive keys and email addresses, and dual output (stdout + rotating file handler)
- `backend/middleware.py`: Request logging middleware that generates/reuses correlation IDs, logs HTTP requests with method/path/status/duration, handles exceptions with error logging, and echoes `X-Request-Id` in response headers
- `backend/api/v1/telemetry.py`: New `/api/v1/telemetry` endpoint providing system health metrics (status, version, uptime), database connectivity and migration version, graceful degradation on errors, and optional metrics placeholders for future PRs
- `backend/config.py`: Added `LOG_LEVEL`, `TELEMETRY_ENABLED`, and `LOG_FILE_PATH` settings with helper methods for log level resolution and file path defaults
- `backend/main.py`, `backend/cli/main.py`: Integrated logging setup in FastAPI lifespan and CLI startup, registered middleware and telemetry router conditionally based on settings
- `docs/USER_GUIDE.md`, `README.md`: Added comprehensive observability section covering environment variables, telemetry endpoint usage, and log file configuration

How to Test
Prerequisites:
Backend API Testing:
Start the API server:
Test structured logging:
- Request: `curl http://127.0.0.1:8080/health`
- Log fields: `timestamp`, `level`, `logger`, `message`, `request_id`
- `request_id` is present (UUID4 format) and echoed in response headers as `X-Request-Id`
- Log file: `~/.taskgenie/logs/taskgenie.jsonl` (or configured `LOG_FILE_PATH`)

Test request logging middleware:
- Requests: `GET /health`, `GET /api/v1/tasks`
- `http_request` event with fields: `method`, `path`, `status`, `duration_ms`
- `X-Request-Id` header is present in all responses
- `curl -H "X-Request-Id: test-id-123" http://127.0.0.1:8080/health` and verify the same ID is echoed back

Test log redaction:
- Sensitive keys (`authorization`, `token`, `password`, `secret`, `cookie`, `email`) are redacted as `[redacted]`
- Email addresses are redacted as `[redacted-email]`

Test telemetry endpoint:
- Response fields: `status` ("ok" or "degraded"), `version`, `uptime_s`, `db.connected`, `db.migration_version`
- `optional.event_queue_size` and `optional.agent_runs_active` are present with `null` values
- Degraded case: `status="degraded"` with error message in `db.error`

Test configuration:
- Set `LOG_LEVEL=DEBUG` and verify debug logs appear
- Set `TELEMETRY_ENABLED=false` and verify `/api/v1/telemetry` returns 404
- Set `LOG_FILE_PATH=/tmp/test.jsonl` and verify logs are written to custom path
- `DEBUG=true` automatically sets log level to DEBUG

CLI Testing:
- `request_id` is `null` for CLI operations (not in request context)

Expected Behavior:
- `X-Request-Id` header for tracing

Related Issues
Author Checklist
- `main` branch

Additional Notes
Key Implementation Areas for Review
Backend API:
- `backend/logging.py`: JSON formatter implementation, redaction filter logic, context variable usage for request ID propagation
- `backend/middleware.py`: Request ID generation/reuse logic, HTTP request logging, exception handling, response header injection
- `backend/api/v1/telemetry.py`: Database health checks, migration version retrieval, graceful error handling
- `backend/config.py`: New observability settings and helper methods for log level/file path resolution
- `backend/main.py`: Logging setup in lifespan, middleware registration, conditional telemetry router registration

Testing:
- `tests/test_logging.py`: Unit tests for JSON formatter, redaction filter, and logging setup
- `tests/test_middleware.py`: Middleware tests for request ID handling, request logging, error logging
- `tests/api/test_telemetry.py`: Integration tests for telemetry endpoint response shape and degraded status

Documentation:
- `docs/USER_GUIDE.md`: Comprehensive observability section with environment variables and telemetry usage
- `README.md`: Quick reference for observability features

Testing Notes
- Log fields (`timestamp`, `level`, `logger`, `message`, `request_id`, `event`, `method`, `path`, `status`, `duration_ms`)
- Telemetry fields: `status`, `version`, `uptime_s`, `db.connected`, `db.migration_version`
- `~/.taskgenie/logs/taskgenie.jsonl` with proper JSON formatting
- `request_id` is properly scoped per request and doesn't leak between requests - Verified via manual testing
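
The per-request scoping of `request_id` noted above can be illustrated with a context variable and a JSON formatter. This is a sketch under assumptions, not the code in `backend/logging.py` or `backend/middleware.py`: `request_id_var`, `JsonFormatter`, and `handle_request` are hypothetical names, and the handler stands in for the real middleware. It relies on the fact that each `asyncio` task runs in its own copy of the context, so a value set during one request is not visible to another.

```python
import asyncio
import contextvars
import json
import logging
import uuid
from datetime import datetime, timezone

# Hypothetical context variable; None outside a request (for example in the CLI),
# which serializes to "request_id": null.
request_id_var = contextvars.ContextVar("request_id", default=None)


class JsonFormatter(logging.Formatter):
    """Emits one JSON object per record, reading request_id from the context."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        return json.dumps(payload)


async def handle_request(logger: logging.Logger, incoming_id=None) -> str:
    """Stand-in for the middleware: reuse the caller's ID or generate a UUID4."""
    request_id = incoming_id or str(uuid.uuid4())
    token = request_id_var.set(request_id)
    try:
        await asyncio.sleep(0)  # yield so concurrent requests interleave
        logger.info("http_request")
        return request_id  # the real middleware would echo this as X-Request-Id
    finally:
        request_id_var.reset(token)  # restore the previous value on exit
```

Running two `handle_request` calls under `asyncio.gather` should produce two log lines with distinct `request_id` values, while a log call made outside any request reports `null`.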