
Commit 05f4060

Kamal Sai Devarapalli authored and committed
Clean up documentation: reduce emojis and update service references
- Remove checkmark emojis from LOAD_TEST_RESULTS.md
- Update Redis documentation (booking → taskprocessing references)
- Update Vault setup paths (booking → taskprocessing)
- Update log monitoring docs with correct service names
- Add legacy class name notes where appropriate
1 parent 5ad055a commit 05f4060

File tree

5 files changed: +24 −25 lines


LOAD_TEST_RESULTS.md

Lines changed: 5 additions & 5 deletions
````diff
@@ -81,13 +81,13 @@

 ## Key Observations

-### ✅ Excellent Performance
+### Excellent Performance
 1. **Response Times:** All services show sub-15ms average response times
 2. **Throughput:** All services handle **1,200-1,450 req/sec** easily
 3. **Success Rate:** 100% success rate for all working endpoints
 4. **Stability:** No errors or timeouts during testing

-### ✅ Gunicorn Configuration Working
+### Gunicorn Configuration Working
 - **4 workers confirmed** running in each service
 - Workers are handling concurrent requests efficiently
 - No worker exhaustion or queuing delays observed
@@ -111,7 +111,7 @@

 ### After (Gunicorn - 4 Workers × 2 Threads)
 - **Concurrent requests:** 8 simultaneous
-- **Actual throughput:** ✅ **1,200-1,450 req/sec**
+- **Actual throughput:** **1,200-1,450 req/sec**
 - **Response times:** 12-15ms average (excellent)

 **Improvement:** ~12-25x increase in throughput!
@@ -123,7 +123,7 @@
 ### Current Configuration (4 workers × 2 threads)
 - **Theoretical max concurrent:** 8 requests per instance
 - **Actual measured throughput:** ~1,300 req/sec per service
-- **Target capacity (1000-2000 req/sec):** ✅ **ACHIEVED**
+- **Target capacity (1000-2000 req/sec):** **ACHIEVED**

 ### Scaling Recommendations

@@ -150,7 +150,7 @@ For **higher loads** (2000+ req/sec), you can:

 ## Conclusion

-**Gunicorn configuration is working perfectly!** ✅
+**Gunicorn configuration is working perfectly!**

 - All services are handling **1,200-1,450 requests/second**
 - Response times are excellent (12-15ms average)
````
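For reference, a `gunicorn.conf.py` matching the setup these results describe (4 workers × 2 threads) might look like the sketch below — a minimal illustration, not the repo's actual config; the bind port and timeout are assumptions:

```python
# Hypothetical gunicorn.conf.py for the tested setup (4 workers × 2 threads).
bind = "0.0.0.0:5000"     # assumed port; each service binds its own
workers = 4               # four worker processes, as confirmed in the load test
threads = 2               # two threads per worker -> 8 concurrent requests per instance
worker_class = "gthread"  # threaded worker; gunicorn uses this when threads > 1
timeout = 30              # assumed; tune for the slowest expected request
```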

REDIS_SETUP.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -68,7 +68,7 @@ user = redis_helper.get_cached_user(123)
 | Service | Redis DB | Purpose |
 |---------|----------|---------|
 | User Management | 0 | User cache, sessions, rate limiting |
-| Booking | 1 | Booking cache, flight availability |
+| Task Processing | 1 | Task cache, task data |
 | Notification | 2 | Notification cache |

 ## Common Use Cases
@@ -176,7 +176,7 @@ KEYS user:*

 # Booking keys (DB 1)
 docker-compose exec redis redis-cli -n 1
-KEYS booking:*
+KEYS booking:*  # Legacy key pattern (task processing uses this pattern)
 ```

 ## Next Steps
````
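As context for the DB-per-service split above, a minimal redis-py sketch of how a service could pin itself to its assigned database — the factory name and environment defaults are assumptions, not code from this repo:

```python
import os

import redis

# Hypothetical factory: each service sets REDIS_DB to its assigned database
# (0 = User Management, 1 = Task Processing, 2 = Notification).
def get_redis_client() -> redis.Redis:
    return redis.Redis(
        host=os.getenv("REDIS_HOST", "redis"),
        port=int(os.getenv("REDIS_PORT", "6379")),
        db=int(os.getenv("REDIS_DB", "1")),  # 1 = Task Processing in this layout
        password=os.getenv("REDIS_PASSWORD") or None,
        decode_responses=True,
    )

client = get_redis_client()
client.set("booking:456", '{"taskId": 456}')  # legacy "booking:" key pattern noted above
```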

VAULT_SETUP.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -112,7 +112,7 @@ secret/
   db              # Database credentials
   jwt             # JWT secret key
   kafka           # Kafka credentials
-  booking/
+  taskprocessing/
     db
     external-api
   notification/
````
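To illustrate the renamed path, a sketch of reading the secret with the hvac client — the client code, mount point, and KV v2 engine are assumptions; the repo's actual Vault helper may differ:

```python
import hvac

# Hypothetical read of the renamed secret path (KV v2 assumed).
client = hvac.Client(url="http://localhost:8200", token="dev-root-token")
secret = client.secrets.kv.v2.read_secret_version(
    mount_point="secret",
    path="taskprocessing/db",  # was "booking/db" before this commit
)
db_creds = secret["data"]["data"]  # KV v2 nests the payload under data.data
```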

docs/log_monitoring_system.md

Lines changed: 1 addition & 1 deletion
The file is stored on a single line, so the diff replaces the whole line; the only change is the service list in the Architecture diagram:

````diff
@@ -1 +1 @@
-# Real-Time Log Monitoring System ... Microservices (User Mgmt, Booking, etc) ...
+# Real-Time Log Monitoring System ... Microservices (User Mgmt, Task Processing, Notification, etc) ...
````

The updated document, reflowed for readability:

# Real-Time Log Monitoring System

## Overview

A production-ready real-time log monitoring system that:

- **Collects logs** from multiple microservices
- **Streams logs** to Apache Kafka
- **Filters errors** in real-time
- **Provides dashboard** for visualization
- **Integrates with Grafana** for advanced monitoring

## Architecture

```
Microservices (User Mgmt, Task Processing, Notification, etc)
        | Logs via Kafka Handler
        v
Apache Kafka
  Topics:
    - application-logs
    - application-logs-errors
        | Consumer filters errors
        v
Log Monitor Service
  - Error Store
  - API Endpoints
  - Dashboard
        |
        v
Dashboard (HTML)    Grafana Integration    REST API
```

## Components

### 1. Kafka Log Handler

**Location**: `common/pyportal_common/logging_handlers/kafka_log_handler.py`

- Custom Python logging handler
- Automatically sends logs to Kafka
- Separates errors into `application-logs-errors` topic
- Includes metadata: service name, host, timestamp, etc.

### 2. Log Monitor Service

**Location**: `services/logmonitor/`

- Consumes logs from Kafka
- Filters ERROR and CRITICAL level logs
- Stores errors in-memory (can be extended to a database)
- Provides REST API for dashboard
- Grafana-compatible endpoints

### 3. Dashboard

**Location**: `services/logmonitor/app/dashboard.html`

- Real-time error visualization
- Auto-refresh capability
- Statistics display
- Error details view

## Setup

### 1. Start Services

```bash
# Start all services including log monitor
docker-compose up -d

# Or start just log monitor
docker-compose up -d logmonitor-service
```

### 2. Access Dashboard

Open browser: http://localhost:5004

### 3. Verify Log Collection

```bash
# Check if logs are being sent to Kafka
docker-compose exec kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic application-logs \
  --from-beginning

# Check error logs
docker-compose exec kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic application-logs-errors \
  --from-beginning
```

## API Endpoints

### Get Error Logs

```bash
GET /api/v1/logs/errors?limit=100&service=usermanagement
```

### Get Statistics

```bash
GET /api/v1/logs/stats
```

### Get Services

```bash
GET /api/v1/logs/services
```

### Get Service-Specific Errors

```bash
GET /api/v1/logs/errors/usermanagement?limit=50
```

## Grafana Integration

### Setup Grafana Data Source

1. **Add Data Source in Grafana:**
   - Type: JSON API
   - URL: http://localhost:5004/api/v1/grafana
   - Access: Server (default)
2. **Query Endpoints:**
   - Search: `/api/v1/grafana/search`
   - Query: `/api/v1/grafana/query`
   - Annotations: `/api/v1/grafana/annotations`

### Example Grafana Queries

**Error Count:**

```json
{ "target": "error_count" }
```

**Errors by Service:**

```json
{ "target": "error_by_service" }
```

**Errors by Level:**

```json
{ "target": "error_by_level" }
```

## How It Works

### 1. Log Collection

Each microservice automatically sends logs to Kafka:

```python
# In service __init__.py
from common.pyportal_common.logging_handlers.base_logger import LogMonitor

# Logger automatically configured with Kafka handler
logger = LogMonitor("usermanagement").logger

# All log calls go to Kafka
logger.info("User created")
logger.error("Database connection failed")  # Goes to error topic
```

### 2. Error Filtering

Log Monitor Service consumes from Kafka:

```python
# Consumes from:
# - application-logs-errors (pre-filtered)
# - application-logs (filters for ERROR/CRITICAL)

# Stores errors in ErrorLogStore
error_store.add_error(log_data)
```

### 3. Dashboard Display

Dashboard polls the API every 5 seconds:

```javascript
// Auto-refresh
setInterval(() => {
  fetch('/api/v1/logs/errors?limit=50')
    .then(res => res.json())
    .then(data => updateErrors(data.errors));
}, 5000);
```

## Configuration

### Environment Variables

**Log Monitor Service:**

```bash
LOG_MONITOR_SERVER_PORT=9094
KAFKA_BOOTSTRAP_SERVERS=kafka:29092
SERVICE_NAME=logmonitor
```

**Microservices (for Kafka logging):**

```bash
KAFKA_BOOTSTRAP_SERVERS=kafka:29092
SERVICE_NAME=usermanagement  # or booking, notification, etc.
HOSTNAME=usermanagement-service
```

## Features

### Real-Time Monitoring

- Logs streamed to Kafka in real-time
- Dashboard auto-refreshes every 5 seconds
- No polling delays

### Error Filtering

- Automatic filtering of ERROR and CRITICAL logs
- Separate Kafka topic for errors
- Efficient processing

### Multi-Service Support

- Collects logs from all microservices
- Service identification in logs
- Per-service error statistics

### Grafana Ready

- Compatible with Grafana JSON API data source
- Time series data support
- Annotation support for events

### Production Ready

- Error handling and resilience
- Connection pooling
- Graceful degradation

## Extending the System

### Store Errors in Database

Replace the in-memory store with a database:

```python
# In kafka_consumer.py
from app.models.error_log import ErrorLog

def add_error_to_db(error_log):
    error = ErrorLog(
        timestamp=error_log['timestamp'],
        level=error_log['level'],
        service=error_log['service'],
        message=error_log['message']
    )
    db.session.add(error)
    db.session.commit()
```

### Add Alerting

```python
# In kafka_consumer.py
def check_and_alert(error_log):
    if error_log['level'] == 'CRITICAL':
        send_alert_email(error_log)
        send_slack_notification(error_log)
```

### Add Log Retention

```python
# Clean old errors
def cleanup_old_errors():
    cutoff = datetime.utcnow() - timedelta(days=7)
    ErrorLog.query.filter(ErrorLog.timestamp < cutoff).delete()
    db.session.commit()
```

## Monitoring Best Practices

1. **Set Appropriate Log Levels**
   - Use ERROR for recoverable errors
   - Use CRITICAL for system failures
2. **Include Context**
   - Service name
   - Request ID
   - User ID (if applicable)
3. **Monitor Dashboard Regularly**
   - Check for error spikes
   - Identify problematic services
   - Track error trends
4. **Set Up Alerts**
   - Critical error threshold
   - Error rate threshold
   - Service-specific alerts

## Troubleshooting

### Logs Not Appearing

1. Check Kafka connection:
   ```bash
   docker-compose exec kafka kafka-topics --list --bootstrap-server localhost:9092
   ```
2. Check log monitor service:
   ```bash
   docker-compose logs logmonitor-service
   ```
3. Verify environment variables:
   ```bash
   docker-compose exec usermanagement-service env | grep KAFKA
   ```

### Dashboard Not Loading

1. Check the service is running:
   ```bash
   docker-compose ps logmonitor-service
   ```
2. Check the API endpoint:
   ```bash
   curl http://localhost:5004/api/v1/logs/stats
   ```

## Performance Considerations

- **Kafka Topics**: Separate topics for errors improve filtering
- **Batch Processing**: Consider batching log writes
- **Storage**: In-memory store is fast but limited; use a DB for production
- **Rate Limiting**: Monitor Kafka throughput

## Security

- **Authentication**: Add authentication to API endpoints
- **Authorization**: Restrict dashboard access
- **Encryption**: Use TLS for Kafka in production
- **Log Sanitization**: Remove sensitive data from logs

## Next Steps

1. System is implemented and ready
2. Add database persistence for errors
3. Implement alerting (email/Slack)
4. Add log retention policies
5. Set up Grafana dashboards
6. Add authentication to dashboard
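The diff doesn't show `kafka_log_handler.py` itself; as a rough sketch of the behavior the Components section describes (every record to `application-logs`, ERROR/CRITICAL mirrored to `application-logs-errors`), using the kafka-python client — the class shape and payload fields are assumptions:

```python
import json
import logging
import socket
from datetime import datetime, timezone

from kafka import KafkaProducer  # kafka-python; the repo's handler may use a different client


class KafkaLogHandler(logging.Handler):
    """Hypothetical handler: ships log records to Kafka with service metadata."""

    def __init__(self, bootstrap_servers: str, service_name: str):
        super().__init__()
        self.service_name = service_name
        self.producer = KafkaProducer(
            bootstrap_servers=bootstrap_servers,
            value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        )

    def emit(self, record: logging.LogRecord) -> None:
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "service": self.service_name,
            "host": socket.gethostname(),
            "message": record.getMessage(),
        }
        self.producer.send("application-logs", payload)
        if record.levelno >= logging.ERROR:  # mirror errors to the error topic
            self.producer.send("application-logs-errors", payload)
```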

docs/redis_integration.md

Lines changed: 15 additions & 16 deletions
````diff
@@ -25,7 +25,7 @@ Each service uses a separate Redis database to avoid key conflicts:
 | Service | Redis DB | Usage |
 |---------|----------|-------|
 | User Management | 0 | User cache, sessions, rate limiting |
-| Booking | 1 | Booking cache, flight availability |
+| Task Processing | 1 | Task cache, task data (legacy: BookingRedisHelper) |
 | Notification | 2 | Notification cache, delivery status |

 ## Usage Examples
@@ -77,28 +77,27 @@ is_allowed, remaining = redis_helper.check_rate_limit(
 )
 ```

-### Booking Service
+### Task Processing Service

 ```python
-from app.redis_helper import BookingRedisHelper
+from app.redis_helper import BookingRedisHelper  # Legacy class name, used for task processing

 # Initialize helper
 redis_helper = BookingRedisHelper()

-# Cache booking
-booking_data = {
-    'bookingId': 456,
+# Cache task data
+task_data = {
+    'taskId': 456,
     'userId': 123,
-    'flightId': 789,
-    'numberOfSeats': 2,
-    'status': 'confirmed'
+    'status': 'processing',
+    'details': {...}
 }
-redis_helper.cache_booking(456, booking_data, ttl=3600)
+redis_helper.cache_booking(456, task_data, ttl=3600)  # Legacy method name

-# Get cached booking
-cached_booking = redis_helper.get_cached_booking(456)
+# Get cached task
+cached_task = redis_helper.get_cached_booking(456)  # Legacy method name

-# Cache flight availability
+# Cache task-related data
 redis_helper.cache_flight_availability(
     flight_id=789,
     available_seats=10,
@@ -152,7 +151,7 @@ REDIS_PASSWORD= # Redis password (optional)
 Each service uses its own Redis database:

 - **User Management**: `REDIS_DB=0`
-- **Booking**: `REDIS_DB=1`
+- **Task Processing**: `REDIS_DB=1`
 - **Notification**: `REDIS_DB=2`

 ## Caching Strategies
@@ -290,8 +289,8 @@ Use consistent key naming for easier management:
 - **Users**: `user:{user_id}`
 - **User Lookup**: `user:lookup:username:{username}`, `user:lookup:email:{email}`
 - **Sessions**: `session:{session_id}`
-- **Bookings**: `booking:{booking_id}`
-- **User Bookings**: `user:bookings:{user_id}`
+- **Tasks**: `booking:{task_id}` (legacy key pattern, used for task processing)
+- **User Tasks**: `user:bookings:{user_id}` (legacy key pattern)
 - **Flight Availability**: `flight:availability:{flight_id}`
 - **Rate Limits**: `rate_limit:{service}:{identifier}`
````
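Given the legacy names called out above, a thin adapter could let task-processing call sites use task-flavored names before the helper itself is renamed — a hypothetical sketch, not code from this repo:

```python
from app.redis_helper import BookingRedisHelper  # legacy class name


class TaskRedisHelper(BookingRedisHelper):
    """Hypothetical adapter: task-flavored names delegating to the legacy methods."""

    def cache_task(self, task_id, task_data, ttl=3600):
        return self.cache_booking(task_id, task_data, ttl=ttl)

    def get_cached_task(self, task_id):
        return self.get_cached_booking(task_id)
```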

0 commit comments