Description
@awsapm can you help triage this issue to find the root cause?
Root Cause Analysis
Primary Issue: Data Inconsistency in Nutrition Service
- High Error Rate: The nutrition-service-nodejs service has a 65.22% error rate on the GET /nutrition/:pet_type endpoint, with failing requests returning 404 Not Found.
- Database Query Failures: The service connects to MongoDB (test|mongodb|27017) successfully but finds no nutrition data for specific pet types, so it responds with 404.
- Agent Fallback Behavior: When the Bedrock nutrition agent (using Claude 3.5 Haiku) can't retrieve your clinic's specific nutrition products via the get_dietary_restrictions tool, it is likely falling back to its general knowledge and recommending generic nutritional products.
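To see which pet types are affected, one option is to probe the endpoint for each supported pet type and collect the ones that return 404. The sketch below is a minimal example and not the service's own tooling: the base URL `http://localhost:3000` and the function names are assumptions, and only the `/nutrition/<pet_type>` path comes from the findings above.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable

# Assumption: where the nutrition service is reachable; replace with your own URL.
BASE_URL = "http://localhost:3000"


def http_status(pet_type: str, base_url: str = BASE_URL) -> int:
    """Return the HTTP status code of GET /nutrition/<pet_type>."""
    url = f"{base_url}/nutrition/{urllib.parse.quote(pet_type)}"
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        # Connection failures (URLError) are not caught and propagate to the caller.
        return err.code


def find_missing_pet_types(
    pet_types: Iterable[str],
    fetch_status: Callable[[str], int] = http_status,
) -> list[str]:
    """Return the pet types whose lookup comes back as 404 Not Found."""
    return [p for p in pet_types if fetch_status(p) == 404]
```

Calling `find_missing_pet_types` with your list of supported pet types returns the subset that has no nutrition data. Passing `fetch_status` explicitly lets the same check run against a stub instead of the live service.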
Key Findings from Telemetry Data
• Nutrition Service Issues:
  • 65.22% error rate on pet-specific nutrition queries
  • Latency spikes up to 161ms on successful queries
  • Database connection working but data retrieval failing
• Agent Performance:
  • 50+ second latency on agent invocations
  • Successfully calling tools but getting empty/error responses
  • Using Claude 3.5 Haiku model for recommendations
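The aggregate error rate does not show which pet types are failing. As a rough sketch, assuming request records can be exported as `(pet_type, status_code)` pairs (a hypothetical format, not one the telemetry is known to provide), the rate can be broken down per pet type:

```python
from collections import Counter
from typing import Iterable, Tuple


def error_rate_by_pet_type(records: Iterable[Tuple[str, int]]) -> dict[str, float]:
    """Map each pet type to the percentage of its requests with status >= 400."""
    totals: Counter = Counter()
    errors: Counter = Counter()
    for pet_type, status in records:
        totals[pet_type] += 1
        if status >= 400:
            errors[pet_type] += 1
    # Every key in totals has at least one request, so the division is safe.
    return {p: round(100 * errors[p] / totals[p], 2) for p in totals}
```

A pet type at or near 100% points to missing catalog data for that type, while errors spread evenly across all types would suggest a different cause.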
Immediate Action Items
- Fix Nutrition Database:
  • Verify your nutrition product catalog is properly loaded in MongoDB
  • Check if pet type parameters are correctly mapped in the database
  • Ensure the nutrition service can find products for all supported pet types
- Update Agent Configuration:
  • Configure the agent to only recommend products from your verified inventory
  • Add validation to prevent recommendations when nutrition service returns errors
  • Implement fallback messaging when product data is unavailable
- Monitor Data Pipeline:
  • Set up alerts for nutrition service 404 errors
  • Implement health checks for the nutrition database
  • Add logging to track which pet types are failing lookups
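The agent-side validation, fallback messaging, and failure logging above could be combined in a small guard around the nutrition lookup. This is a minimal sketch under stated assumptions: `LookupResult`, `guarded_recommendation`, the logger name, and the fallback wording are all hypothetical, and the real interface of the get_dietary_restrictions tool may differ.

```python
import logging
from dataclasses import dataclass
from typing import Callable

# Assumption: logger name chosen for illustration only.
logger = logging.getLogger("nutrition_agent")

# Assumption: example wording; adjust to your clinic's messaging.
FALLBACK_MESSAGE = (
    "Product information for this pet type is currently unavailable. "
    "Please contact the clinic for nutrition recommendations."
)


@dataclass
class LookupResult:
    """Hypothetical shape of a nutrition service response."""
    status: int
    products: list  # product names from the verified inventory; empty on failure


def guarded_recommendation(
    pet_type: str, lookup: Callable[[str], LookupResult]
) -> str:
    """Recommend only verified products; otherwise return the fallback message."""
    result = lookup(pet_type)
    if result.status != 200 or not result.products:
        # Record which pet type failed so the failing lookups can be tracked.
        logger.warning(
            "nutrition lookup failed: pet_type=%s status=%s", pet_type, result.status
        )
        return FALLBACK_MESSAGE
    names = ", ".join(str(p) for p in result.products)
    return f"Verified products for {pet_type}: {names}"
```

Returning a fixed message on any non-200 or empty response keeps the model from filling the gap with generic products, and the warning log gives the alerting in the last item something to key on.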