This document provides essential guidelines for AI agents working on this LangGraph FastAPI Agent project.
This is a production-ready AI agent application built with:
- LangGraph for stateful, multi-step AI agent workflows
- FastAPI for high-performance async REST API endpoints
- Langfuse for LLM observability and tracing
- PostgreSQL + pgvector for long-term memory storage (mem0ai)
- JWT authentication with session management
- Prometheus + Grafana for monitoring
- All imports MUST be at the top of the file - never add imports inside functions or classes
- Use structlog for all logging
- Log messages must be lowercase_with_underscores (e.g., `"user_login_successful"`)
- NO f-strings in structlog events - pass variables as kwargs
- Use `logger.exception()` instead of `logger.error()` to preserve tracebacks
- Example: `logger.info("chat_request_received", session_id=session.id, message_count=len(messages))`
- Always use tenacity library for retry logic
- Configure with exponential backoff
- Example: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))`
- Always use the rich library for formatted console output
- Use rich for progress bars, tables, panels, and formatted text
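A minimal sketch of rich-formatted output (`build_session_table` and its columns are hypothetical, not project code):

```python
from rich.console import Console
from rich.table import Table

def build_session_table(sessions: list[dict]) -> Table:
    # Build a formatted table of sessions instead of raw print() output.
    table = Table(title="Active Sessions")
    table.add_column("Session ID")
    table.add_column("User")
    for session in sessions:
        table.add_row(session["id"], session["user"])
    return table

console = Console()
# console.print(build_session_table(...)) renders the table with borders and styling.
```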
- Only cache successful responses, never cache errors
- Use appropriate cache TTL based on data volatility
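A minimal stdlib sketch of these two caching rules (`ttl_cache` is a hypothetical helper; production code would more likely use Redis or another shared cache):

```python
import time
from typing import Any, Callable

def ttl_cache(ttl_seconds: float) -> Callable:
    """Cache successful results for ttl_seconds; errors are never cached."""
    def decorator(func: Callable) -> Callable:
        store: dict[tuple, tuple[float, Any]] = {}

        def wrapper(*args: Any) -> Any:
            now = time.monotonic()
            if args in store:
                expires_at, value = store[args]
                if now < expires_at:
                    return value
            # If func raises, the exception propagates without storing
            # anything, so a failing call is retried on the next invocation.
            value = func(*args)
            store[args] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator
```

The TTL is a per-decorator argument, so volatile data can use a short TTL and stable data a long one.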
- All routes must have rate limiting decorators
- Use dependency injection for services, database connections, and auth
- All database operations must be async
- Use `async def` for asynchronous operations
- Use type hints for all function signatures
- Prefer Pydantic models over raw dictionaries
- Use functional, declarative programming; avoid classes except for services and agents
- File naming: lowercase with underscores (e.g., `user_routes.py`)
- Use the RORO pattern (Receive an Object, Return an Object)
- Handle errors at the beginning of functions
- Use early returns for error conditions
- Place the happy path last in the function
- Use guard clauses for preconditions
- Use `HTTPException` for expected errors with appropriate status codes
- Use `StateGraph` for building AI agent workflows
- Define clear state schemas using Pydantic models (see `app/schemas/graph.py`)
- Use `CompiledStateGraph` for production workflows
- Implement `AsyncPostgresSaver` for checkpointing and persistence
- Use `Command` for controlling graph flow between nodes
- Use LangChain's `CallbackHandler` from Langfuse for tracing all LLM calls
- All LLM operations must have Langfuse tracing enabled
- Use `AsyncMemory` for semantic memory storage
- Store memories per user_id for personalized experiences
- Use async methods: `add()`, `get()`, `search()`, `delete()`
- Use JWT tokens for authentication
- Implement session-based user management (see `app/api/v1/auth.py`)
- Use the `get_current_session` dependency for protected endpoints
- Store sensitive data in environment variables
- Validate all user inputs with Pydantic models
- Use SQLModel for ORM models (combines SQLAlchemy + Pydantic)
- Define models in the `app/models/` directory
- Use async database operations with asyncpg
- Use LangGraph's AsyncPostgresSaver for agent checkpointing
- Minimize blocking I/O operations
- Use async for all database and external API calls
- Implement caching for frequently accessed data
- Use connection pooling for database connections
- Optimize LLM calls with streaming responses
- Integrate Langfuse for LLM tracing on all agent operations
- Export Prometheus metrics for API performance
- Use structured logging with context binding (request_id, session_id, user_id)
- Track LLM inference duration, token usage, and costs
- Implement metric-based evaluations for LLM outputs (see the `evals/` directory)
- Create custom evaluation metrics as markdown files in `evals/metrics/prompts/`
- Use Langfuse traces for evaluation data sources
- Generate JSON reports with success rates
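A minimal stdlib sketch of report generation (`build_eval_report` and the result shape are hypothetical):

```python
import json

def build_eval_report(results: list[dict]) -> dict:
    # results are hypothetical per-case outcomes, e.g. {"passed": bool, ...}.
    total = len(results)
    passed = sum(1 for r in results if r["passed"])
    return {
        "total_cases": total,
        "passed": passed,
        "success_rate": round(passed / total, 4) if total else 0.0,
    }

def write_report(results: list[dict], path: str) -> None:
    # Persist the summary as a JSON report.
    with open(path, "w") as f:
        json.dump(build_eval_report(results), f, indent=2)
```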
- Use environment-specific configuration files (`.env.development`, `.env.staging`, `.env.production`)
- Use Pydantic Settings for type-safe configuration (see `app/core/config.py`)
- Never hardcode secrets or API keys
- FastAPI - Web framework
- LangGraph - Agent workflow orchestration
- LangChain - LLM abstraction and tools
- Langfuse - LLM observability and tracing
- Pydantic v2 - Data validation and settings
- structlog - Structured logging
- mem0ai - Long-term memory management
- PostgreSQL + pgvector - Database and vector storage
- SQLModel - ORM for database models
- tenacity - Retry logic
- rich - Terminal formatting
- slowapi - Rate limiting
- prometheus-client - Metrics collection
- All routes must have rate limiting decorators
- All LLM operations must have Langfuse tracing
- All async operations must have proper error handling
- All logs must follow structured logging format with lowercase_underscore event names
- All retries must use tenacity library
- All console outputs should use rich formatting
- All caching should only store successful responses
- All imports must be at the top of files
- All database operations must be async
- All endpoints must have proper type hints and Pydantic models
- ❌ Using f-strings in structlog events
- ❌ Adding imports inside functions
- ❌ Forgetting rate limiting decorators on routes
- ❌ Missing Langfuse tracing on LLM calls
- ❌ Caching error responses
- ❌ Using `logger.error()` instead of `logger.exception()` for exceptions
- ❌ Blocking I/O operations without async
- ❌ Hardcoding secrets or API keys
- ❌ Missing type hints on function signatures
Before modifying code:
- Read the existing implementation first
- Check for related patterns in the codebase
- Ensure consistency with existing code style
- Add appropriate logging with structured format
- Include error handling with early returns
- Add type hints and Pydantic models
- Verify Langfuse tracing is enabled for LLM calls
- LangGraph Documentation: https://langchain-ai.github.io/langgraph/
- LangChain Documentation: https://python.langchain.com/docs/
- FastAPI Documentation: https://fastapi.tiangolo.com/
- Langfuse Documentation: https://langfuse.com/docs