A cloud-native Python microservice implementing the UPEE (Understand → Plan → Execute → Evaluate) loop for intelligent chat interactions with multi-provider LLM support.
- Python 3.11+
- pip (or Poetry)

1. **Clone the repository**

   ```bash
   git clone https://github.com/your-org/paf-core-agent.git
   cd paf-core-agent
   ```

2. **Create and activate a virtual environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

4. **Set up environment variables**

   Create a `.env` file in the root directory:

   ```bash
   # Required: At least one LLM provider API key
   OPENAI_API_KEY=sk-your-openai-key-here
   # ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
   # AWS_REGION=us-east-1  # For AWS Bedrock

   # Optional: Configuration
   DEBUG=true
   DEFAULT_MODEL=gpt-4o
   MAX_CONTEXT_TOKENS=4000
   ```

5. **Install file processing dependencies (optional)**

   ```bash
   # For Excel/CSV file processing
   pip install pandas openpyxl

   # For additional file types
   pip install python-docx PyPDF2 pillow
   ```

6. **Start the development server**

   ```bash
   chmod +x scripts/start.sh
   ./scripts/start.sh
   ```

   Or manually:

   ```bash
   uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   ```
Once the server is running, test it:

```bash
# Basic health check
curl http://localhost:8000/api/health

# Chat test
curl -X POST http://localhost:8000/api/chat/stream \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello! Can you help me analyze data?",
    "show_thinking": true,
    "model": "gpt-4o"
  }'
```

The service will be available at:
- API: http://localhost:8000
- Interactive Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/api/health
- Debug Tools: http://localhost:8000/api/debug/inspect-request
Required:
- Python 3.11+
- At least one LLM provider API key (OpenAI, Anthropic, or AWS Bedrock)
Optional:
- File processing libraries (pandas, openpyxl) for Excel/CSV support
- Docker for containerized deployment
The core cognitive loop consists of four phases:
- **Understand** - Parse and analyze user input with context
- **Plan** - Develop a response strategy and identify required resources
- **Execute** - Generate the response using the appropriate LLM provider
- **Evaluate** - Assess response quality and refine if needed
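The four phases above can be sketched as a single refinement loop. This is an illustrative outline only; the actual implementation lives in `app/core/` and streams intermediate events, and the names below (`upee_loop`, `UPEEResult`) are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class UPEEResult:
    understanding: str
    plan: str
    response: str
    quality_ok: bool

def upee_loop(message: str, max_refinements: int = 1) -> UPEEResult:
    # Understand: parse and analyze the user input
    understanding = f"intent of: {message}"
    # Plan: decide on a response strategy
    plan = "answer directly with the default model"
    response, quality_ok = "", False
    for _ in range(max_refinements + 1):
        # Execute: generate a response (an LLM call in the real service)
        response = f"response to '{message}' using plan '{plan}'"
        # Evaluate: assess quality; loop again to refine if it falls short
        quality_ok = len(response) > 0
        if quality_ok:
            break
    return UPEEResult(understanding, plan, response, quality_ok)
```

The key design point is that Evaluate can send the loop back through Execute, which is what distinguishes UPEE from a one-shot prompt/response pipeline.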
- **Server-Sent Events (SSE)** - Real-time streaming chat responses
- **Multi-Provider LLM** - OpenAI, Anthropic Claude, and AWS Bedrock support
- **File Context** - Intelligent file processing and summarization
- **gRPC Integration** - Communication with downstream worker agents
- **Observability** - Structured logging, Prometheus metrics, AWS X-Ray tracing
- **Security** - JWT/HMAC authentication, mTLS for gRPC
- **Container Ready** - Docker support with an optimized image size
- **Cloud Native** - AWS Fargate deployment with auto-scaling
```http
POST /api/chat/stream
Content-Type: application/json

{
  "message": "Hello, how can you help me?",
  "show_thinking": true,
  "files": [...],
  "model": "gpt-4",
  "temperature": 0.7
}
```

Response: a Server-Sent Events stream with:
- `thinking` events (UPEE phase insights)
- `content` events (response chunks)
- `complete` event (metadata and stats)
- `done` event (stream termination)
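A client can consume this stream by splitting on blank lines and dispatching on the event name. The parser below is a minimal sketch: the event names come from the list above, but the payload fields (`phase`, `delta`) are assumptions about the data format:

```python
import json

def parse_sse(raw: str):
    """Yield (event, data) pairs from a raw SSE response body."""
    event, data_lines = None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event:
            # A blank line terminates one SSE event
            yield event, json.loads("\n".join(data_lines) or "null")
            event, data_lines = None, []

# Example body with hypothetical payload fields
raw = (
    'event: thinking\ndata: {"phase": "understand"}\n\n'
    'event: content\ndata: {"delta": "Hello"}\n\n'
    "event: done\ndata: {}\n\n"
)
events = list(parse_sse(raw))
```

In practice a real client would read the HTTP response incrementally (e.g. with `httpx` streaming) rather than buffering the whole body, but the framing logic is the same.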
`GET /api/health`

Returns service health status, including LLM provider availability.

`GET /api/chat/models`

Lists all available LLM models and their status.
```
paf-core-agent/
├── app/
│   ├── api/              # FastAPI routers
│   ├── core/             # UPEE logic
│   ├── llm_providers/    # Multi-provider abstraction
│   ├── grpc_clients/     # gRPC client implementations
│   ├── utils/            # Utilities (logging, auth, metrics)
│   ├── schemas.py        # Pydantic models
│   └── settings.py       # Configuration
├── tests/                # Test suites
├── scripts/              # Development scripts
├── proto/                # Protocol buffer definitions
└── requirements.txt      # Python dependencies
```
| Variable | Description | Required | Default |
|---|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | At least one provider | - |
| `ANTHROPIC_API_KEY` | Anthropic API key | At least one provider | - |
| `AWS_REGION` | AWS region for Bedrock | No | `us-east-1` |
| `DEBUG` | Enable debug mode | No | `false` |
| `MAX_CONTEXT_TOKENS` | Maximum context window (tokens) | No | `4000` |
| `DEFAULT_MODEL` | Default LLM model | No | `gpt-4o` |
```bash
# Run tests with coverage
pytest tests/ -v --cov=app
```

```bash
# Format code
black app/ tests/

# Sort imports
isort app/ tests/

# Lint
flake8 app/ tests/

# Type checking
mypy app/
```

Build and run with Docker:

```bash
# Build image
docker build -t paf-core-agent .

# Run container
docker run -p 8000:8000 \
  -e OPENAI_API_KEY=your_key_here \
  paf-core-agent
```

The service is designed for deployment on AWS Fargate with:
- Application Load Balancer for HTTP/HTTPS traffic
- Auto Scaling based on CPU/memory metrics
- ECS service with health checks
- CloudWatch logging and monitoring
See the `terraform/` directory for Infrastructure as Code examples.
For production deployment:
- Use AWS Secrets Manager for API keys
- Configure VPC with private subnets for gRPC traffic
- Set up CloudWatch dashboards for monitoring
- Enable AWS X-Ray for distributed tracing
The service exposes Prometheus metrics at `/metrics`:
- Request latency and throughput
- Token usage per provider
- UPEE phase timing
- Error rates and types
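As a rough illustration of the "token usage per provider" metric, a per-provider counter might accumulate prompt and completion tokens like this (names and labels here are illustrative, not the service's actual Prometheus metric names):

```python
from collections import defaultdict

# Per-provider running totals; a real exporter would use a
# prometheus_client Counter with a "provider" label instead.
token_usage: dict[str, int] = defaultdict(int)

def record_tokens(provider: str, prompt_tokens: int, completion_tokens: int) -> None:
    """Accumulate total tokens consumed per provider."""
    token_usage[provider] += prompt_tokens + completion_tokens

record_tokens("openai", 120, 80)
record_tokens("anthropic", 50, 25)
```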
Structured JSON logs include:
- Request tracing with correlation IDs
- UPEE phase events
- LLM provider calls
- Performance metrics
- `/api/health` - Comprehensive health status
- `/api/health/live` - Liveness probe
- `/api/health/ready` - Readiness probe
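A readiness decision typically combines overall status with provider availability. The helper below is a hypothetical sketch; the payload fields (`status`, `providers`) are assumptions about the health response shape, not the service's documented schema:

```python
def is_ready(health: dict) -> bool:
    """Ready only if the service is up and at least one LLM provider is available."""
    providers = health.get("providers", {})
    return health.get("status") == "ok" and any(providers.values())

# Example payload with assumed field names
sample = {"status": "ok", "providers": {"openai": True, "anthropic": False}}
```

Separating liveness (process is up) from readiness (dependencies are usable) lets an orchestrator restart a hung container without pulling a merely degraded one out of the load balancer.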
Configure multiple providers for redundancy and cost optimization:
```bash
# Environment variables
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
AWS_REGION=us-east-1

# Default routing
DEFAULT_MODEL=gpt-3.5-turbo
```

- `MAX_CONCURRENT_REQUESTS=150` - Concurrent request limit
- `REQUEST_TIMEOUT=30` - Request timeout in seconds
- `MAX_CONTEXT_TOKENS=4000` - Context window size
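Redundancy across providers implies a routing rule: prefer the provider that owns the requested model, then fall back to whatever is available. The sketch below illustrates that idea; the model-to-provider mapping and fallback order are assumptions, not the service's actual routing logic:

```python
# Illustrative model ownership table (not the service's real registry)
PROVIDER_FOR_MODEL = {
    "gpt-4o": "openai",
    "gpt-3.5-turbo": "openai",
    "claude-3-5-sonnet": "anthropic",
}

def route(model: str, available: set[str],
          fallback_order=("openai", "anthropic", "bedrock")) -> str:
    """Pick the preferred provider for a model, falling back to any available one."""
    preferred = PROVIDER_FOR_MODEL.get(model)
    if preferred in available:
        return preferred
    for provider in fallback_order:
        if provider in available:
            return provider
    raise RuntimeError("No LLM provider available")
```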
- Authentication: HMAC signatures or JWT tokens
- Transport: HTTPS for client traffic, mTLS for gRPC
- Secrets: AWS Secrets Manager integration
- Network: VPC isolation for inter-service communication
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run quality checks
- Submit a pull request
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.
For questions and support:
- Create an issue in the repository
- Check the documentation at `/docs`
- Review the health status at `/api/health`
- ✅ **Core UPEE Loop** - Fully implemented with streaming support
- ✅ **Multi-Provider LLM** - OpenAI, Anthropic
- ✅ **File Processing** - Excel, CSV, and text file support with agentic processing
- ✅ **Memory Support** - Short-term conversation history
- ✅ **Streaming Chat** - Real-time Server-Sent Events
- ✅ **Debug Tools** - Request inspection and troubleshooting endpoints
- ✅ **Health Monitoring** - Comprehensive health checks and metrics

Status: ✅ **Production Ready** - Core functionality complete