An intelligent task management agent deployed on Google Cloud Run with a WhatsApp interface, powered by LangGraph and GPT-4o-mini. Features a Plan-Execute architecture for complex multi-step requests, natural language date parsing, multi-user support, and Google Calendar integration. Built to demonstrate production-ready Agentic AI engineering skills.
Live Service: https://ai-task-agent-kbimuakj2a-uc.a.run.app
Try the WhatsApp bot now!
- Text: +1 (415) 523-8886
- Send: `join [your-sandbox-code]` (get the code from the Twilio console)
- Try: `remind me to buy milk tomorrow at 2pm`
Service Status: ✅ Live on Google Cloud Run (us-central1)
- 🧠 Plan-Execute Architecture - Agent breaks down complex requests into multi-step plans (NEW!)
- 🚀 Production Deployment - Fully deployed on Google Cloud Run with HTTPS endpoints
- 💬 WhatsApp Interface - Natural conversational UI via Twilio WhatsApp API
- 🔄 Advanced Agent Patterns - ReAct loop with planning, reflection, and state management
- 🗄️ Cloud-Native Storage - SQLite databases synced to Cloud Storage
- 🌍 Multi-User Support - Isolated task lists per user with phone number hashing
- ⏰ Smart Date Parsing - "tomorrow at 2pm", "next Friday", "in 3 hours"
- 🔒 Production Security - Webhook signature verification, rate limiting (10 msg/min)
- 📊 Observability - LangSmith tracing for debugging and monitoring
┌─────────────┐      ┌──────────────────────┐      ┌──────────────────┐
│  WhatsApp   │─────▶│ Cloud Run            │─────▶│ Cloud Storage    │
│  (Twilio)   │◀─────│ FastAPI + LangGraph  │      │ (SQLite DBs)     │
└─────────────┘      └──────────────────────┘      └──────────────────┘
                                │
                                ▼
                         ┌──────────────┐
                         │ GPT-4o-mini  │
                         │   + Tools    │
                         └──────────────┘
                                │
                       ┌────────┴────────┐
                       ▼                 ▼
                 ┌───────────┐     ┌──────────┐
                 │   Redis   │     │  Google  │
                 │ (Limits)  │     │ Calendar │
                 └───────────┘     └──────────┘
Data Flow:
- User sends WhatsApp message → Twilio webhook
- Cloud Run receives POST → verifies signature → sends ACK
- LangGraph agent processes message → calls tools
- Tools interact with database/calendar
- Response sent back via Twilio Messages API
- Databases synced to Cloud Storage on shutdown
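The "verifies signature" step in the data flow can be sketched with the standard library alone. The code follows Twilio's documented scheme for form-encoded webhooks (HMAC-SHA1 over the request URL plus the sorted parameters, Base64-encoded, compared against the `X-Twilio-Signature` header). The function names are hypothetical, and the project may well use Twilio's own helper library instead.

```python
import base64
import hashlib
import hmac
from typing import Dict


def compute_twilio_signature(auth_token: str, url: str, params: Dict[str, str]) -> str:
    # Start from the full request URL, then append each POST parameter name
    # and value, sorted by name, with no separators.
    payload = url + "".join(key + params[key] for key in sorted(params))
    # HMAC-SHA1 keyed with the account's auth token, then Base64-encode.
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(auth_token: str, url: str, params: Dict[str, str], header_signature: str) -> bool:
    expected = compute_twilio_signature(auth_token, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, header_signature)
```

A webhook handler would reject the request (for example with HTTP 403) when `is_valid_twilio_request` returns `False`, before any agent work starts.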
| Component | Technology | Purpose |
|---|---|---|
| Agent Framework | LangGraph | State management, tool orchestration, checkpointing |
| LLM | GPT-4o-mini | Natural language understanding, tool selection |
| Backend | FastAPI | Async webhook endpoints, background processing |
| Database | SQLite + Cloud Storage | Task persistence, conversation memory |
| Messaging | Twilio WhatsApp API | User interface, webhook integration |
| Deployment | Google Cloud Run | Serverless container hosting, auto-scaling |
| Rate Limiting | Redis Cloud | 10 messages/min per user |
| Observability | LangSmith | Agent tracing, debugging, performance monitoring |
| CI/CD | GitHub Actions | Automated testing (planned) |
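To make the 10 messages/min limit concrete, here is a minimal in-memory sliding-window limiter. It is a stand-in for illustration: the deployed service uses Redis Cloud, and the class name, the sliding-window choice, and the injectable `now` argument are assumptions rather than the project's actual implementation.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_messages` per user within any `window_seconds` span."""

    def __init__(self, max_messages: int = 10, window_seconds: float = 60.0) -> None:
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the behavior can be tested without sleeping.
        now = time.monotonic() if now is None else now
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_messages:
            return False
        events.append(now)
        return True
```

With Redis, the same idea is usually expressed with a per-user key that expires after the window, so the limit holds across Cloud Run instances.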
git clone https://github.com/boemer00/my-agent.git
cd my-agent
pip install -r requirements.txt
Create .env file:
# Required
OPENAI_API_KEY=your_openai_key_here
# Optional - Observability
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_key_here
LANGCHAIN_PROJECT=my-todo-agent
# Optional - WhatsApp (for local webhook testing)
TWILIO_ACCOUNT_SID=ACxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_WHATSAPP_NUMBER=whatsapp:+14155238886
CLI mode (terminal interface):
python app.py
API mode (WhatsApp webhook):
uvicorn api.main:app --reload --port 8080
Expose local webhook (for Twilio testing):
ngrok http 8080
# Update Twilio webhook to: https://your-ngrok-url.ngrok.io/whatsapp/webhook
User: remind me to buy kombucha tomorrow at 2pm
Agent: ✓ Reminder set: 'buy kombucha' for Thursday, October 31, 2024 at 02:00 PM
User: show my tasks
Agent: Your tasks:
1. buy kombucha (Due: tomorrow at 2pm)
User: mark 1 as done
Agent: ✓ Marked task #1 as done: 'buy kombucha'
User: organize my tasks for this week
Agent: [Internal] Creating plan...
📋 Plan:
1. List all current tasks
2. Check which tasks have due dates
3. Prioritize tasks by deadline
4. Suggest a schedule for the week
Agent: Let me help organize your week. First, let me see what you have...
[Executes: list_tasks()]
I found 5 tasks:
- Project report (Due: Nov 4, 2pm) 🔴 URGENT
- Review PRs (Due: Nov 4, 4pm)
- Buy groceries (Due: Nov 5, 2pm)
- Call dentist (no deadline)
- Email team (no deadline)
Agent: [Reflection: Step 1 complete → Moving to step 2]
Agent: Now let me prioritize by urgency...
Here's your organized week:
**Monday (Nov 4):**
- 2pm: Finish project report ⚡
- 4pm: Review pull requests
**Tuesday (Nov 5):**
- 2pm: Buy groceries
- Evening: Call dentist, email team
You have 2 urgent tasks today! Would you like me to set reminders?
Key Difference: Complex requests trigger the Plan-Execute pattern, where the agent creates a multi-step plan and systematically works through it with reflection after each step.
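One small piece of the Plan-Execute pattern is turning the LLM's numbered plan text into a list of steps the agent can walk through. The sketch below shows one way to do that. The function name `parse_plan` and the regular expression are assumptions; the project's planner node may structure its output differently.

```python
import re
from typing import List

# Matches lines such as "1. List all current tasks" or "2) Check due dates".
_STEP_PATTERN = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def parse_plan(text: str) -> List[str]:
    """Extract the step descriptions from a numbered plan, ignoring other lines."""
    steps = []
    for line in text.splitlines():
        match = _STEP_PATTERN.match(line)
        if match:
            steps.append(match.group(2))
    return steps


plan = parse_plan(
    """📋 Plan:
1. List all current tasks
2. Check which tasks have due dates
3. Prioritize tasks by deadline
4. Suggest a schedule for the week
"""
)
```

The resulting list, together with an index for the current step, is enough state for a reflection step to decide whether to advance or finish.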
Test Coverage: 121 tests | 70% coverage | <4s runtime
# All tests
pytest
# With coverage
pytest --cov
# Specific categories
pytest tests/test_agent_flows.py # Integration tests
pytest tests/test_tools.py # Tool unit tests
pytest tests/test_database.py # Repository tests
pytest tests/test_date_parser.py # Date parsing tests
tests/
├── conftest.py # Shared fixtures, test configuration
├── test_agent_flows.py # End-to-end agent tests (8 tests)
├── test_database.py # Database/Repository tests (14 tests)
├── test_date_parser.py # Date utility tests (13 tests)
└── test_tools.py # Tool function tests (15 tests)
Key Testing Patterns:
- ✅ Mocked external APIs (Google Calendar, OpenAI) for fast tests
- ✅ Isolated test databases (in-memory SQLite)
- ✅ Time-freezing for predictable date parsing tests
- ✅ Pytest fixtures for setup/teardown
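The point of freezing time is that relative phrases like "tomorrow at 2pm" resolve to the same datetime on every run. The sketch below gets the same determinism by passing the clock in explicitly, which is a different technique from patching the system clock with a freezing library. `parse_when` is a deliberately tiny, hypothetical parser, not the project's `utils/date_parser.py`.

```python
import re
from datetime import datetime, timedelta
from typing import Optional


def parse_when(text: str, now: datetime) -> Optional[datetime]:
    """Resolve a couple of relative phrases against an explicit `now`."""
    text = text.lower().strip()

    match = re.fullmatch(r"in (\d+) hours?", text)
    if match:
        return now + timedelta(hours=int(match.group(1)))

    match = re.fullmatch(r"tomorrow at (\d{1,2})(am|pm)", text)
    if match:
        # 12am -> 0, 12pm -> 12, 2pm -> 14
        hour = int(match.group(1)) % 12 + (12 if match.group(2) == "pm" else 0)
        tomorrow = now + timedelta(days=1)
        return tomorrow.replace(hour=hour, minute=0, second=0, microsecond=0)

    return None  # phrase not understood


# A fixed reference time makes the assertions below independent of the wall clock.
FROZEN_NOW = datetime(2024, 10, 30, 9, 15)


def test_tomorrow_at_2pm() -> None:
    assert parse_when("tomorrow at 2pm", FROZEN_NOW) == datetime(2024, 10, 31, 14, 0)


def test_in_three_hours() -> None:
    assert parse_when("in 3 hours", FROZEN_NOW) == datetime(2024, 10, 30, 12, 15)
```

Injecting `now` keeps the parser a pure function; a freezing library is the better fit when the code under test calls `datetime.now()` internally.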
Current State: Google Calendar sync works locally with a single account
Goal: Each WhatsApp user syncs with their own Google Calendar
Right now, all users would share one Google Calendar (privacy issue). Production needs per-user OAuth where each person authorizes their own calendar.
1. OAuth Flow Integration (~2 hours)
- Add `/auth/google` endpoint to initiate user authorization
- Generate unique authorization URLs per user
- Handle OAuth callback and token exchange
- Send authorization link via WhatsApp on first reminder
2. Token Storage (~1 hour)
- Store user tokens in Cloud Storage: `gs://bucket/user_tokens/{user_id}_token.json`
- Implement token refresh logic with expiry handling
- Graceful degradation if user hasn't authorized
3. Secret Management (~1 hour)
- Move `credentials.json` to Google Secret Manager
- Configure Cloud Run to access secrets
- Remove credentials from container image
4. Calendar Service Updates (~2 hours)
- Modify `get_calendar_service(user_id)` to load user-specific tokens
- Update all calendar functions to accept `user_id`
- Add error handling for missing/expired tokens
5. UX Flow (~1 hour)
User: "remind me to call mom tomorrow"
Agent: "To sync with your Google Calendar, please authorize:
https://ai-task-agent-xxx.run.app/auth/google?user_id=abc123"
[User clicks, authorizes]
Agent: "✅ Calendar connected! Creating reminder..."
Timeline: 6-8 hours of focused development
Benefits: True multi-tenant support, production-ready OAuth, showcase architectural evolution
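As a rough sketch of the planned "unique authorization URLs per user" step, the code below builds a Google OAuth 2.0 authorization URL and ties it to a user through a random `state` value. The endpoint and query parameters are Google's standard ones; the function name, the in-memory `pending_states` mapping, the redirect URI, and the choice of the `calendar.events` scope are assumptions for illustration.

```python
import secrets
from typing import Dict
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
CALENDAR_SCOPE = "https://www.googleapis.com/auth/calendar.events"  # assumed scope


def start_authorization(
    user_id: str,
    client_id: str,
    redirect_uri: str,
    pending_states: Dict[str, str],
) -> str:
    """Return a per-user consent URL and remember which user it belongs to."""
    # A random state value lets the callback identify the user and also
    # protects the flow against cross-site request forgery.
    state = secrets.token_urlsafe(16)
    pending_states[state] = user_id
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": CALENDAR_SCOPE,
        "access_type": "offline",  # ask for a refresh token
        "prompt": "consent",
        "state": state,
    }
    return GOOGLE_AUTH_ENDPOINT + "?" + urlencode(params)
```

On callback, the handler would look up `state` in `pending_states` to recover the user before exchanging the code for tokens. A shared store would be needed instead of a dict once more than one container instance is running.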
my-agent/
├── agent/
│ ├── graph.py # LangGraph workflow with Plan-Execute pattern
│ ├── nodes.py # Agent, planner, reflection, tools nodes
│ ├── state.py # State schema (messages, user_id, plan, plan_step)
│ └── prompts.py # System prompts
├── api/
│ ├── main.py # FastAPI app entry point
│ ├── routes/
│ │ ├── whatsapp.py # Webhook endpoints
│ │ └── health.py # Health check
│ ├── services/
│ │ └── message_handler.py # Async message processing
│ └── schemas/
│ └── whatsapp.py # Pydantic models
├── database/
│ ├── models.py # SQLite schema
│ ├── repository.py # Data access layer
│ └── cloud_storage.py # GCS sync utilities
├── tools/
│ ├── tasks.py # Task CRUD tools
│ ├── google_calendar.py # Calendar integration
│ └── __init__.py
├── utils/
│ └── date_parser.py # Natural language date parsing
├── config/
│ └── settings.py # Environment config
├── tests/ # 121 tests, 70% coverage (includes planning tests)
├── docs/ # Setup guides
├── app.py # CLI entry point
├── deploy.sh # Cloud Run deployment script
├── Dockerfile # Multi-stage build
└── requirements.txt # Dependencies
Agent Architecture & Advanced Patterns 🆕
- "Explain your Plan-Execute implementation" → Complex requests trigger planner node → LLM creates numbered plan → agent executes step-by-step → reflection node tracks progress → repeats until plan complete. Simple requests bypass planning for efficiency.
- "Why Plan-Execute over simple ReAct?" → Handles multi-step goals (e.g., "organize my week"), improves task decomposition, shows structured thinking. Demonstrates understanding of advanced agentic patterns beyond basic tool calling.
- "How does reflection work?" → After each tool execution, reflection node checks: (1) Did we complete current step? (2) Move to next step or finish? (3) Clear plan when done. Keeps agent focused on structured goals.
- "Show me the agent flow" → START → should_plan() router → [planner OR agent] → agent → tools → should_reflect() router → [reflection OR agent] → loop until END. Conditional routing based on request complexity and plan state.
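The routing and reflection described above can be sketched in plain Python, without LangGraph, to show the decision logic. The state fields mirror those listed for `state.py` (messages, user_id, plan, plan_step). The keyword heuristic in `should_plan` and the body of `reflect` are assumptions; the real router may, for example, ask the LLM to classify the request.

```python
from typing import List, TypedDict


class AgentState(TypedDict):
    messages: List[str]
    user_id: str
    plan: List[str]
    plan_step: int


# Hypothetical keywords used to guess that a request needs a multi-step plan.
COMPLEX_HINTS = ("organize", "plan", "schedule", "prioritize")


def should_plan(state: AgentState) -> str:
    """Route to the planner for complex requests, otherwise straight to the agent."""
    last_message = state["messages"][-1].lower()
    if not state["plan"] and any(hint in last_message for hint in COMPLEX_HINTS):
        return "planner"
    return "agent"


def should_reflect(state: AgentState) -> str:
    """After tool execution, reflect only while a plan is active."""
    return "reflection" if state["plan"] else "agent"


def reflect(state: AgentState) -> AgentState:
    """Advance to the next plan step, or clear the plan once the last step is done."""
    next_step = state["plan_step"] + 1
    if next_step >= len(state["plan"]):
        return {**state, "plan": [], "plan_step": 0}
    return {**state, "plan_step": next_step}
```

In LangGraph, functions like `should_plan` and `should_reflect` would typically be attached as conditional edges, with their return values mapped to node names.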
Architecture & Design
- "Why LangGraph over pure LLM calls?" → State persistence, checkpointing for conversation memory, built-in tool calling, conditional routing, Plan-Execute pattern support
- "Explain the ReAct pattern" → Reasoning (LLM thinks) → Acting (execute tools) → Observation (tool results) → repeat until done. Enhanced with planning for complex requests.
- "How does Cloud Run handle statelessness?" → Databases synced to Cloud Storage on startup/shutdown, ephemeral containers, checkpointer maintains conversation state
Production Considerations
- "How do you handle Cloud Run cold starts?" → First message gets "Working on it" acknowledgment within 100ms, then full response after agent processing
- "What's your security model?" → Webhook signature verification (HMAC-SHA1), rate limiting (10/min), API key in env vars, phone number hashing
- "How would you scale this?" → Horizontal scaling (Cloud Run auto-scales), database connection pooling, async processing, queue for high load
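The "acknowledge first, answer later" behavior can be illustrated with plain `asyncio`. This is a sketch under assumptions: `run_agent` stands in for the LangGraph call, `outbox` stands in for the Twilio Messages API, and the real service presumably uses FastAPI's own background processing rather than hand-managed tasks.

```python
import asyncio
from typing import List, Set


async def run_agent(message: str) -> str:
    # Stand-in for LLM and tool latency.
    await asyncio.sleep(0.01)
    return f"Done: {message}"


async def handle_webhook(message: str, outbox: List[str], background: Set[asyncio.Task]) -> str:
    async def process() -> None:
        # The full reply is delivered out of band once the agent finishes.
        outbox.append(await run_agent(message))

    task = asyncio.create_task(process())
    # Keep a reference so the task is not garbage-collected mid-flight.
    background.add(task)
    task.add_done_callback(background.discard)
    # Return immediately so the webhook caller gets a fast acknowledgment.
    return "Working on it"


async def demo():
    outbox: List[str] = []
    background: Set[asyncio.Task] = set()
    ack = await handle_webhook("buy milk", outbox, background)
    sent_at_ack_time = list(outbox)  # nothing has been sent yet
    await asyncio.gather(*background)
    return ack, sent_at_ack_time, outbox


ack, sent_at_ack_time, sent_after = asyncio.run(demo())
```

The acknowledgment returns before the agent has produced anything, which is what keeps the webhook response fast even when the model call is slow.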
OAuth & Calendar Integration
- "Why not implement per-user OAuth yet?" → MVP prioritization - focused on core agent + deployment first. Calendar works locally for demos. Phase 2 adds multi-tenant OAuth.
- "Explain OAuth 2.0 flow" → Authorization code flow: redirect to Google → user consents → callback with code → exchange for tokens → store refresh token
- "How do you handle token expiry?" → When an access token expires, the stored refresh token is used to obtain a new one automatically; if the refresh fails, the agent degrades gracefully
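The token expiry handling could be sketched as below. Everything here is hypothetical scaffolding: the `UserToken` dataclass, the `refresh` callable (which would wrap a call to Google's token endpoint), and the 60-second skew are assumptions chosen to show the control flow, including the graceful-degradation path.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional, Tuple


@dataclass
class UserToken:
    access_token: str
    refresh_token: Optional[str]
    expires_at: datetime


# Takes a refresh token, returns (new_access_token, new_expiry).
RefreshFn = Callable[[str], Tuple[str, datetime]]


def get_valid_access_token(
    token: UserToken,
    refresh: RefreshFn,
    now: datetime,
    skew: timedelta = timedelta(seconds=60),
) -> Optional[str]:
    """Return a usable access token, refreshing if needed, or None to degrade."""
    # Treat tokens that expire within `skew` as already expired.
    if now + skew < token.expires_at:
        return token.access_token
    if token.refresh_token is None:
        return None
    try:
        token.access_token, token.expires_at = refresh(token.refresh_token)
    except Exception:
        # Refresh failed (for example, access was revoked): signal the caller
        # to skip calendar sync instead of failing the whole request.
        return None
    return token.access_token
```

A `None` result would let the agent still save the task locally and ask the user to reauthorize, rather than returning an error.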
Technical Decisions
- "Why SQLite instead of PostgreSQL?" → Simple MVP, <10K users, Cloud Storage sync works well, easy migration path to Cloud SQL later
- "Why Twilio sandbox vs WhatsApp Business API?" → Faster iteration (5 min setup vs 2 week approval), free for demo, production would use Business API
- "How do you test agent behavior?" → Mock LLM responses for deterministic tests, integration tests with real LangGraph, LangSmith for production tracing
- Google Calendar Setup - OAuth 2.0 configuration guide
- Deployment Guide - Step-by-step Cloud Run deployment (if exists)
- Monitoring Guide - LangSmith setup and best practices
Built by Renato Boemer as a portfolio project to demonstrate AI engineering skills.
- GitHub: @boemer00
- LinkedIn: Renato Boemer
Technologies: LangGraph, LangChain, FastAPI, Google Cloud Run, Twilio, OpenAI
Questions? Check the LangGraph docs or open an issue!