A real-time voice AI agent that handles inbound phone calls for a restaurant. Callers are greeted by Aria, an AI assistant that takes food and drink orders and answers common questions about the restaurant.
When a customer calls the Twilio phone number, the system:
- Receives the call via a Twilio webhook and returns TwiML that opens a media stream
- Streams the caller's audio (mulaw 8kHz) to Deepgram for real-time speech-to-text
- Sends each finalised transcript to OpenAI GPT-4o-mini to generate a response
- Converts the response text to speech via Rime.ai TTS
- Transcodes the audio to mulaw 8kHz and streams it back to the caller through Twilio
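The webhook step in the pipeline above can be sketched as a small helper that builds the Connect/Stream TwiML by hand (the real handler may use the twilio helper library instead; the URL below is a placeholder):

```python
def incoming_call_twiml(stream_url: str) -> str:
    """Build TwiML telling Twilio to open a bidirectional media stream.

    `stream_url` must be the publicly reachable wss:// URL of the
    /twillio/media-stream WebSocket endpoint (e.g. via ngrok).
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response>"
        "<Connect>"
        f'<Stream url="{stream_url}" />'
        "</Connect>"
        "</Response>"
    )

# The POST /twillio/incoming-call handler would return this body
# with media type application/xml.
twiml = incoming_call_twiml("wss://abc123.ngrok.io/twillio/media-stream")
```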
When the customer confirms their order, the LLM emits a structured ORDER_COMPLETE signal which the agent parses and logs.
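The exact ORDER_COMPLETE format isn't shown here; a minimal parsing sketch, assuming the model appends the signal followed by a JSON payload at the end of its reply:

```python
import json
import re

# Assumed signal shape: the reply ends with `ORDER_COMPLETE {...json...}`
ORDER_RE = re.compile(r"ORDER_COMPLETE\s*(\{.*\})", re.DOTALL)

def extract_order(reply: str):
    """Split an LLM reply into spoken text and an optional order payload."""
    m = ORDER_RE.search(reply)
    if not m:
        return reply, None          # normal conversational turn
    spoken = reply[:m.start()].strip()
    order = json.loads(m.group(1))  # structured order for logging
    return spoken, order
```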
Inbound call
|
Twilio
| webhook POST /twillio/incoming-call
| <-- TwiML: Connect <Stream url="wss://.../twillio/media-stream" />
|
WebSocket /twillio/media-stream (FastAPI)
|
|-- audio frames (mulaw 8kHz) --> Deepgram WebSocket (STT)
| |
| final transcript
| |
| ConversationAgent (OpenAI gpt-4o-mini)
| |
| response text
| |
| Rime.ai TTS API
| |
| WAV --> miniaudio resample --> audioop mulaw
| |
|<------- mulaw 8kHz audio chunks ------+
|
Twilio plays audio to caller
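The transcode step above relies on miniaudio for resampling and audioop for mu-law encoding. For reference, the per-sample G.711 mu-law encoding that audioop-lts performs looks roughly like this (pure-Python sketch, not the project's actual code path):

```python
def lin2ulaw(sample: int) -> int:
    """Encode one 16-bit signed PCM sample as 8-bit G.711 mu-law."""
    BIAS = 0x84   # standard mu-law bias
    CLIP = 32635  # clamp to avoid overflow after biasing
    sign = 0x80 if sample < 0 else 0
    if sample < 0:
        sample = -sample
    sample = min(sample, CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (sample & mask):
        mask >>= 1
        exponent -= 1
    mantissa = (sample >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored inverted
    return ~(sign | (exponent << 4) | mantissa) & 0xFF
```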
| Path | Responsibility |
|---|---|
| `src/api.py` | FastAPI app entry point, registers router and middleware |
| `src/routers/twillio.py` | Twilio webhook + WebSocket handler, orchestrates the full call pipeline |
| `src/core/agent.py` | `ConversationAgent`, which wraps OpenAI chat, tracks conversation history and order state |
| `src/core/restaurant.py` | Menu, FAQ, system prompt builder, and `Order` / `OrderItem` data classes |
| `src/services/tts.py` | Calls Rime.ai TTS, decodes and resamples audio to mulaw 8kHz for Twilio |
| `src/core/settings.py` | Loads API keys and config from environment variables |
| `src/middleware/logging.py` | Request logging middleware and uvicorn logger setup |
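The shape of `ConversationAgent` in `src/core/agent.py` might look like the following sketch; the method names are assumptions, and the real class also makes the OpenAI chat completion call:

```python
class ConversationAgent:
    """Sketch: tracks history and builds the message list sent to gpt-4o-mini."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.history: list[dict] = []

    def _messages(self, user_text: str) -> list[dict]:
        # Full prompt: system persona + prior turns + the new user utterance
        return (
            [{"role": "system", "content": self.system_prompt}]
            + self.history
            + [{"role": "user", "content": user_text}]
        )

    def record_turn(self, user_text: str, reply: str) -> None:
        # Persist both sides of the exchange for the next completion call
        self.history.append({"role": "user", "content": user_text})
        self.history.append({"role": "assistant", "content": reply})
```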
| Package | Purpose |
|---|---|
| `fastapi` | Web framework |
| `uvicorn[standard]` | ASGI server |
| `websockets` | WebSocket client for Deepgram |
| `deepgram-sdk` | Deepgram STT (Nova-2 model) |
| `openai` | OpenAI chat completions (gpt-4o-mini) |
| `httpx` | Async HTTP client for the Rime TTS API |
| `miniaudio` | Audio decoding and resampling |
| `audioop-lts` | Linear16 PCM to mulaw encoding |
| `twilio` | Twilio helper library |
| `python-dotenv` | Load environment variables from `.env` |
| `loguru` | Structured logging |
| `certifi` | SSL certificate bundle |
- Python 3.14+
- uv (package manager)
- A publicly reachable URL for Twilio webhooks (use ngrok or similar during development)
uv sync

Create a .env file in the project root:
RIME_API_KEY=your_rime_api_key
OPENAI_AUTH=your_openai_api_key
DEEPGRAM_AUTH=your_deepgram_api_key
TWILLIO_AUTH=your_twilio_auth_token
TWILLIO_ACCOUNT_SID=your_twilio_account_sid
# Optional — post call data (transcript + order) to this URL when a call ends.
# Use the built-in receiver while developing:
WEBHOOK_URL=http://localhost:8000/webhook/call-complete
# Optional — path to a JSON config file defining the agent persona, menu, and FAQ.
# Defaults to the built-in Ristorante Bella config if not set.
AGENT_CONFIG_PATH=agent_config.json

Start the server:

cd src
python api.py

The server starts on http://0.0.0.0:8000.
ngrok http 8000

Copy the HTTPS forwarding URL (e.g. https://abc123.ngrok.io).
In your Twilio Console, set the Voice webhook for your phone number to:
https://<your-ngrok-url>/twillio/incoming-call
HTTP method: POST (or GET).
| Method | Path | Description |
|---|---|---|
| GET/POST | `/twillio/incoming-call` | Twilio voice webhook; returns TwiML to open a media stream |
| WebSocket | `/twillio/media-stream` | Bidirectional audio stream between Twilio and the agent |
| GET | `/health` | Health check; returns `{"message": "OK"}` |
| Variable | Description |
|---|---|
| `RIME_API_KEY` | Rime.ai API key for text-to-speech |
| `OPENAI_AUTH` | OpenAI API key for gpt-4o-mini |
| `DEEPGRAM_AUTH` | Deepgram API key for speech-to-text |
| `TWILLIO_AUTH` | Twilio auth token |
| `TWILLIO_ACCOUNT_SID` | Twilio account SID |
| `WEBHOOK_URL` | URL to POST call data to after each call (optional) |
| `AGENT_CONFIG_PATH` | Path to a JSON agent config file (optional, see agent_config.json) |
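A fail-fast loader for these variables could look like the following sketch (the function and structure are assumptions; only the variable names come from the table above):

```python
import os

REQUIRED = ("RIME_API_KEY", "OPENAI_AUTH", "DEEPGRAM_AUTH",
            "TWILLIO_AUTH", "TWILLIO_ACCOUNT_SID")
OPTIONAL = ("WEBHOOK_URL", "AGENT_CONFIG_PATH")

def load_settings(env=os.environ) -> dict:
    """Validate required vars up front so the server fails at startup,
    not mid-call."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Lower-cased keys; optional vars default to None
    return {k.lower(): env.get(k) for k in REQUIRED + OPTIONAL}
```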