A fully functional conversational AI system for restaurant phone ordering
This project demonstrates an end-to-end voice AI ordering system for restaurants. Customers call a phone number and talk naturally with an AI agent to browse the menu, place orders, and simulate payment, all through voice.
- Twilio Voice Integration - Handle incoming phone calls with Twilio Programmable Voice
- Speech-to-Text - Transcribe customer speech using OpenAI Whisper
- LLM Processing - Natural language understanding with Mistral/Qwen2-Audio via Hugging Face
- RAG Menu Search - Semantic search over menu items using FAISS vector database
- Text-to-Speech - Generate voice responses with gTTS or ElevenLabs
- Simulated POS/Payments - Mock Stripe integration and order management
- Multi-turn Conversations - Session state management with Redis/in-memory storage
- Analytics & Logging - Call transcripts, order analytics, and error tracking
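The RAG menu search feature relies on FAISS and learned embeddings. As a dependency-free illustration of the retrieval idea only, the sketch below swaps in bag-of-words vectors and brute-force cosine similarity. The menu entries and the `embed`, `cosine`, and `search_menu` helpers are hypothetical, not taken from `menu.json` or `utils/rag.py`.

```python
import math
import re
from collections import Counter

# Hypothetical menu entries for illustration (not the contents of menu.json).
MENU = [
    {"name": "Margherita Pizza", "description": "tomato, mozzarella and basil pizza"},
    {"name": "Pepperoni Pizza", "description": "classic pizza with pepperoni and cheese"},
    {"name": "Garlic Bread", "description": "toasted bread with garlic butter"},
]


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts instead of a neural encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_menu(query: str, top_k: int = 2) -> list[str]:
    """Return the names of the top_k menu items most similar to the query."""
    q = embed(query)
    scored = [
        (cosine(q, embed(f"{item['name']} {item['description']}")), item["name"])
        for item in MENU
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop items with no word overlap at all.
    return [name for score, name in scored[:top_k] if score > 0]
```

A real FAISS index would replace the brute-force loop with a nearest-neighbour lookup over dense vectors, but the query-embed-rank shape stays the same.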
    ┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
    │   Phone Call    │────▶│   Twilio Voice   │────▶│ FastAPI Server  │
    └─────────────────┘     └──────────────────┘     └─────────────────┘
                                                              │
                            ┌─────────────────────────────────┼─────────────────────────────────┐
                            │                                 ▼                                 │
                            │  ┌──────────────┐    ┌──────────────────┐    ┌────────────────┐   │
                            │  │   Whisper    │    │  LLM (Mistral)   │    │   FAISS RAG    │   │
                            │  │    (STT)     │───▶│   Intent + NLU   │───▶│  Menu Search   │   │
                            │  └──────────────┘    └──────────────────┘    └────────────────┘   │
                            │                                 │                                 │
                            │                                 ▼                                 │
                            │  ┌──────────────┐    ┌──────────────────┐    ┌────────────────┐   │
                            │  │ gTTS/11Labs  │    │ Session Manager  │    │   SQLite DB    │   │
                            │  │    (TTS)     │    │   (Redis/Mem)    │    │  Orders/Logs   │   │
                            │  └──────────────┘    └──────────────────┘    └────────────────┘   │
                            │                                                                   │
                            └───────────────────────────────────────────────────────────────────┘
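Read left to right, the diagram suggests that each caller utterance passes through speech-to-text, menu retrieval with the LLM, and text-to-speech. Below is a rough sketch of one such turn, with each stage injected as a plain callable. The `VoicePipeline` class and its field names are hypothetical stand-ins, not the project's actual modules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One conversational turn: audio in, audio out. All stages are assumptions."""

    transcribe: Callable[[bytes], str]         # STT stage (Whisper in the diagram)
    retrieve: Callable[[str], list[str]]       # menu search stage (FAISS RAG)
    generate: Callable[[str, list[str]], str]  # LLM stage (intent + reply)
    synthesize: Callable[[str], bytes]         # TTS stage (gTTS/ElevenLabs)

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        context = self.retrieve(text)
        reply = self.generate(text, context)
        return self.synthesize(reply)


# Trivial fake stages, only to show how data flows between them.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    retrieve=lambda text: ["Pepperoni Pizza"] if "pizza" in text else [],
    generate=lambda text, ctx: f"We have: {', '.join(ctx)}" if ctx else "Sorry, nothing found.",
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Injecting the stages keeps each one replaceable, which matches the document's options of gTTS or ElevenLabs for speech and Mistral or OpenAI for the LLM.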
- Python 3.10+
- Twilio Account (free trial available)
- ngrok for local development
- Hugging Face Account with API token
    git clone https://github.com/KOKOSde/voice-ai-ordering-agent.git
    cd voice-ai-ordering-agent

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

    pip install -r requirements.txt

    cp .env.example .env
    # Edit .env with your credentials

Required variables:

    TWILIO_ACCOUNT_SID=your_account_sid
    TWILIO_AUTH_TOKEN=your_auth_token
    HUGGINGFACE_TOKEN=your_hf_token

Start the server:

    uvicorn main:app --reload --host 0.0.0.0 --port 8000

In a new terminal:

    ngrok http 8000

Copy the ngrok URL (e.g., https://abc123.ngrok.io).
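A missing credential tends to surface only when the first call arrives, so it can help to check the required variables listed above at startup. This is a minimal sketch; the `missing_env_vars` helper is hypothetical and not part of the repository.

```python
import os
from typing import Mapping, Optional

# The three variables the setup steps list as required.
REQUIRED_VARS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "HUGGINGFACE_TOKEN")


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` before the server starts and aborting when the list is non-empty turns a confusing runtime failure into a clear message.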
- Go to Twilio Console
- Navigate to Phone Numbers → Manage → Active Numbers
- Click on your phone number
- Under Voice & Fax, set:
  - A CALL COMES IN: Webhook → `https://your-ngrok-url.ngrok.io/voice`
  - HTTP POST
- Save
Call your Twilio phone number and start ordering!
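When Twilio posts to the `/voice` webhook, it expects a TwiML (XML) document describing what to do with the call. The sketch below builds one with the standard library: it speaks a greeting and gathers the caller's speech. It assumes the gathered speech is posted on to `/voice/process`; the `build_greeting_twiml` helper and the fallback wording are hypothetical, and the project's `main.py` may build its response differently.

```python
import xml.etree.ElementTree as ET


def build_greeting_twiml(greeting: str, action_url: str = "/voice/process") -> str:
    """Build TwiML that speaks a greeting and gathers the caller's speech."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": action_url, "method": "POST", "speechTimeout": "auto"},
    )
    ET.SubElement(gather, "Say").text = greeting
    # If the caller stays silent, Twilio continues past <Gather> to this fallback.
    ET.SubElement(response, "Say").text = "Sorry, I didn't catch that. Goodbye."
    return '<?xml version="1.0" encoding="UTF-8"?>' + ET.tostring(response, encoding="unicode")
```

A FastAPI handler would return this string with the `application/xml` media type. Twilio's Python helper library offers a `VoiceResponse` builder that produces the same kind of document.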
    voice-ai-ordering-agent/
    ├── main.py                 # FastAPI application & Twilio webhooks
    ├── menu.json               # Restaurant menu database
    ├── prompts.py              # LLM prompt templates
    ├── requirements.txt        # Python dependencies
    ├── .env.example            # Environment variables template
    ├── .gitignore              # Git ignore patterns
    ├── Procfile                # Heroku deployment
    ├── utils/
    │   ├── __init__.py
    │   ├── transcription.py    # Whisper speech-to-text
    │   ├── rag.py              # FAISS vector search for menu
    │   ├── tts.py              # Text-to-speech (gTTS/ElevenLabs)
    │   ├── llm.py              # LLM processing (Mistral/OpenAI)
    │   ├── session.py          # Conversation state management
    │   ├── database.py         # SQLite order storage
    │   └── payment.py          # Simulated payment processing
    ├── tests/
    │   └── test_main.py        # Unit tests
    └── .github/
        └── workflows/
            └── ci.yml          # GitHub Actions CI
Here's a sample conversation flow:
    *Phone rings*

    AI: "Welcome to Bella's Italian Kitchen! I'm your AI ordering assistant.
        I can help you browse our menu and place an order.
        What would you like today?"

    Customer: "What pizzas do you have?"

    AI: "We have some great pizzas! Our most popular is the Margherita at $14.99,
        the classic Pepperoni at $16.99, and our chef's special Truffle Mushroom
        at $22.99. Would you like to hear more about any of these?"

    Customer: "I'll have a large pepperoni pizza"

    AI: "Great choice! I've added a large Pepperoni Pizza for $24.99 to your order.
        Would you like anything else? Maybe some garlic bread or a drink?"

    Customer: "Add garlic bread and that's it"

    AI: "Perfect! I've added Garlic Bread for $5.99. Let me confirm your order:
        - Large Pepperoni Pizza: $24.99
        - Garlic Bread: $5.99
        Your total is $30.98 plus tax. Is that correct?"

    Customer: "Yes, that's correct"

    AI: "Your order has been confirmed! Your order number is BIK-A7B3C2D1.
        Your total is $33.51 including tax. Thank you for ordering with
        Bella's Italian Kitchen! Have a great day!"
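The confirmation step in the conversation above comes down to summing item prices, adding tax, and issuing an order number. Here is a hedged sketch of that arithmetic using `Decimal` to avoid floating-point cents. The README does not state the tax rate behind the $33.51 total, so the 8% used here is a placeholder, and the `order_totals` and `new_order_number` helpers are hypothetical.

```python
import uuid
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def order_totals(prices: list[str], tax_rate: str) -> dict[str, Decimal]:
    """Sum item prices and add tax, rounding the tax to whole cents."""
    subtotal = sum((Decimal(p) for p in prices), Decimal("0"))
    tax = (subtotal * Decimal(tax_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return {"subtotal": subtotal, "tax": tax, "total": subtotal + tax}


def new_order_number(prefix: str = "BIK") -> str:
    """Assumed ID scheme: prefix plus 8 uppercase hex characters."""
    return f"{prefix}-{uuid.uuid4().hex[:8].upper()}"


# Prices from the sample conversation; 0.08 is a placeholder rate, not the project's.
totals = order_totals(["24.99", "5.99"], tax_rate="0.08")
```

With these inputs the subtotal matches the $30.98 quoted in the conversation; the tax and total depend entirely on the rate passed in.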
| Model | Description | Memory Required |
|---|---|---|
| `mistralai/Mistral-7B-Instruct-v0.2` | Default, good balance | ~16GB |
| `Qwen/Qwen2-Audio-7B-Instruct` | Audio-native model | ~16GB |
| `distilgpt2` | Local testing only | ~500MB |
| OpenAI API | Use `OPENAI_API_KEY` | Cloud-based |
| Engine | Quality | Cost |
|---|---|---|
| gTTS | Good | Free |
| ElevenLabs | Excellent | Paid API |
| pyttsx3 | Basic (offline) | Free |
| Backend | Use Case |
|---|---|
| In-memory | Development, single instance |
| Redis | Production, distributed |
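For the in-memory backend, multi-turn state can be as simple as a dictionary keyed by a per-call identifier (Twilio sends a `CallSid` with each webhook request). The sketch below adds idle expiry so abandoned calls do not accumulate. The `InMemorySessionStore` class and the session layout are assumptions, not the contents of `utils/session.py`.

```python
import time
from typing import Any, Optional


class InMemorySessionStore:
    """Per-call conversation state keyed by call ID, with idle expiry."""

    def __init__(self, ttl_seconds: float = 1800.0) -> None:
        self.ttl_seconds = ttl_seconds
        # call_id -> (last_seen_timestamp, session_data)
        self._sessions: dict[str, tuple[float, dict[str, Any]]] = {}

    def get(self, call_id: str, now: Optional[float] = None) -> dict[str, Any]:
        """Return the session for call_id, creating a fresh one if missing or expired."""
        now = time.monotonic() if now is None else now
        entry = self._sessions.get(call_id)
        if entry is None or now - entry[0] > self.ttl_seconds:
            entry = (now, {"history": [], "cart": []})
        # Refresh the last-seen timestamp on every access.
        self._sessions[call_id] = (now, entry[1])
        return entry[1]

    def end(self, call_id: str) -> None:
        """Drop the session when the call finishes."""
        self._sessions.pop(call_id, None)
```

This state lives in one process only, which is why the table recommends Redis once more than one server instance handles calls.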
    # Login to Heroku
    heroku login
    # Create app
    heroku create your-app-name
    # Set environment variables
    heroku config:set TWILIO_ACCOUNT_SID=xxx
    heroku config:set TWILIO_AUTH_TOKEN=xxx
    heroku config:set HUGGINGFACE_TOKEN=xxx
    # Deploy
    git push heroku main

Dockerfile:

    FROM python:3.10-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    EXPOSE 8000
    CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Both platforms auto-detect Python projects. Just connect your GitHub repo!
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Health check |
| `/voice` | POST | Twilio voice webhook |
| `/voice/process` | POST | Process speech input |
| `/sms` | POST | SMS webhook |
| `/menu` | GET | Get full menu JSON |
| `/order/{id}` | GET | Get order status |
| `/analytics` | GET | Call/order analytics |
    # Run all tests
    pytest
    # Run with coverage
    pytest --cov=. --cov-report=html
    # Run specific test file
    pytest tests/test_main.py -v

The system logs all interactions for analytics:
- Call duration and outcomes
- Popular menu items
- Order values and trends
- Customer satisfaction signals
Access these via the `/analytics` endpoint.
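Metrics such as popular menu items and order values can be derived from stored orders with a simple aggregation. The sketch below shows one way to do that; the order records are made-up illustrative values, and the `summarize` helper and record layout are assumptions rather than the schema used by `utils/database.py`.

```python
from collections import Counter
from statistics import mean

# Hypothetical order records, for illustration only.
orders = [
    {"items": ["Pepperoni Pizza", "Garlic Bread"], "total": 30.98},
    {"items": ["Margherita Pizza"], "total": 14.99},
    {"items": ["Pepperoni Pizza"], "total": 16.99},
]


def summarize(orders: list[dict]) -> dict:
    """Aggregate order count, average order value, and most-ordered items."""
    item_counts = Counter(item for order in orders for item in order["items"])
    return {
        "order_count": len(orders),
        "average_order_value": round(mean(o["total"] for o in orders), 2),
        "popular_items": item_counts.most_common(3),
    }
```

Note that `mean` raises an error on an empty list, so a real endpoint would need to handle the no-orders case.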
- Never commit `.env` files
- Use Twilio request validation in production
- Rate limit API endpoints
- Sanitize all user inputs
- Use HTTPS in production
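Twilio request validation, mentioned in the list above, works by comparing the `X-Twilio-Signature` header against a signature computed from the request. Below is a minimal standard-library sketch, assuming Twilio's documented scheme for form-encoded POST webhooks: HMAC-SHA1 over the full URL followed by each parameter name and value, sorted by name, keyed with the auth token and Base64-encoded. The function names are hypothetical; in practice the official Python helper library's `RequestValidator` is the safer choice.

```python
import base64
import hashlib
import hmac


def compute_twilio_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Sign the full URL plus POST params (name then value), sorted by name."""
    payload = url + "".join(f"{key}{params[key]}" for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: dict[str, str], header_signature: str
) -> bool:
    """Compare the expected signature to the header in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature)
```

The URL must be exactly the one Twilio called, including scheme and any query string, which matters behind proxies such as ngrok.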
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Twilio for voice/SMS APIs
- OpenAI Whisper for transcription
- Hugging Face for transformer models
- FastAPI for the web framework
- FAISS for vector search