WhatsApp WebRTC Bot

A real-time voice bot that integrates with WhatsApp Business API to handle voice calls using WebRTC technology. Users can call your WhatsApp Business number and have natural conversations with an AI-powered bot.

📋 Table of Contents

Overview
Architecture
Prerequisites
Installation
Configuration
Running the Server
API Endpoints
Environment Variables
Troubleshooting
MCP Integration
Development
License

Overview

This project enables real-time voice conversations with WhatsApp users through WebRTC technology. The bot handles incoming WhatsApp calls, processes user speech using STT, processes queries with an LLM, and responds using TTS. The system is built on the Pipecat framework for real-time voice and multimodal AI agents.

Key Features

Real-time voice conversations with WhatsApp users
WebRTC technology for voice communication
Integration with AI services for conversation handling
Support for STT (Speech-to-Text), LLM (Large Language Model), and TTS (Text-to-Speech)
MCP (Model Context Protocol) integration for extended tool access
Automatic greeting and order-taking capabilities
Stock checking and order placement functionality

Architecture

The application follows a modular architecture:

Server Layer: FastAPI application that handles WhatsApp webhook events and manages WebRTC connections
Transport Layer: SmallWebRTCTransport handles WebRTC connection establishment and management
AI Layer: Uses OpenAI-compatible LLM for conversation processing and MCP for tool integration
Audio Processing: Uses Deepgram STT, Cartesia TTS, and Silero VAD for voice activity detection

Components

server.py: Handles WhatsApp webhook events and manages WebRTC connections
bot.py: Implements the conversation pipeline using Pipecat
prompt.py: Contains the system and session instructions for the AI agent
env.example: Example environment variables file

Prerequisites

WhatsApp Business API Setup

Facebook Account: Create an account at facebook.com
Facebook Developer Account: Create an account at developers.facebook.com
WhatsApp Business App: Create a new WhatsApp Business API application
Phone Number: Add and verify a WhatsApp Business phone number
Business Verification: Complete business verification process (required for production only)
Webhook Configuration: Set up webhook endpoint for your application

Important Note: For production, make sure your WhatsApp Business account has access to this feature.

Find more details here:

Getting Started Guide

Voice Calling Documentation

Webhooks Setup

WhatsApp Business API Configuration

Enable Voice Calls

Your WhatsApp Business phone number must be configured to accept voice calls[2]:

For development, you'll be provided with a free test phone number valid for 90 days.

Go to your WhatsApp Business API dashboard in Meta Developer Console
Navigate to Configuration → Phone Numbers → Manage phone numbers
Select your phone number
In the Calls tab, enable "Allow voice calls" capability
Save the configuration

Configure Webhook

Set up your webhook endpoint in the Meta Developer Console[3]:

Go to WhatsApp → Configuration → Webhooks
Set callback URL: https://your-domain.com/
Set verify token: your_webhook_verification_token
- This token should match your WHATSAPP_WEBHOOK_VERIFICATION_TOKEN environment variable
Click "Verify and save"
In the webhook fields below, select: calls (required for voice call events)

Configure Access Token

Go to WhatsApp → API Setup
Click "Generate access token"
- Use this token for your WHATSAPP_TOKEN environment variable
Note your Phone Number ID - you'll need this for PHONE_NUMBER_ID configuration

Installation

Requirements

Python 3.10 or newer
uv package manager (https://docs.astral.sh/uv/)

Setup Commands

Clone the repository:
```
git clone <repository-url>
cd whatsapp
```
Install dependencies:
```
uv sync
```
Activate the virtual environment:
```
source .venv/bin/activate
```

Configuration

Environment Variables

Copy the example environment file:
```
cp env.example .env
```
Edit .env file and add your API keys and configuration values.

Required API Keys

OpenRouter API Key: Required for LLM service. Get it from OpenRouter.
- Used as OPENROUTER_API_KEY in environment
- Used with the google/gemini-2.0-flash-lite-001 model by default
Deepgram API Key: Required for Speech-to-Text service. Get it from Deepgram.
- Used as DEEPGRAM_API_KEY in environment
Cartesia API Key: Required for Text-to-Speech service. Get it from Cartesia.
- Used as CARTESIA_API_KEY in environment
WhatsApp Business API Token: Required for WhatsApp integration.
- Used as WHATSAPP_TOKEN in environment
WhatsApp Webhook Verification Token: Required for webhook verification.
- Used as WHATSAPP_WEBHOOK_VERIFICATION_TOKEN in environment
WhatsApp Phone Number ID: Your WhatsApp Business phone number ID.
- Used as WHATSAPP_PHONE_NUMBER_ID in environment

Optional: MCP (Model Context Protocol) Integration

To use MCP for extended tool access:

MCP HTTP URL: Optional. Provide an HTTP URL for your MCP server. The MCP_HTTP_URL environment variable should be set to your MCP server URL.

Running the Server

Start the Server

python server.py

The server will start and listen for incoming WhatsApp webhook events.

By default, the server will run on localhost:7860. You can specify a different host and port:

python server.py --host 0.0.0.0 --port 8080

Add -v flag for verbose logging:

python server.py -v

Connect Using WhatsApp

Find your WhatsApp test number in the Meta Developer Console
Call the number from your WhatsApp app
The bot should answer and engage in conversation

API Endpoints

GET `/`

Handles WhatsApp webhook verification requests from Meta.

Description: Used during webhook setup to verify the endpoint
Parameters: Verification token and challenge as query parameters
Response: Challenge string for verification

POST `/`

Handles incoming WhatsApp webhook events.

Description: Processes incoming WhatsApp messages and call events
Body: WhatsApp webhook request payload
Response: Success confirmation with processing status

Environment Variables

Variable	Required	Description
`OPENROUTER_API_KEY`	Yes	API key for OpenRouter
`DEEPGRAM_API_KEY`	Yes	API key for Deepgram STT service
`CARTESIA_API_KEY`	Yes	API key for Cartesia TTS service
`WHATSAPP_TOKEN`	Yes	WhatsApp Business API access token
`WHATSAPP_WEBHOOK_VERIFICATION_TOKEN`	Yes	Token for webhook verification
`WHATSAPP_PHONE_NUMBER_ID`	Yes	WhatsApp Business phone number ID
`MCP_HTTP_URL`	No	Optional HTTP URL for MCP server to add tools to the AI agent

Troubleshooting

Common Issues

Verify the WHATSAPP_WEBHOOK_VERIFICATION_TOKEN matches the one in Meta Developer Console
Ensure your webhook URL is publicly accessible (consider using ngrok for local development)

Bot Not Responding
- Check that all API keys are correctly configured in the .env file
- Verify that voice calling is enabled for your WhatsApp Business number
- Ensure your business account is verified for production use
Dependency Installation Issues
- Ensure you have uv package manager installed
- Run uv sync to install all dependencies
- Activate the virtual environment with source .venv/bin/activate
WebRTC Connection Issues
- Check that your server can handle WebSocket connections
- Verify that your domain is accessible via HTTPS (required for WebRTC)

Debugging

Add the -v flag when running the server for more detailed logs:

python server.py -v

This will enable TRACE-level logging which helps identify issues with the connection flow.

MCP Integration

This bot supports MCP (Model Context Protocol) for extended tool access. If the MCP_HTTP_URL environment variable is set, the bot will use the MCP server to access custom tools.

MCP allows the bot to access custom tools and services that extend its capabilities. The current implementation includes:

doolally_knowledge_base: For answering questions about menu items, timings, location, or FAQs
check_menu_stock: For verifying if a specific item is in stock and confirming the latest price
place_order: For recording a confirmed order in the Orders tab

Development

Project Structure

whatsapp/
├── bot.py              # Conversation pipeline implementation
├── server.py           # FastAPI server for handling webhooks
├── prompt.py           # System and session instructions for the AI agent
├── env.example         # Example environment variables
├── pyproject.toml      # Project dependencies and configuration
├── README.md           # This file
└── ...

Making Changes

Make sure you're in the project directory
Activate the virtual environment:
```
source .venv/bin/activate
```
Make your changes
Test the changes by running:
```
python server.py
```

Pipecat Framework

This project uses the Pipecat framework for building real-time voice agents. The pipeline includes:

Input transport (WebRTC connection)
STT (Speech-to-Text)
LLM (Large Language Model)
TTS (Text-to-Speech)
Output transport (WebRTC connection)

Documentation References

License

This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.

Notes

Voice calling feature requires WhatsApp Business API access
Test numbers are valid for 90 days in development mode
Production deployment requires business verification
The bot is configured as "Reva" for Doolally Taproom with specific conversation flows and prompts

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
README.md		README.md
bot.py		bot.py
env.example		env.example
n8n-mcp.json		n8n-mcp.json
prompt.py		prompt.py
pyproject.toml		pyproject.toml
server.py		server.py
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

WhatsApp WebRTC Bot

📋 Table of Contents

Overview

Key Features

Architecture

Components

Prerequisites

WhatsApp Business API Setup

WhatsApp Business API Configuration

Enable Voice Calls

Configure Webhook

Configure Access Token

Installation

Requirements

Setup Commands

Configuration

Environment Variables

Required API Keys

Optional: MCP (Model Context Protocol) Integration

Running the Server

Start the Server

Connect Using WhatsApp

API Endpoints

GET /

POST /

Environment Variables

Troubleshooting

Common Issues

Debugging

MCP Integration

Development

Project Structure

Making Changes

Pipecat Framework

Documentation References

License

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

GET `/`

POST `/`

Packages