
Pipecat Voice AI Customer Support Bot with Moss (Gemini Edition)

Real-time voice AI assistant powered by Google Gemini and Moss semantic search. Speak to it, get instant voice responses powered by your knowledge base.

🔧 Setup

1. Install uv (if not already installed)

# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add to PATH (restart shell or run this)
source $HOME/.local/bin/env

2. Install Dependencies

# Clone the repository (if not already done)
git clone <repository-url>
cd moss-gemini-pipecat-demo

# Create and activate virtual environment (any Python 3.10+; 3.13 shown here)
python3.13 -m venv .venv  # On Windows: py -3.13 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install all dependencies including inferedge-moss
pip install -r requirements.txt



3. Get API Keys

Required services:

  • Google AI Studio — Gemini API key (GOOGLE_API_KEY)
  • Deepgram — speech-to-text (DEEPGRAM_API_KEY)
  • Cartesia — text-to-speech (CARTESIA_API_KEY)
  • Moss — project ID, project key, and index name for semantic search

4. Configure .env

# Core configuration
# Google Gemini configuration
GOOGLE_API_KEY=your_google_api_key_here

DEEPGRAM_API_KEY=your_key_here
CARTESIA_API_KEY=your_key_here

# Moss configuration (semantic search)
MOSS_PROJECT_ID=your_project_id
MOSS_PROJECT_KEY=your_api_key
MOSS_INDEX_NAME=your_knowledge_base
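To catch configuration mistakes before launch, you can validate that every required key is present. This is a minimal sketch (not part of the repo); it assumes `python-dotenv` is available, which Pipecat projects commonly use to load `.env` files:

```python
import os

# All keys the bot expects, mirroring the .env template above.
REQUIRED_KEYS = [
    "GOOGLE_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
    "MOSS_PROJECT_ID",
    "MOSS_PROJECT_KEY",
    "MOSS_INDEX_NAME",
]


def missing_keys(env: dict) -> list:
    """Return the required keys that are absent or empty in `env`."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]


if __name__ == "__main__":
    from dotenv import load_dotenv  # assumes python-dotenv is installed

    load_dotenv()  # read .env into the process environment
    missing = missing_keys(dict(os.environ))
    if missing:
        raise SystemExit(f"Missing required env vars: {', '.join(missing)}")
    print("All required keys present")
```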

5. Run

# Launch the Gemini assistant
uv run python bot-gemini.py

# After the bot is running, open the local WebRTC client
open http://localhost:8080/client

🧰 Useful Bash Commands

Environment & Dependencies

# Create a virtual environment (Python 3.10+)
python3.10 -m venv .venv  # On Windows: py -3.10 -m venv .venv

# Activate it (macOS/Linux)
source .venv/bin/activate

# Activate it (Windows PowerShell)
.venv\Scripts\Activate.ps1

# Copy the example environment file and fill in secrets
cp .env.template .env

# Recreate the venv with Python 3.10 (optional reset)
rm -rf .venv
python3.10 -m venv .venv  # adjust path if python3.10 lives elsewhere
source .venv/bin/activate
python --version    # should report Python 3.10.x

# Use the venv-managed interpreter for subsequent commands
python -m pip install --upgrade pip

# Install Python requirements
pip install -r requirements.txt

# Sync with uv (installs and locks dependencies)
uv sync

Running the Bot Locally

# Run directly with Python once the venv is active
python bot-gemini.py

# Or use uv to handle the interpreter automatically
uv run python bot-gemini.py

# Launch the local WebRTC client after the bot is running
open http://localhost:8080/client

Docker Workflow

# Build and load a linux/amd64 image (required while inferedge-moss ships amd64 wheels only)
docker buildx build --platform linux/amd64 -t moss-gemini-pipecat-bot --load .

# Rebuild from scratch without cache (still targeting linux/amd64)
docker buildx build --platform linux/amd64 -t moss-gemini-pipecat-bot --load --no-cache .

# Run the container (add --platform on Apple Silicon to enable Rosetta translation)
docker run --rm --platform linux/amd64 --name moss-gemini-bot -p 7860:7860 -e UVICORN_HOST=0.0.0.0 --env-file .env moss-gemini-pipecat-bot

> ℹ️ The current Moss SDK publishes binary wheels for linux/amd64 only. Building for linux/arm64 will fail until a matching `inferedge-moss-core` wheel is released. Use `--platform linux/amd64` whenever you build or run Docker images on Apple Silicon.

Testing & Linting

# Run the unit test suite
pytest

# (Optional) lint with Ruff if you installed dev dependencies
uv run ruff check .

Deployment

# Deploy to Google Cloud Run (builds, pushes, deploys)
./deploy.sh YOUR_GCP_PROJECT_ID

# Update Cloud Run env vars after deployment (example)
gcloud run services update moss-gemini-pipecat-bot \
    --region=us-central1 \
    --set-env-vars="DEEPGRAM_API_KEY=...,CARTESIA_API_KEY=...,GOOGLE_API_KEY=..."

🎤 What It Does

Voice Pipeline: Speech → Deepgram STT → Moss Context Retrieval → Gemini (Google Generative AI) → Cartesia TTS → Voice Response

  • Listens to customer questions via microphone
  • Searches your knowledge base using semantic similarity (not keywords)
  • Generates contextual AI responses with Gemini using retrieved context
  • Responds with natural speech synthesis
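The retrieval step above can be sketched as a prompt-building function. This is a hypothetical illustration, not the repo's actual code: `build_prompt` is a made-up name, and the snippets would come from a Moss search call whose real API lives in `moss_context_retriever_gemini.py`:

```python
def build_prompt(question: str, snippets: list) -> str:
    """Combine knowledge-base snippets retrieved via semantic search
    with the customer's question, for the Gemini generation step."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer the customer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Grounding the model in retrieved snippets like this is what keeps answers tied to your knowledge base rather than the model's general training data.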

🤖 Available Bots

bot-gemini.py - Google Gemini powered assistant (primary entry point)

The bot combines Moss semantic search with the voice pipeline described above.

🔍 Troubleshooting

Command not found: source .venv/bin/activate first
401 errors: Check API keys in .env
No audio: Allow browser microphone permissions
FFmpeg warning: Safe to ignore

📁 Structure

├── bot-gemini.py                               # Main app (only file you need to run!)
├── moss_context_retriever_gemini.py            # Moss semantic search integration for Gemini
├── .env                                        # Your API keys
└── pyproject.toml                              # Dependencies managed by uv

Tech Stack: Pipecat + Deepgram + Moss + Google Gemini + Cartesia + WebRTC


Built with Pipecat AI Framework • Ready for Moss integration