📞🤖 A SIP enabled voice-powered AI assistant that answers phone calls, understands natural language, and performs actions like checking weather, setting timers, scheduling callbacks, and more.

CHA0S-CORP/general-disarray

📞⚡ General Disarray

🤖 SIP Enabled AI Agent

🤖 ROBO CODED: This project was made with AI and may not be 100% sane. But the code does work! 🎉

A voice-powered AI assistant that answers phone calls, understands natural language, and performs actions like checking weather, setting timers, scheduling callbacks, and more.

License: AGPL v3 Version Docker Python 3.11+ Runs on DGX Spark Docs

📖 Read the Documentation


✨ Features

| Feature | Description |
| --- | --- |
| 🎙️ Voice Conversations | Natural speech-to-text and text-to-speech powered by Whisper & Kokoro |
| 🤖 LLM Integration | Connects to OpenAI, vLLM, Ollama, LM Studio, and more |
| 🔧 Built-in Tools | Weather, timers, callbacks, date/time, calculator, jokes |
| 🔌 Plugin System | Easily add custom tools with Python |
| 🌐 REST API | Initiate outbound calls, execute tools, manage schedules |
| ⏰ Scheduled Calls | One-time or recurring calls (daily briefings, reminders) |
| 🔗 Webhooks | Trigger calls from Home Assistant, n8n, Grafana, and more |
| 🗣️ Custom Phrases | Customize greetings, goodbyes, and responses via JSON or env vars |
| 📊 Observability | Prometheus metrics, OpenTelemetry tracing, structured JSON logs |

💡 Use Cases

| Use Case | Example |
| --- | --- |
| ⏲️ Timers & Reminders | "Set a timer for 10 minutes" |
| 📞 Callbacks | "Call me back in an hour" |
| 🌤️ Weather Briefings | Scheduled morning weather calls |
| 📅 Appointment Reminders | Outbound calls with confirmation |
| 🚨 Alerts & Notifications | Webhook-triggered phone calls |
| 🏠 Smart Home | Voice control via phone |

🚀 Quick Example

Call the assistant and say:

🗣️ "What's the weather like?"

sequenceDiagram
    participant User as 👤 User
    participant Agent as 🤖 SIP Agent
    participant STT as 🎤 Speaches
    participant LLM as 🧠 LLM
    participant Tool as 🌤️ Weather Tool

    User->>Agent: "What's the weather like?"
    Agent->>STT: Audio stream
    STT-->>Agent: Transcribed text
    Agent->>LLM: User query + context
    LLM-->>Agent: [TOOL:WEATHER]
    Agent->>Tool: Execute
    Tool-->>Agent: Weather data
    Agent->>LLM: Tool result
    LLM-->>Agent: Natural response
    Agent->>STT: Text to speech
    STT-->>Agent: Audio
    Agent->>User: "At Storm Lake, it's 44°..."

Assistant responds:

🤖 "At Storm Lake, as of 9:30 pm, it's 44 degrees with foggy conditions. Wind is calm."
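
The flow above depends on the LLM replying with a tool marker such as `[TOOL:WEATHER]`, which the agent intercepts and executes before anything is spoken. A minimal sketch of that dispatch step follows, assuming only the marker format shown in the diagram. The names `extract_tool_call`, `handle_llm_output`, `TOOLS`, and the canned `fake_weather` function are hypothetical and are not taken from the project's code.

```python
import re
from typing import Callable, Dict, Optional

# Marker format copied from the diagram; the real parser may also accept
# parameters or other variants (assumption).
TOOL_MARKER = re.compile(r"\[TOOL:([A-Z_]+)\]")


def extract_tool_call(llm_output: str) -> Optional[str]:
    """Return the tool name if the LLM output contains a tool marker."""
    match = TOOL_MARKER.search(llm_output)
    return match.group(1) if match else None


def fake_weather() -> str:
    """Stand-in for the weather tool; returns canned data."""
    return "44 degrees, foggy, calm wind"


# Hypothetical registry mapping tool names to callables.
TOOLS: Dict[str, Callable[[], str]] = {"WEATHER": fake_weather}


def handle_llm_output(llm_output: str) -> str:
    """Run the requested tool, or pass plain text through unchanged."""
    tool_name = extract_tool_call(llm_output)
    if tool_name is None:
        return llm_output  # ordinary reply, ready for text-to-speech
    tool = TOOLS.get(tool_name)
    if tool is None:
        return f"Unknown tool: {tool_name}"
    return tool()  # in the diagram, this result is sent back to the LLM
```

In the diagram the tool result goes back to the LLM for a natural-language reply; the sketch stops at the tool result to stay self-contained.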


🏗️ Architecture

flowchart LR
    subgraph Caller
        Phone[📱 SIP Phone]
    end

    subgraph Agent["🤖 SIP AI Agent"]
        SIP[SIP Client]
        Audio[Audio Pipeline]
        Tools[Tool Manager]
        API[REST API]
    end

    subgraph Services
        LLM[🧠 LLM Server<br/>OpenAI / vLLM / Ollama]
        Speaches[🎤 Speaches<br/>STT + TTS]
    end

    subgraph Integrations
        HA[🏠 Home Assistant]
        N8N[🔄 n8n]
        Webhook[🔗 Webhooks]
    end

    Phone <-->|SIP/RTP| SIP
    SIP <--> Audio
    Audio <-->|Whisper| Speaches
    Audio <-->|Kokoro| Speaches
    Audio <--> Tools
    Tools <-->|OpenAI API| LLM

    API <--> Tools
    HA -->|HTTP| API
    N8N -->|HTTP| API
    Webhook -->|HTTP| API

🔗 Services & Integrations

| Service | Purpose | URL |
| --- | --- | --- |
| 🤖 SIP Agent | AI Voice Assistant API | localhost:8080 |
| 🎤 Speaches | STT/TTS (Whisper + Kokoro) | localhost:8001 |
| 🧠 vLLM | LLM Inference | localhost:8000 |
| 🔴 Redis | Call Queue & Cache | redis://localhost:6379 |
| 📊 Prometheus | Metrics Collection | localhost:9090 |
| 📈 Grafana | Dashboards | localhost:3000 |
| 📝 Loki | Log Aggregation | localhost:3100 |
| 🔍 Tempo | Distributed Tracing | localhost:3200 |
| 🔄 n8n | Workflow Automation | localhost:5678 |

🚀 Quick Start

Prerequisites

| Requirement | Description |
| --- | --- |
| 🐳 Docker | Docker and Docker Compose |
| 📞 SIP Server | FreePBX, Asterisk, 3CX, or any SIP PBX |

Installation

# Clone the repository
git clone https://github.com/your-org/sip-agent.git
cd sip-agent

# Configure environment
cp sip-agent/.env.example sip-agent/.env
nano sip-agent/.env

# Start services
docker compose up -d

# (Optional) Start services with Observability
docker compose -f ./docker-compose.yml -f docker-compose.observability.yml up -d

Verify Installation

curl http://localhost:8080/health | jq

Expected output:

{
  "status": "healthy",
  "sip_registered": true,
  "active_calls": 0
}
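
The same check can be scripted, for example to gate a deployment on SIP registration. This is a minimal sketch using only the standard library; it assumes the `/health` endpoint and the response fields shown above, and the helper names `is_ready` and `fetch_health` are illustrative.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8080/health"  # endpoint from the curl example


def is_ready(payload: dict) -> bool:
    """True when the agent reports healthy and is registered with the PBX."""
    return (
        payload.get("status") == "healthy"
        and payload.get("sip_registered") is True
    )


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> dict:
    """GET the health endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `is_ready(fetch_health())` should return `True` once the agent has registered with the SIP server.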

Make a Test Call

┌────────────────────────────────────────────────────────────┐
│ 📞 INCOMING CALL                                           │
├────────────────────────────────────────────────────────────┤
│ 🤖 "Hello! Welcome to the AI assistant. How can I help?"   │
│ 👤 "What's the weather like?"                              │
│ 🤖 "At Storm Lake, it's 44 degrees with foggy conditions." │
│ 👤 "Set a timer for 5 minutes"                             │
│ 🤖 "Timer set for 5 minutes!"                              │
│ 👤 "Goodbye"                                               │
│ 🤖 "Goodbye! Have a great day!"                            │
└────────────────────────────────────────────────────────────┘

⚙️ Configuration

Create a .env file with your settings:

# 📞 SIP Connection
SIP_USER=ai-assistant
SIP_PASSWORD=your-secure-password
SIP_DOMAIN=pbx.example.com

# 🎤 Speaches (STT + TTS)
SPEACHES_API_URL=http://speaches:8001

# 🧠 LLM Settings
LLM_BASE_URL=http://vllm:8000/v1
LLM_MODEL=openai-community/gpt2-xl

# 🌤️ Weather (Optional)
TEMPEST_STATION_ID=12345
TEMPEST_API_TOKEN=your-api-token

📖 See Configuration Reference for all options.
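
To illustrate how an application might consume these variables, here is a small sketch that reads them from the environment and fails fast when one is missing. It is not the project's `config.py`: the `Settings` class, the `load_settings` helper, and the choice of which variables count as required are assumptions. Only the weather settings are marked optional in the example above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed to be mandatory; the real config may supply defaults for some.
REQUIRED = (
    "SIP_USER", "SIP_PASSWORD", "SIP_DOMAIN",
    "SPEACHES_API_URL", "LLM_BASE_URL", "LLM_MODEL",
)


@dataclass(frozen=True)
class Settings:
    sip_user: str
    sip_password: str
    sip_domain: str
    speaches_api_url: str
    llm_base_url: str
    llm_model: str
    tempest_station_id: Optional[str] = None  # weather is optional
    tempest_api_token: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, raising if a required key is absent."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return Settings(
        sip_user=env["SIP_USER"],
        sip_password=env["SIP_PASSWORD"],
        sip_domain=env["SIP_DOMAIN"],
        speaches_api_url=env["SPEACHES_API_URL"],
        llm_base_url=env["LLM_BASE_URL"],
        llm_model=env["LLM_MODEL"],
        tempest_station_id=env.get("TEMPEST_STATION_ID"),
        tempest_api_token=env.get("TEMPEST_API_TOKEN"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test.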


🌐 API Examples

📞 Make an Outbound Call

curl -X POST http://localhost:8080/call \
  -H "Content-Type: application/json" \
  -d '{
    "extension": "5551234567",
    "message": "Hello! This is a reminder about your appointment tomorrow."
  }'

Response:

{
  "call_id": "out-1732945860-1",
  "status": "queued",
  "message": "Call initiated"
}
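
For callers that prefer Python over curl, the request can be built with the standard library. This sketch assumes the `/call` endpoint and JSON fields from the example above; `build_call_request` and `place_call` are hypothetical helper names.

```python
import json
import urllib.request


def build_call_request(
    extension: str,
    message: str,
    base_url: str = "http://localhost:8080",
) -> urllib.request.Request:
    """Build (but do not send) the POST /call request from the curl example."""
    body = json.dumps({"extension": extension, "message": message})
    return urllib.request.Request(
        url=f"{base_url}/call",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def place_call(extension: str, message: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_call_request(extension, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending makes it possible to inspect or log the payload before any call is placed.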

🌅 Morning Weather Briefing

Schedule a daily weather call at 7am:

curl -X POST http://localhost:8080/schedule \
  -H "Content-Type: application/json" \
  -d '{
    "extension": "5551234567",
    "tool": "WEATHER",
    "at_time": "07:00",
    "timezone": "America/Los_Angeles",
    "recurring": "daily",
    "prefix": "Good morning! Here is your weather update for today.",
    "suffix": "Have a great day!"
  }' | jq

Response:

{
  "schedule_id": "a1b2c3d4",
  "status": "scheduled",
  "scheduled_for": "2025-12-01T07:00:00-08:00",
  "recurring": "daily"
}
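
The `scheduled_for` value is the next occurrence of `at_time` in the requested timezone. The sketch below shows one way such a timestamp could be computed with the standard `zoneinfo` module. It is not the project's scheduler, and `next_daily_run` is a hypothetical name. It assumes IANA timezone data is available on the system and that any `now` passed in is timezone-aware.

```python
from datetime import datetime, time, timedelta
from typing import Optional
from zoneinfo import ZoneInfo


def next_daily_run(
    at_time: str, timezone: str, now: Optional[datetime] = None
) -> datetime:
    """Return the next occurrence of an HH:MM wall-clock time in a zone."""
    zone = ZoneInfo(timezone)
    now = now.astimezone(zone) if now else datetime.now(zone)
    hour, minute = (int(part) for part in at_time.split(":"))
    candidate = datetime.combine(now.date(), time(hour, minute), tzinfo=zone)
    if candidate <= now:
        # Today's slot has passed. Rebuild from tomorrow's calendar date so
        # the wall-clock time is preserved across DST changes.
        tomorrow = now.date() + timedelta(days=1)
        candidate = datetime.combine(tomorrow, time(hour, minute), tzinfo=zone)
    return candidate
```

For a request made on the evening of 30 November 2025 in Los Angeles, this yields the same timestamp as the response above.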

🔧 List Available Tools

curl http://localhost:8080/tools | jq '.[].name'

Output:

"WEATHER"
"SET_TIMER"
"CALLBACK"
"HANGUP"
"STATUS"
"CANCEL"
"DATETIME"
"CALC"
"JOKE"

🧠 Recommended Models

NVIDIA H100 / A100 (80GB HBM)

Data center GPUs with maximum performance.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | meta-llama/Llama-3.1-70B-Instruct | Best quality; fits on a single 80GB GPU only when quantized (roughly 140GB of weights at FP16) |
| LLM | Qwen/Qwen2.5-72B-Instruct | Alternative, excellent reasoning |
| STT | Systran/faster-whisper-large-v3 | Best accuracy |
| TTS | af_heart | Warm, natural voice |

# H100/A100 80GB Configuration
LLM_MODEL=meta-llama/Llama-3.1-70B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-large-v3
TTS_VOICE=af_heart

NVIDIA DGX Spark (128GB Unified)

Grace Blackwell GB10 with shared CPU/GPU memory.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | meta-llama/Llama-3.1-70B-Instruct | Fits in unified memory when quantized (FP8 or lower) |
| LLM | Qwen/Qwen2.5-72B-Instruct | Alternative option |
| LLM | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Reasoning focused |
| STT | Systran/faster-whisper-large-v3 | Best accuracy |
| TTS | af_heart | Warm, natural voice |

# DGX Spark Configuration (128GB unified memory)
LLM_MODEL=meta-llama/Llama-3.1-70B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-large-v3
TTS_VOICE=af_heart

NVIDIA RTX 5090 (32GB GDDR7)

Next-gen consumer flagship.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | Qwen/Qwen2.5-32B-Instruct | Best fit for 32GB |
| LLM | meta-llama/Llama-3.1-8B-Instruct | Faster, lower quality |
| LLM | mistralai/Mistral-Small-24B-Instruct-2501 | Good balance |
| STT | Systran/faster-whisper-large-v3 | Best accuracy |
| TTS | af_heart | Warm, natural voice |

# RTX 5090 Configuration (32GB VRAM)
LLM_MODEL=Qwen/Qwen2.5-32B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-large-v3
TTS_VOICE=af_heart

NVIDIA RTX 4090 (24GB GDDR6X)

Current consumer flagship.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | Qwen/Qwen2.5-14B-Instruct | Best quality for 24GB |
| LLM | meta-llama/Llama-3.1-8B-Instruct | Faster option |
| LLM | mistralai/Mistral-7B-Instruct-v0.3 | Good tool calling |
| STT | Systran/faster-whisper-large-v3 | Best accuracy |
| TTS | af_heart | Warm, natural voice |

# RTX 4090 Configuration (24GB VRAM)
LLM_MODEL=Qwen/Qwen2.5-14B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-large-v3
TTS_VOICE=af_heart

NVIDIA RTX 3090 / 4080 (24GB / 16GB)

High-end consumer GPUs.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | meta-llama/Llama-3.1-8B-Instruct | Best for 16-24GB |
| LLM | Qwen/Qwen2.5-7B-Instruct | Fast alternative |
| LLM | microsoft/Phi-3-medium-4k-instruct | 14B, good quality |
| STT | Systran/faster-whisper-medium | Good balance |
| TTS | af_heart | Warm, natural voice |

# RTX 3090/4080 Configuration (16-24GB VRAM)
LLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-medium
TTS_VOICE=af_heart

NVIDIA RTX 3080 / 4070 (10-12GB)

Mid-range GPUs.

| Component | Model | Notes |
| --- | --- | --- |
| LLM | Qwen/Qwen2.5-7B-Instruct | Best for 10-12GB |
| LLM | microsoft/Phi-3-mini-4k-instruct | 3.8B, very fast |
| LLM | meta-llama/Llama-3.2-3B-Instruct | Lightweight |
| STT | Systran/faster-whisper-small | Low VRAM |
| TTS | af_heart | Warm, natural voice |

# RTX 3080/4070 Configuration (10-12GB VRAM)
LLM_MODEL=Qwen/Qwen2.5-7B-Instruct
LLM_BASE_URL=http://localhost:8000/v1
STT_MODEL=Systran/faster-whisper-small
TTS_VOICE=af_heart

Low-Latency Stack (Any GPU)

Optimized for fastest response times.

# Minimum latency configuration
LLM_MODEL=Qwen/Qwen2.5-3B-Instruct
STT_MODEL=Systran/faster-whisper-tiny.en
TTS_VOICE=af_heart
TTS_SPEED=1.1
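
When picking a model for a given GPU, a quick estimate of weight memory helps: parameter count times bytes per parameter. The sketch below does that arithmetic. It covers weights only and ignores KV cache, activations, and the STT/TTS models sharing the device, so treat it as a lower bound. The 0.8 headroom factor and the function names are arbitrary assumptions for illustration.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits(
    params_billions: float,
    precision: str,
    vram_gb: float,
    headroom: float = 0.8,
) -> bool:
    """True if the weights fit within a fraction of the available memory."""
    return weight_memory_gb(params_billions, precision) <= vram_gb * headroom


print(weight_memory_gb(70, "fp16"))  # 140.0
print(fits(70, "fp16", 80))          # False
print(fits(70, "int4", 80))          # True
```

By this estimate a 70B model needs quantization on an 80GB card, while an 8B model at FP16 fits on a 24GB card.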

TTS Voice Options

| Voice | Style | Gender | Accent |
| --- | --- | --- | --- |
| af_heart | Warm, friendly | Female | American |
| af_bella | Professional | Female | American |
| af_sarah | Casual | Female | American |
| af_nicole | Expressive | Female | American |
| am_adam | Neutral | Male | American |
| am_michael | Professional | Male | American |
| bf_emma | Warm | Female | British |
| bm_george | Professional | Male | British |
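
The voice IDs in this table follow a visible pattern: the first letter encodes the accent (`a` American, `b` British) and the second the gender (`f`, `m`). A small sketch that decodes an ID on that basis follows. The pattern is inferred only from the rows above, so other prefixes may exist, and `describe_voice` is a hypothetical helper.

```python
# Prefix meanings inferred from the table above (assumption).
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id: str) -> dict:
    """Split an ID such as 'af_heart' into name, accent, and gender."""
    prefix, _, name = voice_id.partition("_")
    valid = (
        len(prefix) == 2
        and prefix[0] in ACCENTS
        and prefix[1] in GENDERS
        and bool(name)
    )
    if not valid:
        raise ValueError(f"Unrecognized voice id: {voice_id!r}")
    return {
        "name": name,
        "accent": ACCENTS[prefix[0]],
        "gender": GENDERS[prefix[1]],
    }
```

A check like this can catch a mistyped `TTS_VOICE` value before the first call is answered.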

🔧 Built-in Tools

| Tool | Description | Example Phrase |
| --- | --- | --- |
| 🌤️ WEATHER | Current weather conditions | "What's the weather?" |
| ⏲️ SET_TIMER | Set a countdown timer | "Set a timer for 5 minutes" |
| 📞 CALLBACK | Schedule a callback | "Call me back in an hour" |
| 📴 HANGUP | End the call | "Goodbye" |
| 📋 STATUS | Check pending timers | "What timers do I have?" |
| ❌ CANCEL | Cancel timers/callbacks | "Cancel my timer" |
| 🕐 DATETIME | Current date and time | "What time is it?" |
| 🧮 CALC | Math calculations | "What's 25 times 4?" |
| 😄 JOKE | Tell a joke | "Tell me a joke" |
| 🦜 SIMON_SAYS | Repeat back verbatim | "Simon says hello world" |

🔌 Creating Plugins

Add custom tools by creating Python plugins:

# src/plugins/hello_tool.py
from tool_plugins import BaseTool, ToolResult, ToolStatus


class HelloTool(BaseTool):
    """Example plugin that greets someone by name."""

    # Identifier and description exposed for this tool
    name = "HELLO"
    description = "Say hello to someone"

    # Schema of the arguments the tool accepts
    parameters = {
        "name": {
            "type": "string",
            "description": "Name to greet",
            "required": True
        }
    }

    async def execute(self, params):
        # Fall back to a generic greeting if no name was supplied
        name = params.get("name", "friend")
        return ToolResult(
            status=ToolStatus.SUCCESS,
            message=f"Hello, {name}! Nice to meet you."
        )

Register in tool_manager.py:

from plugins.hello_tool import HelloTool

tool_classes = [
    # ... existing tools ...
    HelloTool,
]
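
Before registering a plugin, it can be handy to exercise its `execute` coroutine in isolation. The sketch below does so with stand-in versions of `BaseTool`, `ToolResult`, and `ToolStatus` so that it runs without the project installed. The real classes in `tool_plugins.py` may differ in fields and behavior, and `missing_required` is a hypothetical helper.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class ToolStatus(Enum):  # stand-in for tool_plugins.ToolStatus
    SUCCESS = "success"
    FAILED = "failed"


@dataclass
class ToolResult:  # stand-in for tool_plugins.ToolResult
    status: ToolStatus
    message: str


class BaseTool:  # stand-in for tool_plugins.BaseTool
    name = ""
    description = ""
    parameters: dict = {}

    async def execute(self, params: dict) -> ToolResult:
        raise NotImplementedError


class HelloTool(BaseTool):
    name = "HELLO"
    description = "Say hello to someone"
    parameters = {
        "name": {"type": "string", "description": "Name to greet", "required": True}
    }

    async def execute(self, params: dict) -> ToolResult:
        name = params.get("name", "friend")
        return ToolResult(ToolStatus.SUCCESS, f"Hello, {name}! Nice to meet you.")


def missing_required(tool: BaseTool, params: dict) -> list:
    """List required parameters that the caller did not supply."""
    return [
        key
        for key, spec in tool.parameters.items()
        if spec.get("required") and key not in params
    ]


# Drive the coroutine directly, as a unit test might.
result = asyncio.run(HelloTool().execute({"name": "Ada"}))
print(result.message)  # Hello, Ada! Nice to meet you.
```

Inside the project, the stand-ins would be replaced by the import from `tool_plugins` shown earlier.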

📖 See Creating Plugins for the full guide.


📊 Monitoring

View Logs

# Docker logs
docker logs -f sip-agent

# Formatted log viewer
python tools/view-logs.py -f

Example output:

┌──────────────────────────────────────────────────────────────
│ 📞 CALL #1 - From: 1001
└──────────────────────────────────────────────────────────────
15:30:05  📞 Call started
15:30:06  👤 "What's the weather?"
15:30:07  🔧 [TOOL:WEATHER]
15:30:08  🤖 "At Storm Lake, it's 44 degrees..."
15:30:12  👤 "Thanks, goodbye"
15:30:13  📴 Call ended (duration: 0:08)

Grafana Dashboard

Import the included dashboard:

grafana/dashboards/sip-agent.json


🗂️ Project Structure

sip-agent/
├── 📄 README.md                    # 👈 You are here
├── 📄 RELEASE.md                   # Release notes
├── 📄 CHANGELOG.md                 # Version history
├── 📄 docker-compose.yml           # Main compose file
├── 📄 docker-compose.observability.yml
├── 📄 openapi.yaml                 # API specification
│
├── 📂 sip-agent/                   # Core application
│   ├── 📄 Dockerfile
│   ├── 📄 requirements.txt
│   ├── 📄 .env.example
│   ├── 📂 data/
│   │   └── 📄 phrases.json.example
│   └── 📂 src/
│       ├── 📄 main.py              # Application entry
│       ├── 📄 config.py            # Configuration
│       ├── 📄 api.py               # REST API
│       ├── 📄 sip_handler.py       # SIP call handling
│       ├── 📄 audio_pipeline.py    # STT/TTS processing
│       ├── 📄 llm_engine.py        # LLM integration
│       ├── 📄 tool_manager.py      # Tool orchestration
│       ├── 📄 tool_plugins.py      # Plugin base classes
│       ├── 📄 call_queue.py        # Redis call queue
│       ├── 📄 realtime_client.py   # WebSocket STT
│       ├── 📄 telemetry.py         # OpenTelemetry
│       ├── 📄 logging_utils.py     # Structured logging
│       ├── 📄 retry_utils.py       # API retry logic
│       └── 📂 plugins/             # Built-in tools
│           ├── 📄 weather_tool.py
│           ├── 📄 timer_tool.py
│           ├── 📄 callback_tool.py
│           ├── 📄 hangup_tool.py
│           ├── 📄 status_tool.py
│           ├── 📄 cancel_tool.py
│           ├── 📄 datetime_tool.py
│           ├── 📄 calc_tool.py
│           ├── 📄 joke_tool.py
│           └── 📄 simon_says_tool.py
│
├── 📂 docs/                        # Documentation
│   ├── 📄 index.md                 # Overview
│   ├── 📄 getting-started.md       # Installation
│   ├── 📄 configuration.md         # Config reference
│   ├── 📄 api-reference.md         # REST API
│   ├── 📄 tools.md                 # Built-in tools
│   ├── 📄 plugins.md               # Plugin development
│   ├── 📄 examples.md              # Integration examples
│   └── 📂 screenshots/
│
├── 📂 observability/               # Monitoring stack
│   ├── 📂 grafana/
│   │   └── 📂 provisioning/
│   │       ├── 📂 dashboards/      # Pre-built dashboards
│   │       └── 📂 datasources/
│   ├── 📂 prometheus/
│   │   └── 📄 prometheus.yaml
│   ├── 📂 loki/
│   │   └── 📄 loki.yaml
│   ├── 📂 tempo/
│   │   └── 📄 tempo.yaml
│   └── 📂 otel-collector/
│       └── 📄 config.yaml
│
├── 📂 tools/                       # Utilities
│   └── 📄 view-logs.py             # Log viewer
│
└── 📂 .github/
    └── 📂 workflows/
        ├── 📄 docker-build.yml     # Docker CI
        └── 📄 readme-sync.yml      # Docs sync

🖥️ Runs on NVIDIA DGX Spark

This project is optimized to run on the NVIDIA DGX Spark with Grace Blackwell architecture.

┌─────────────────────────────────────────────────────────────┐
│ 🟢 NVIDIA DGX Spark                                         │
├─────────────────────────────────────────────────────────────┤
│ 🧠 Grace Blackwell GB10 Superchip                           │
│ 💾 128GB Unified Memory                                     │
│ ⚡ 1 PFLOP AI Performance                                   │
├─────────────────────────────────────────────────────────────┤
│ ✅ Local LLM inference (vLLM, Ollama)                       │
│ ✅ Local STT/TTS (Speaches + Whisper + Kokoro)              │
│ ✅ Real-time voice processing                               │
│ ✅ Multiple concurrent calls                                │
└─────────────────────────────────────────────────────────────┘

Recommended DGX Spark setup:

# Run everything locally on DGX Spark
LLM_BASE_URL=http://localhost:8000/v1
LLM_MODEL=openai/gpt-oss-20b
SPEACHES_API_URL=http://localhost:8001

📖 Documentation

📚 Full documentation available at sip-agent.readme.io

| Document | Description |
| --- | --- |
| 📖 Overview | Architecture and features |
| 🚀 Getting Started | Installation guide |
| ⚙️ Configuration | Environment variables |
| 🌐 API Reference | REST API endpoints |
| 🔧 Built-in Tools | Available tools |
| 🔌 Creating Plugins | Custom tool development |
| 📖 Examples | Integration patterns |

🤝 Contributing

Contributions are welcome! Please read our contributing guidelines first.

# Fork and clone
git clone https://github.com/your-username/sip-agent.git

# Create branch
git checkout -b feature/amazing-feature

# Make changes and test
docker compose up -d
python -m pytest

# Commit with emoji
git commit -m "✨ feat: add amazing feature"

# Push and PR
git push origin feature/amazing-feature

📜 License

This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.

SPDX-License-Identifier: AGPL-3.0-or-later

🙏 Acknowledgments


📞 Support

| Resource | Link |
| --- | --- |
| 📖 Docs | sip-agent.readme.io |
| 🐛 Issues | GitHub Issues |
| 💬 Discussions | GitHub Discussions |

Made with ❤️ and 🤖
