Name	Name	Last commit message	Last commit date
parent directory ..
src/VoiceServer	src/VoiceServer
INSTALL.md	INSTALL.md
README.md	README.md
VERIFY.md	VERIFY.md

name

PAI Voice System

pack-id

danielmiessler-voice-system-core-v2.3.0

version

2.3.0

author

danielmiessler

description

Text-to-speech notification server using ElevenLabs API with fallback to macOS say - gives your AI agent a voice

type

feature

purpose-type

productivity

automation

integration

platform

claude-code

dependencies

keywords

voice

tts

speech

notifications

elevenlabs

audio

alerts

spoken

feedback

macos

agent-voice

PAI Voice System

Text-to-speech notification server that gives your AI agent a voice - using ElevenLabs AI voices or macOS say as fallback

Installation Prompt

You are receiving a PAI Pack - a modular upgrade for AI agent systems.

What is PAI? See: PAI Project Overview

What is a Pack? See: Pack System

This pack adds voice capabilities to your AI infrastructure. The PAI Voice System provides:

Spoken Feedback: Hear your AI speak completions, alerts, and notifications
Multi-Voice Support: Different voices for different agents (Architect, Engineer, Designer, etc.)
ElevenLabs Integration: High-quality AI voices with personality-tuned settings
Fallback System: Works without API key using macOS built-in voices
Zero Configuration: Install and go - speaks immediately

Core principle: Your AI should be heard, not just read.

No more missing important completions. No more checking the terminal constantly. Your AI speaks to you while you focus elsewhere.

Please follow the installation instructions in INSTALL.md to integrate this pack into your infrastructure.

What's Included

Component	File	Purpose
Voice Server	`src/VoiceServer/server.ts`	HTTP server handling TTS requests
Install Script	`src/VoiceServer/install.sh`	One-command installation
Start Script	`src/VoiceServer/start.sh`	Start the voice server
Stop Script	`src/VoiceServer/stop.sh`	Stop the voice server
Restart Script	`src/VoiceServer/restart.sh`	Restart the voice server
Status Script	`src/VoiceServer/status.sh`	Check server status
Uninstall Script	`src/VoiceServer/uninstall.sh`	Clean uninstallation
Voice Config	`src/VoiceServer/voices.json`	Agent voice personalities
Menu Bar	`src/VoiceServer/menubar/`	SwiftBar/BitBar integration

Summary:

Files created: 12+
Hooks registered: 0 (server-only pack)
Dependencies: Bun runtime, macOS, ElevenLabs API key (optional)

The Concept and/or Problem

AI agents work silently. They complete tasks, generate outputs, and produce results - all without any notification that might reach you when you are not staring at the terminal.

This creates real problems:

For Productivity:

You miss important completions while working elsewhere
Context-switching to check "is it done yet?" breaks flow
Long-running tasks complete without notice

For Multi-Agent Systems:

Multiple agents finish at different times
You cannot tell which agent completed which task
Background agents complete silently

For User Experience:

Text-only feedback feels cold and robotic
Important alerts blend into scrolling text
No emotional connection with your AI assistant

The Fundamental Problem:

AI agents are mute by default. They have no voice. Every output requires visual attention. In a world where we interact with AI constantly, this creates unnecessary friction and missed opportunities for natural communication.

The Solution

The PAI Voice System solves this through a dedicated HTTP server that converts text to speech on demand. Any system component can trigger voice output with a simple POST request.

Core Architecture:

┌─────────────────────────────────────────────────────────────┐
│                    PAI Voice System                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Hook/Agent/Tool                                            │
│         │                                                    │
│         ▼                                                    │
│   POST http://localhost:8888/notify                          │
│   {                                                          │
│     "message": "Task completed successfully",                │
│     "voice_id": "bIHbv24MWmeRgasZH58o",                      │
│     "title": "Agent Name"                                    │
│   }                                                          │
│         │                                                    │
│         ▼                                                    │
│   ┌─────────────────┐                                        │
│   │  Voice Server   │  (port 8888)                           │
│   │                 │                                        │
│   │  1. Sanitize    │  Strip markdown, validate              │
│   │  2. TTS Call    │  ElevenLabs or macOS say               │
│   │  3. Play Audio  │  afplay with volume control            │
│   │  4. Notify      │  macOS notification center             │
│   └─────────────────┘                                        │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Key Features:

ElevenLabs Integration: High-quality AI voices with the eleven_turbo_v2_5 model
Voice Personalities: Each agent can have distinct voice settings (stability, similarity_boost)
macOS Fallback: Works without API key using built-in say command
Security: Input sanitization, rate limiting (10 req/min), CORS restrictions
LaunchAgent: Auto-starts on login, runs in background
Menu Bar: Optional visual indicator with quick controls

Design Principles:

Fire and Forget: Callers do not wait for voice to complete
Fail Gracefully: Server issues never block other systems
Local Only: Listens only on localhost for security
Zero Config: Works immediately with sensible defaults

What Makes This Different

This sounds similar to system TTS which also does text-to-speech. What makes this approach different?

The PAI Voice System is purpose-built for AI agent infrastructure. It provides agent-specific voice personalities, hook integration, and personality-tuned voice settings that make each agent sound distinct. Unlike generic TTS, this creates an immersive multi-agent experience where you can identify which agent is speaking.

Agent personalities with distinct voice characteristics.
HTTP API enables any component to speak.
Auto-start as macOS LaunchAgent service.
Menu bar indicator shows server status.

Configuration

Environment variables (add to ~/.env):

# Required for ElevenLabs voices (optional - falls back to macOS say)
ELEVENLABS_API_KEY=your_api_key_here

# Default voice ID (optional)
ELEVENLABS_VOICE_ID=bIHbv24MWmeRgasZH58o

# Server port (optional, defaults to 8888)
PORT=8888

Get a free ElevenLabs API key at elevenlabs.io (10,000 characters/month free).

Example Usage

Basic Notification

curl -X POST http://localhost:8888/notify \
  -H "Content-Type: application/json" \
  -d '{"message": "Task completed successfully"}'

Agent-Specific Voice

# Male voice (e.g., Engineer)
curl -X POST http://localhost:8888/notify \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Build completed with zero errors",
    "voice_id": "bIHbv24MWmeRgasZH58o",
    "title": "Engineer Agent"
  }'

# Female voice (e.g., Designer)
curl -X POST http://localhost:8888/notify \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Design review complete, two issues found",
    "voice_id": "MClEFoImJXBTgLwdLI5n",
    "title": "Designer Agent"
  }'

# Neutral voice (e.g., Researcher)
curl -X POST http://localhost:8888/notify \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Research synthesis ready for review",
    "voice_id": "M563YhMmA0S8vEYwkgYa",
    "title": "Research Agent"
  }'

Health Check

curl http://localhost:8888/health
# Returns: {"status":"healthy","port":8888,"voice_system":"ElevenLabs",...}

Voice IDs Reference

Type	Voice ID	Use Case
Male	`bIHbv24MWmeRgasZH58o`	Engineer, Architect, technical agents
Female	`MClEFoImJXBTgLwdLI5n`	Designer, Writer, creative agents
Neutral	`M563YhMmA0S8vEYwkgYa`	Researcher, Analyst, neutral roles

Find more voices at ElevenLabs Voice Library.

Customization

Recommended Customization

What to Customize: Voice personalities in voices.json

Why: Different agents should sound distinct - an enthusiastic intern vs. a measured architect

Process:

Open $PAI_DIR/VoiceServer/voices.json
Adjust stability (0.0-1.0): Lower = more expressive, Higher = more consistent
Adjust similarity_boost (0.0-1.0): Higher = closer to original voice
Save and restart: ./restart.sh

Expected Outcome: Each agent has a recognizable voice personality

Optional Customization

Customization	File	Impact
Default volume	voices.json (`default_volume`)	0.0-1.0 scale for all voices
Speaking rate	voices.json (`rate_multiplier`)	Speed of speech
Custom voices	voices.json	Add your own ElevenLabs voices

Credits

Original concept: Daniel Miessler - developed as part of PAI personal AI infrastructure
ElevenLabs: Text-to-speech API providing high-quality AI voices
SwiftBar/BitBar: Menu bar integration for macOS

Changelog

2.3.0 - 2026-01-14

Packaged for PAI v2.3 release
Simplified prosody system (removed emotional markers)
Updated documentation with generic voice IDs

1.5.0 - 2026-01-12

Removed prosody enhancement system for simplicity
Voice personalities provide sufficient variation

1.4.0 - 2025-12-09

Added volume control (default_volume in voices.json)
Calmer startup voice

1.0.0 - 2025-11-16

Initial release with ElevenLabs integration
Multi-voice support for agent personalities
macOS LaunchAgent for auto-start

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

PAI Voice System

Installation Prompt

What's Included

The Concept and/or Problem

The Solution

What Makes This Different

Configuration

Example Usage

Basic Notification

Agent-Specific Voice

Health Check

Voice IDs Reference

Customization

Recommended Customization

Optional Customization

Credits

Changelog

2.3.0 - 2026-01-14

1.5.0 - 2026-01-12

1.4.0 - 2025-12-09

1.0.0 - 2025-11-16

FilesExpand file tree

pai-voice-system

Directory actions

More options

Directory actions

More options

Latest commit

History

pai-voice-system

Folders and files

parent directory

README.md

PAI Voice System

Installation Prompt

What's Included

The Concept and/or Problem

The Solution

What Makes This Different

Configuration

Example Usage

Basic Notification

Agent-Specific Voice

Health Check

Voice IDs Reference

Customization

Recommended Customization

Optional Customization

Credits

Changelog

2.3.0 - 2026-01-14

1.5.0 - 2026-01-12

1.4.0 - 2025-12-09

1.0.0 - 2025-11-16