Skip to content

Commit 279a36e

Browse files
authored
Merge pull request anthropics#261 from anthropics/adriaan/elevenlabs-voice-assistant
Add ElevenLabs Low Latency Voice Assistant Integration
2 parents 001e5ca + 8148beb commit 279a36e

File tree

5 files changed

+1149
-0
lines changed

5 files changed

+1149
-0
lines changed
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# ElevenLabs API Key
2+
# Get your API key from: https://elevenlabs.io/app/developers/api-keys
3+
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
4+
5+
# Anthropic API Key
6+
# Get your API key from: https://console.anthropic.com/settings/keys
7+
ANTHROPIC_API_KEY=your_anthropic_api_key_here

third_party/ElevenLabs/README.md

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
# ElevenLabs <> Claude Cookbooks
2+
3+
[ElevenLabs](https://elevenlabs.io/) provides AI-powered speech-to-text and text-to-speech APIs for creating natural-sounding voice applications with advanced features like voice cloning and streaming synthesis.
4+
5+
This cookbook demonstrates how to build a low-latency voice assistant by combining ElevenLabs' speech processing with Claude's intelligent responses, progressively optimizing for real-time performance.
6+
7+
## What's Included
8+
9+
* **[Low Latency Voice Assistant Notebook](./low_latency_stt_claude_tts.ipynb)** - An interactive tutorial that walks you through building a voice assistant step-by-step, demonstrating various optimization techniques to minimize latency through streaming.
10+
11+
* **[WebSocket Streaming Script](./stream_voice_assistant_websocket.py)** - A production-ready conversational voice assistant featuring continuous microphone input, gapless audio playback, and the lowest possible latency using WebSocket streaming.
12+
13+
## How to Use This Cookbook
14+
15+
We recommend following this sequence to get the most out of this cookbook:
16+
17+
### Step 1: Set Up Your Environment
18+
19+
1. **Get your API keys:**
20+
- ElevenLabs API key: [elevenlabs.io/app/developers/api-keys](https://elevenlabs.io/app/developers/api-keys)
21+
- Anthropic API key: [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys)
22+
23+
2. **Configure your environment:**
24+
```bash
25+
cp .env.example .env
26+
# Edit .env and add your API keys
27+
```
28+
29+
3. **Install dependencies:**
30+
```bash
31+
pip install -r requirements.txt
32+
```
33+
34+
### Step 2: Work Through the Notebook
35+
36+
Start with the **[Low Latency Voice Assistant Notebook](./low_latency_stt_claude_tts.ipynb)**. This interactive guide will teach you:
37+
38+
- How to use ElevenLabs for speech-to-text transcription
39+
- How to generate Claude responses and measure latency
40+
- How streaming reduces time-to-first-token
41+
- How to stream text-to-speech for faster audio playback
42+
- The tradeoffs between different streaming approaches
43+
- Why WebSocket streaming provides the best balance of latency and quality
44+
45+
The notebook includes performance metrics and comparisons at each step, helping you understand the impact of each optimization.
46+
47+
### Step 3: Try the Production Script
48+
49+
After understanding the concepts from the notebook, run the **[WebSocket Streaming Script](./stream_voice_assistant_websocket.py)** to experience a fully functional voice assistant:
50+
51+
```bash
52+
python stream_voice_assistant_websocket.py
53+
```
54+
55+
**How it works:**
56+
1. Press Enter to start recording
57+
2. Speak your question into the microphone
58+
3. Press Enter to stop recording
59+
4. The assistant will respond with natural speech
60+
5. Repeat or press Ctrl+C to exit
61+
62+
The script demonstrates production-ready implementations of:
63+
- Real-time microphone recording with sounddevice
64+
- Continuous conversation with context retention
65+
- WebSocket-based streaming for minimal latency
66+
- Custom audio queue for seamless playback
67+
68+
## More About ElevenLabs
69+
70+
Here are some helpful resources to deepen your understanding:
71+
72+
- [ElevenLabs Platform](https://elevenlabs.io/) - Official website
73+
- [API Documentation](https://elevenlabs.io/docs/overview) - Complete API reference
74+
- [Voice Library](https://elevenlabs.io/voice-library) - Explore available voices
75+
- [API Playground](https://elevenlabs.io/app/speech-synthesis/text-to-speech) - Test voices interactively
76+
- [Python SDK](https://github.com/elevenlabs/elevenlabs-python) - Official Python SDK

0 commit comments

Comments
 (0)