Commit bf18e64

Create create-a-voice-virtual-assistant.mdx
1 parent a171a2d commit bf18e64

1 file changed

+262
-0
lines changed

@@ -0,0 +1,262 @@
1+
---
2+
title: Create a Voice Virtual Assistant
3+
author: Alexandre Sajus
4+
uid:
5+
datePublished:
6+
published: false
7+
description: Learn how to build a Voice Virtual Assistant using ElevenLabs API and Python for seamless AI-powered interactions.
8+
header:
9+
tags:
10+
- Intermediate
11+
- Python
12+
- AI
13+
---
14+
15+
<BannerImage
16+
link=""
17+
description="Title Image"
18+
uid={true}
19+
cl="for-sidebar"
20+
/>
21+
22+
# Create a Voice Virtual Assistant
23+
24+
<AuthorAvatar
25+
author_name="Alexandre Sajus"
26+
author_avatar="https://i.imgur.com/fhkMdV4.jpeg"
27+
username=""
28+
uid={true}
29+
/>
30+
31+
<BannerImage
32+
link=""
33+
description="Banner"
34+
uid={true}
35+
/>
36+
37+
**Prerequisites:** Python
38+
**Versions:** Python 3.11, python-dotenv 1.0.1, elevenlabs 1.54.0, elevenlabs[pyaudio]
39+
**Read Time:** 60 minutes
40+
41+
## Introduction
42+
43+
Voice assistants like Siri, Google Assistant, and Alexa have changed the way we interact with technology. In this tutorial, we'll learn how to create a Voice Virtual Assistant using the [ElevenLabs](https://elevenlabs.io/) API and Python. This assistant will be able to hold natural conversations and respond in real time.
44+
45+
The final assistant will:
46+
47+
- Process user voice and text input
48+
- Use the ElevenLabs API for voice synthesis
49+
- Provide a seamless conversational experience
50+
51+
Here is a sneak peek of the final assistant in action:
52+
53+
<iframe
54+
width="1280"
55+
align="center"
56+
height="480"
57+
src="https://i.imgur.com/83XuIPB.mp4"
58+
title=""
59+
allowFullScreen
60+
></iframe>
61+
62+
Let's dive in!
63+
64+
## Setting Up the Environment
65+
66+
### 1. Install Required Packages
67+
68+
Before we start, make sure you have Python installed. Then, install the required dependencies:
69+
70+
```sh
pip install elevenlabs "elevenlabs[pyaudio]" python-dotenv
```
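
To confirm that the installation worked, a quick check of the installed versions can help. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution names listed are assumptions based on the install command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names assumed from the pip command in this tutorial
    for name in ("elevenlabs", "python-dotenv", "PyAudio"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If any package shows as "not installed", rerun the install command in the same Python environment you use to run the script.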
73+
74+
### 2. Setting up ElevenLabs
75+
76+
ElevenLabs provides a Conversational AI API that we will use to create our Voice Assistant:

- The API records the user's voice through the microphone
77+
- It processes the audio to detect when the user has finished speaking or is interrupting the assistant
78+
- It calls an LLM to generate a response
79+
- It synthesizes the response into speech
80+
- It plays the synthesized speech through the speakers
81+
82+
<p align="center">
83+
<img src="https://i.imgur.com/TZpJsMK.png" alt="ElevenLabs Function Diagram" width="80%"/>
84+
</p>
85+
86+
1. Sign up at [ElevenLabs](https://elevenlabs.io/app/sign-up) and follow the instructions to create an account.
87+
88+
2. Once signed in, go to "Conversational AI"
89+
90+
<p align="center">
91+
<img src="https://i.imgur.com/bXHHn0E.png" alt="Conversational AI button on dashboard" width="80%"/>
92+
</p>
93+
94+
3. Go to "Agents"
95+
96+
<p align="center">
97+
<img src="https://i.imgur.com/Hdzof0l.png" alt="Agents button on dashboard" width="80%"/>
98+
</p>
99+
100+
4. Click on "Start from blank"
101+
102+
<p align="center">
103+
<img src="https://i.imgur.com/uAMmwqB.png" alt="Start from blank button" width="80%"/>
104+
</p>
105+
106+
5. Create a ".env" file at the root of your project folder. We will use this file to store our API credentials securely. This way they won't be hardcoded in the script. In this ".env" file, add your Agent ID:
107+
108+
<p align="center">
109+
<img src="https://i.imgur.com/WEfcWZK.png" alt="Agent ID" width="80%"/>
110+
</p>
111+
112+
```bash
AGENT_ID=your_agent_id
```
115+
116+
6. Go to the "Security" tab, enable the "First message" and "System prompt" overrides, and save. This will allow us to customize the assistant's first message and system prompt using Python code.
117+
118+
<p align="center">
119+
<img src="https://i.imgur.com/ugDFbs1.png" alt="Security tab" width="80%"/>
120+
</p>
121+
122+
7. Click on your profile and go to "API keys". Create a new API key and copy it to your ".env" file:
123+
124+
```bash
API_KEY="sk_XXX...XXX"
```
127+
128+
<p align="center">
129+
<img src="https://i.imgur.com/lfPxnL8.png" alt="API keys" width="80%"/>
130+
</p>
131+
132+
ElevenLabs is now set up and ready to be used in our Python script!
133+
134+
Note: ElevenLabs works with a credit system. When you sign up, you get 10,000 free credits, which amount to about 15 minutes of conversation. You can buy more credits if needed.
135+
136+
137+
## Building the Voice Assistant
138+
139+
### 1. Load Environment Variables
140+
141+
Create a Python file (e.g., `voice_assistant.py`) and load your API credentials:
142+
143+
```python
import os
from dotenv import load_dotenv

# Read AGENT_ID and API_KEY from the ".env" file into the process environment
load_dotenv()

AGENT_ID = os.getenv("AGENT_ID")
API_KEY = os.getenv("API_KEY")
```
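
`os.getenv` returns `None` when a variable is missing, so a typo in the ".env" file would only surface later as a confusing API error. The sketch below fails early instead; `require_env` is a hypothetical helper, not part of python-dotenv or the ElevenLabs SDK.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing {name}: add it to your .env file")
    return value
```

After calling `load_dotenv()`, you could then write `AGENT_ID = require_env("AGENT_ID")` and `API_KEY = require_env("API_KEY")` in place of the plain `os.getenv` calls.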
152+
153+
### 2. Configure ElevenLabs Conversation API
154+
155+
We will set up the ElevenLabs client and the configuration for our conversation.
156+
157+
We'll start by importing the necessary modules:
158+
159+
```python
from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface
from elevenlabs.types import ConversationConfig
```
165+
166+
We will then configure the conversation with the agent's first message and system prompt. We are going to inform the assistant that the user has a schedule and prompt it to help the user. In this part you can customize:
167+
- The user's name: what the assistant will call the user
168+
- The schedule: the user's schedule that the assistant will use to provide help
169+
- The prompt: the system prompt that the assistant receives when the conversation starts, which gives it the context of the conversation
170+
- The first message: the first message the assistant will say to the user
171+
172+
```python
user_name = "Alex"
schedule = "Sales Meeting with Taipy at 10:00; Gym with Sophie at 17:00"
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
first_message = f"Hello {user_name}, how can I help you today?"
```
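
If the schedule comes from structured data rather than a hand-written string, it can be formatted before it goes into the prompt. This is a minimal sketch; `format_schedule` and the `events` list are hypothetical names, and times are assumed to be zero-padded "HH:MM" strings so that they sort chronologically.

```python
def format_schedule(events):
    """Turn (time, description) pairs into a one-line schedule string."""
    # Zero-padded "HH:MM" strings sort chronologically as plain text
    ordered = sorted(events)
    return "; ".join(f"{description} at {time}" for time, description in ordered)


events = [
    ("17:00", "Gym with Sophie"),
    ("10:00", "Sales Meeting with Taipy"),
]
schedule = format_schedule(events)
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
```

This produces the same schedule string as the hand-written version above, regardless of the order in which the events are listed.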
178+
179+
We then pass this configuration to our ElevenLabs agent as an override:
180+
181+
```python
# Override the agent's system prompt and first message for this conversation
conversation_override = {
    "agent": {
        "prompt": {
            "prompt": prompt,
        },
        "first_message": first_message,
    },
}

config = ConversationConfig(
    conversation_config_override=conversation_override,
    extra_body={},
    dynamic_variables={},
)

# The Conversation itself is created in step 4, once the callbacks are defined
client = ElevenLabs(api_key=API_KEY, timeout=15)
```
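
The override is a plain nested dictionary, so it can be built by a small function. The sketch below reproduces the structure shown above; `build_override` is a hypothetical helper, and leaving out a field is only an assumption about how the agent treats partial overrides (it would also need the matching override enabled in the "Security" tab).

```python
def build_override(prompt=None, first_message=None):
    """Build the override dict, including only the fields that are provided."""
    agent = {}
    if prompt is not None:
        agent["prompt"] = {"prompt": prompt}
    if first_message is not None:
        agent["first_message"] = first_message
    return {"agent": agent}


conversation_override = build_override(
    prompt="You are a helpful assistant.",
    first_message="Hello Alex, how can I help you today?",
)
```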
206+
207+
### 3. Implement Callbacks for Responses
208+
209+
To improve user experience, define callback functions to handle assistant responses. These functions will print the assistant's responses and user transcripts. We also define a function to handle the situation where the user interrupts the assistant:
210+
211+
```python
def print_agent_response(response):
    # Called with the text of each agent response
    print(f"Agent: {response}")

def print_interrupted_response(original, corrected):
    # Called when the user interrupts; `corrected` is the truncated response
    print(f"Agent interrupted, truncated response: {corrected}")

def print_user_transcript(transcript):
    # Called with the transcript of what the user said
    print(f"User: {transcript}")
```
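
Callbacks do not have to print. As an alternative sketch, the hypothetical `TranscriptLog` class below keeps the turns in a list so that they can be saved or inspected after the conversation. Its methods take the same arguments as the three functions above; it assumes that the `original` text passed on interruption matches an earlier agent response.

```python
class TranscriptLog:
    """Collects conversation turns in order as (speaker, text) tuples."""

    def __init__(self):
        self.turns = []

    def on_agent_response(self, response):
        self.turns.append(("agent", response))

    def on_agent_correction(self, original, corrected):
        # Replace the most recent matching agent turn with the truncated text
        for i in range(len(self.turns) - 1, -1, -1):
            if self.turns[i] == ("agent", original):
                self.turns[i] = ("agent", corrected)
                return
        # No matching turn was recorded, so store the truncated text on its own
        self.turns.append(("agent", corrected))

    def on_user_transcript(self, transcript):
        self.turns.append(("user", transcript))
```

To use it, create `log = TranscriptLog()` and pass `log.on_agent_response`, `log.on_agent_correction`, and `log.on_user_transcript` wherever the print functions would otherwise be passed.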
221+
222+
### 4. Start the Voice Assistant Session
223+
224+
Finally, create the conversation with the callbacks attached and start the session:
225+
226+
```python
conversation = Conversation(
    client,
    AGENT_ID,
    config=config,
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
    callback_agent_response=print_agent_response,
    callback_agent_response_correction=print_interrupted_response,
    callback_user_transcript=print_user_transcript,
)

conversation.start_session()
```
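
To stop the assistant cleanly with Ctrl+C, the session can be wrapped in a small function. This is a sketch, not the SDK's prescribed pattern: `run_until_done` is a hypothetical helper, and it assumes the `Conversation` object provides `end_session()` and `wait_for_session_end()` as described in the Python SDK documentation linked under More Resources.

```python
def run_until_done(conversation):
    """Start the session and block until it ends; Ctrl+C ends it cleanly."""
    conversation.start_session()
    try:
        # Assumed to block until the session finishes and return its ID
        return conversation.wait_for_session_end()
    except KeyboardInterrupt:
        # Ask the session to stop, then wait for it to shut down
        conversation.end_session()
        return conversation.wait_for_session_end()
```

With this helper, `conversation_id = run_until_done(conversation)` would take the place of the bare `conversation.start_session()` call.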
240+
241+
## Running the Assistant
242+
243+
Execute the script:
244+
245+
```bash
python voice_assistant.py
```
248+
249+
The assistant will start listening for input and responding in real time!
250+
251+
## Conclusion
252+
253+
Congratulations! 🎉 You've successfully built a Voice Virtual Assistant using the ElevenLabs API. You can extend its capabilities by integrating it with home automation, calendars, or other APIs to make it even more useful.
254+
255+
Stay creative and keep experimenting with AI-powered assistants!
256+
257+
## More Resources
258+
259+
- ElevenLabs Conversational AI [Overview](https://elevenlabs.io/docs/conversational-ai/overview)
260+
- ElevenLabs Python SDK [Documentation](https://elevenlabs.io/docs/conversational-ai/libraries/python)
261+
- Enable your assistant to execute Python functions with [Client Tools](https://elevenlabs.io/docs/conversational-ai/customization/tools-events/client-tools)
262+
- Provide documents as context to your assistant with [RAG](https://elevenlabs.io/docs/conversational-ai/customization/knowledge-base/rag)
