Commit 082a01a

Add Virtual Assistant Project Tutorial
1 parent 6a603c7 commit 082a01a

2 files changed

+100
-83
lines changed

Lines changed: 100 additions & 83 deletions
@@ -1,143 +1,147 @@
11
---
2-
title: Create a Voice Virtual Assistant
2+
title: Create a Voice Virtual Assistant with ElevenLabs
33
author: Alexandre Sajus
4-
uid:
5-
datePublished:
6-
published: false
4+
uid: u2DitJXVOqWo18bLsyXNLcVtufC2
5+
datePublished: TODO
76
description: Learn how to build a Voice Virtual Assistant using ElevenLabs API and Python for seamless AI-powered interactions.
8-
header:
7+
published: false
8+
header: https://raw.githubusercontent.com/codedex-io/projects/main/projects/create-a-voice-virtual-assistant/header.gif
99
tags:
10-
- Intermediate
11-
- Python
10+
- intermediate
11+
- python
1212
- AI
1313
---
1414

15-
<BannerImage
16-
link=""
17-
description="Title Image"
18-
uid={true}
19-
cl="for-sidebar"
20-
/>
15+
<BannerImage link="https://raw.githubusercontent.com/codedex-io/projects/main/projects/create-a-voice-virtual-assistant/header.gif" description="Title Image" uid={true} cl="for-sidebar"/>
2116

22-
# Build a Conversational Pong Game in p5.js
17+
# Create a Voice Virtual Assistant
2318

2419
<AuthorAvatar
25-
author_name="Alexandre Sajus"
26-
author_avatar="https://i.imgur.com/fhkMdV4.jpeg"
27-
username=""
28-
uid={true}
20+
author_name="Alexandre Sajus"
21+
author_avatar="/media/Alexandre.jpg"
22+
username="AlexandreSajus"
23+
uid={true}
2924
/>
3025

31-
<BannerImage
32-
link=""
33-
description="Banner"
34-
uid={true}
35-
/>
26+
<BannerImage link="https://raw.githubusercontent.com/codedex-io/projects/main/projects/create-a-voice-virtual-assistant/header.gif" description="Title Image" uid={true}/>
3627

37-
**Prerequisites:** Python
28+
**Prerequisites:** Python fundamentals, API usage
3829
**Versions:** Python 3.11, python-dotenv 1.0.1, elevenlabs 1.54.0, elevenlabs[pyaudio]
39-
**Read Time:** 60 minutes
30+
**Read Time:** 60 minutes
4031

4132
## Introduction
4233

43-
Voice assistants like Siri, Google Assistant, and Alexa have revolutionized the way we interact with technology. In this tutorial, we’ll learn how to create a Voice Virtual Assistant using [ElevenLabs](https://elevenlabs.io/) API and Python. This assistant will be able to engage in natural conversations and provide helpful responses in real time.
34+
Hey!👋 I'm Alex! Since this is my first tutorial on Codédex, I wanted to revisit a very popular project I made in 2023: creating a voice virtual assistant using Python.
35+
36+
Back in 2023, I had just graduated with a Master of Science in Artificial Intelligence, and I had already made a few fun AI projects, like detecting vehicles in video games using object detection or teaching 3D characters to jump over obstacles using reinforcement learning. You can check them out on my [YouTube channel](https://www.youtube.com/@alexandresajus/videos).
37+
38+
<ImageZoom src="https://i.imgur.com/W5fyBFH.png" style={{width: "60%", height: "auto"}}/>
4439

45-
The final assistant will:
40+
Language AI was getting more and more popular: ChatGPT was released in 2022, and multiple companies were releasing APIs for speech transcription and voice synthesis.
4641

47-
- Process user voice and text input
48-
- Use ElevenLabs API for voice synthesis
49-
- Provide a seamless conversational experience
42+
Despite this, voice-based conversational AI was still in its early stages and voice assistants like the one in ChatGPT were just starting to be released in beta.
5043

51-
Here is a sneak peek of the final assistant in action:
44+
That's why I decided to create my own! Back then, APIs were limited in functionality so I had to build my own pipeline with one tool per task:
45+
- 🎤 PyAudio to record the user's voice
46+
- ⌨️ Deepgram to transcribe the voice to text
47+
- 🤖 OpenAI GPT-3 to generate a response
48+
- 📈 ElevenLabs to convert the response to speech
49+
- 🔊 Pygame to play the response
50+
- 💻 Taipy to display the conversation
51+
- 🤝 And a lot of Python code to glue everything together
5252

53-
<p align="center">
54-
<video width="1280" height="480" controls>
55-
<source src="media/demo.mp4" type="video/mp4" />
56-
</video>
57-
</p>
53+
<ImageZoom src="https://i.imgur.com/bWX2sx8.png" style={{width: "60%", height: "auto"}}/>
54+
55+
It was a lot of work but it was worth it! I released it on [GitHub](https://github.com/AlexandreSajus/JARVIS) and on [YouTube](https://github.com/AlexandreSajus/JARVIS) and it got 24k views, 485 stars and 88 forks! Here's me talking to my chatbot after spending 24 hours without sleep creating it:
56+
57+
<ImageZoom src="https://i.imgur.com/ecC0Tff.png" style={{width: "60%", height: "auto"}}/>
58+
59+
Now, in 2025, I'm excited to revisit this project since APIs have evolved a lot since then. What took 6 different libraries to work in 2023 can now be done with just one API. In this tutorial, we will use the ElevenLabs API to record our voice and play the assistant's response in real time:
60+
61+
<ImageZoom src="https://i.imgur.com/6iGvFsk.gif" style={{width: "60%", height: "auto"}}/>
5862

5963
Let's dive in!
6064

6165
## Setting Up the Environment
6266

63-
### 1. Install Required Packages
67+
### Install Required Packages
6468

6569
Before we start, make sure you have Python installed. Then, install the required dependencies:
6670

6771
```sh
6872
pip install elevenlabs "elevenlabs[pyaudio]" python-dotenv
6973
```
7074

71-
### 2. Setting up ElevenLabs
75+
Processing audio requires additional dependencies on Linux and macOS:
7276

73-
ElevenLabs provides a Conversational AI API that we will use to create our Voice Assistant. - The API records the user's voice through the microphone
74-
- It processes it to know when the user has finished speaking or is interrupting the assistant
75-
- It calls an LLM model to generate a response
76-
- It synthesizes the response into speech
77-
- It plays the synthesized speech through the speakers
77+
- For Linux (Debian/Ubuntu), you need to install `portaudio19-dev`:
78+
```sh
79+
sudo apt install portaudio19-dev
80+
```
81+
- For macOS, you need to install `portaudio`:
82+
```sh
83+
brew install portaudio
84+
```
85+
86+
### Setting up ElevenLabs
7887

79-
<p align="center">
80-
<img src="https://i.imgur.com/TZpJsMK.png" alt="ElevenLabs Function Diagram" width="80%"/>
81-
</p>
88+
ElevenLabs provides a Conversational AI API that we will use to create our Voice Assistant.
89+
- 🎤 The API records the user's voice through the microphone
90+
- 🖨️ It processes the audio to detect when the user has finished speaking or is interrupting the assistant
91+
- 🤖 It calls an LLM to generate a response
92+
- 📈 It synthesizes the response into speech
93+
- 🔊 It plays the synthesized speech through the speakers
94+
95+
<ImageZoom src="https://i.imgur.com/QZkz0Rh.png" style={{width: "60%", height: "auto"}}/>
8296

8397
1. Sign up at [ElevenLabs](https://elevenlabs.io/app/sign-up) and follow the instructions to create an account.
8498

8599
2. Once signed in, go to "Conversational AI"
86100

87-
<p align="center">
88-
<img src="https://i.imgur.com/bXHHn0E.png" alt="Conversational AI button on dashboard" width="80%"/>
89-
</p>
101+
<ImageZoom src="https://i.imgur.com/aIYfusq.png" style={{width: "60%", height: "auto"}}/>
90102

91103
3. Go to "Agents"
92104

93-
<p align="center">
94-
<img src="https://i.imgur.com/Hdzof0l.png" alt="Agents button on dashboard" width="80%"/>
95-
</p>
105+
<ImageZoom src="https://i.imgur.com/L9xwBgl.png" style={{width: "60%", height: "auto"}}/>
96106

97107
4. Click on "Start from blank"
98108

99-
<p align="center">
100-
<img src="https://i.imgur.com/uAMmwqB.png" alt="Start from blank button" width="80%"/>
101-
</p>
109+
<ImageZoom src="https://i.imgur.com/PD8v3Ax.png" style={{width: "60%", height: "auto"}}/>
102110

103111
5. Create a ".env" file at the root of your project folder. We will use this file to store our API credentials securely. This way they won't be hardcoded in the script. In this ".env" file, add your Agent ID:
104112

105-
<p align="center">
106-
<img src="https://i.imgur.com/WEfcWZK.png" alt="Agent ID" width="80%"/>
107-
</p>
113+
<ImageZoom src="https://i.imgur.com/vfmMv7r.png" style={{width: "60%", height: "auto"}}/>
108114

109115
```bash
110116
AGENT_ID=your_agent_id
111117
```
112118

113119
6. Go to the "Security" tab, enable the "First message" and "System prompt" overrides, and save. This will allow us to customize the assistant's first message and system prompt using Python code.
114120

115-
<p align="center">
116-
<img src="https://i.imgur.com/ugDFbs1.png" alt="Security tab" width="80%"/>
117-
</p>
121+
<ImageZoom src="https://i.imgur.com/0vfNTOd.png" style={{width: "60%", height: "auto"}}/>
118122

119123
7. Click on your profile and go to "API keys". Create a new API key and copy it to your ".env" file:
120124

121125
```bash
122126
API_KEY="sk_XXX...XXX"
123127
```
124128

125-
<p align="center">
126-
<img src="https://i.imgur.com/lfPxnL8.png" alt="API keys" width="80%"/>
127-
</p>
129+
**Make sure to save your ".env" file after adding the credentials.**
130+
131+
<ImageZoom src="https://i.imgur.com/Q5QrGVl.png" style={{width: "60%", height: "auto"}}/>
128132

129133
ElevenLabs is now set up and ready to be used in our Python script!
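
Before moving on, it can be worth checking that both credentials are actually visible to Python. The sketch below is not part of the tutorial's script: `require_env` is a hypothetical helper, and it only assumes that the two variable names used above (`AGENT_ID` and `API_KEY`) have been loaded into the environment, for example with `load_dotenv()`.

```python
import os


def require_env(names, env=None):
    """Return {name: value} for every name, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in names}


if __name__ == "__main__":
    # Hypothetical usage: call this right after load_dotenv() in your script.
    try:
        credentials = require_env(["AGENT_ID", "API_KEY"])
        print("Credentials loaded:", ", ".join(credentials))
    except RuntimeError as error:
        print(error)
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the API client.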
130134

131-
Note: ElevenLabs works with a credit system. When you sign up, you get 10,000 free credits which amount to 15 minutes of conversation. You can buy more credits if needed.
135+
**Note:** ElevenLabs works with a credit system. When you sign up, you get 10,000 free credits which amount to 15 minutes of conversation. You can buy more credits if needed.
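
To get a feel for that budget, here is a rough estimate. It assumes credits are consumed at a constant rate and uses only the two figures from the note above (10,000 credits for 15 minutes). Actual consumption may differ, so treat `minutes_left` as a hypothetical helper and not as ElevenLabs' billing formula.

```python
FREE_CREDITS = 10_000  # from the note above
FREE_MINUTES = 15      # from the note above


def minutes_left(credits):
    """Estimate conversation minutes for a credit balance (linear assumption)."""
    return credits * FREE_MINUTES / FREE_CREDITS


# Roughly how many credits one minute of conversation costs.
print(round(FREE_CREDITS / FREE_MINUTES))  # 667
print(minutes_left(4_000))  # 6.0
```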
132136

133137

134138
## Building the Voice Assistant
135139

136-
### 1. Load Environment Variables
140+
### Load Environment Variables
137141

138142
Create a Python file (e.g., `voice_assistant.py`) and load your API credentials:
139143

140-
```python
144+
```py
141145
import os
142146
from dotenv import load_dotenv
143147

@@ -153,29 +157,35 @@ We will set up the ElevenLabs client and configure a conversation instance.
153157

154158
We'll start by importing the necessary modules:
155159

156-
```python
160+
```py
157161
from elevenlabs.client import ElevenLabs
158162
from elevenlabs.conversational_ai.conversation import Conversation
159163
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface
160164
from elevenlabs.types import ConversationConfig
161165
```
162166

163-
We will then configure the conversation with the agent's first message and system prompt. We are going to inform the assistant that the user has a schedule and prompt it to help the user. In this part you can customize:
164-
- The user's name: what the assistant will call the user
165-
- The schedule: the user's schedule that the assistant will use to provide help
166-
- The prompt: the message that the assistant will receive when the conversation starts to understand the context of the conversation
167-
- The first message: the first message the assistant will say to the user
167+
We will then configure the conversation with the agent's first message and system prompt.
168+
169+
We are going to inform the assistant that the user has a schedule and prompt it to help the user. In this part you can customize:
170+
- The **user's name**: what the assistant will call the user
171+
- The **schedule**: the user's schedule that the assistant will use to provide help
172+
- The **prompt**: the message the assistant receives at the start of the conversation so that it understands the context
173+
- The **first message**: the first message the assistant will say to the user
168174

169-
```python
175+
**Prompts** are used to provide context to the assistant and help it understand the user's needs.
176+
177+
Here's my example:
178+
179+
```py
170180
user_name = "Alex"
171181
schedule = "Sales Meeting with Taipy at 10:00; Gym with Sophie at 17:00"
172182
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
173183
first_message = f"Hello {user_name}, how can I help you today?"
174184
```
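
If the schedule grows, writing it by hand as one string gets error-prone. As a sketch, the same prompt could be built from a list of events. `build_prompt` and the `(title, time)` tuples are assumptions made for illustration; they are not part of the tutorial's script.

```python
def build_prompt(events):
    """Join (title, time) pairs into the schedule string used in the prompt."""
    schedule = "; ".join(f"{title} at {time}" for title, time in events)
    return (
        "You are a helpful assistant. "
        f"Your interlocutor has the following schedule: {schedule}."
    )


events = [("Sales Meeting with Taipy", "10:00"), ("Gym with Sophie", "17:00")]
prompt = build_prompt(events)
print(prompt)
```

With the two events above, this produces the same prompt text as the hand-written version.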
175185

176-
We are then going to set this configuration to our ElevenLabs agent:
186+
Underneath, in the same file, we then apply this configuration to our ElevenLabs agent:
177187

178-
```python
188+
```py
179189
conversation_override = {
180190
    "agent": {
181191
        "prompt": {
@@ -191,7 +201,7 @@ config = ConversationConfig(
191201
    dynamic_variables={},
192202
)
193203

194-
client = ElevenLabs(api_key=API_KEY, timeout=15)
204+
client = ElevenLabs(api_key=API_KEY)
195205
conversation = Conversation(
196206
    client,
197207
    AGENT_ID,
@@ -203,9 +213,9 @@ conversation = Conversation(
203213

204214
### 3. Implement Callbacks for Responses
205215

206-
To improve user experience, define callback functions to handle assistant responses. These functions will print the assistant's responses and user transcripts. We also define a function to handle the situation where the user interrupts the assistant:
216+
We also want to print the assistant's responses and the user's transcripts, and to handle the case where the user interrupts the assistant. We can do so by implementing a few callback functions underneath our configuration.
207217

208-
```python
218+
```py
209219
def print_agent_response(response):
210220
    print(f"Agent: {response}")
211221

@@ -218,9 +228,9 @@ def print_user_transcript(transcript):
218228

219229
### 4. Start the Voice Assistant Session
220230

221-
Finally, initiate the conversation session:
231+
Finally, initiate the conversation session in the same file:
222232

223-
```python
233+
```py
224234
conversation = Conversation(
225235
    client,
226236
    AGENT_ID,
@@ -237,6 +247,8 @@ conversation.start_session()
237247

238248
## Running the Assistant
239249

250+
**Please make sure your audio devices are correctly set up in your system settings before running the code.**
251+
240252
Execute the script:
241253

242254
```bash
@@ -245,15 +257,20 @@ python voice_assistant.py
245257

246258
The assistant will start listening for input and responding in real time!
247259

260+
You can stop the assistant at any time by closing the terminal.
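
Closing the terminal works, but a cleaner exit is possible by catching Ctrl+C. The sketch below assumes the `Conversation` object exposes `wait_for_session_end()` and `end_session()` in addition to `start_session()`; only `start_session()` appears in this tutorial, so check the SDK documentation before relying on the other two. `run_until_interrupted` is a hypothetical helper, and `FakeConversation` is only a stand-in so the example runs without a microphone or API key.

```python
def run_until_interrupted(conversation):
    """Run a session and end it cleanly when the user presses Ctrl+C."""
    conversation.start_session()
    try:
        conversation.wait_for_session_end()
    except KeyboardInterrupt:
        conversation.end_session()
        return "interrupted"
    return "finished"


class FakeConversation:
    """Stand-in for the real Conversation (assumption, for demonstration only)."""

    def __init__(self):
        self.calls = []

    def start_session(self):
        self.calls.append("start")

    def wait_for_session_end(self):
        self.calls.append("wait")
        raise KeyboardInterrupt  # simulate the user pressing Ctrl+C

    def end_session(self):
        self.calls.append("end")


fake = FakeConversation()
print(run_until_interrupted(fake), fake.calls)
```

In the real script, you would pass the `conversation` object created earlier instead of `FakeConversation()`.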
261+
248262
## Conclusion
249263

250-
Congratulations! 🎉 You've successfully built a Voice Virtual Assistant using ElevenLabs API. You can extend its capabilities by integrating it with home automation, calendars, or other APIs to make it even more useful.
264+
Congratulations! 🎉
265+
266+
You've successfully built a Voice Virtual Assistant using the ElevenLabs API. You can extend its capabilities by integrating it with home automation, calendars, or other APIs to make it even more useful.
251267

252268
Stay creative and keep experimenting with AI-powered assistants!
253269

254270
## More Resources
255271

272+
- [Source Code](TO DO)
256273
- ElevenLabs Conversational AI [Overview](https://elevenlabs.io/docs/conversational-ai/overview)
257274
- ElevenLabs Python SDK [Documentation](https://elevenlabs.io/docs/conversational-ai/libraries/python)
258275
- Enable your assistant to execute Python functions with [Client Tools](https://elevenlabs.io/docs/conversational-ai/customization/tools-events/client-tools)
259-
- Provide documents as context to your assistant with [RAG](https://elevenlabs.io/docs/conversational-ai/customization/knowledge-base/rag)
276+
- Provide documents as context to your assistant with [RAG](https://elevenlabs.io/docs/conversational-ai/customization/knowledge-base/rag)