Voice assistants like Siri, Google Assistant, and Alexa have revolutionized the way we interact with technology. In this tutorial, we’ll learn how to create a Voice Virtual Assistant using [ElevenLabs](https://elevenlabs.io/) API and Python. This assistant will be able to engage in natural conversations and provide helpful responses in real time.
Hey!👋 I'm Alex! Since this is my first tutorial on Codédex, I wanted to revisit a very popular project I made in 2023: creating a voice virtual assistant using Python.
Back in 2023, I had just graduated with a Master's of Science in Artificial Intelligence and I had already made a few fun AI projects like detecting vehicles in video games using object detection or teaching 3D characters to jump over obstacles using reinforcement learning (You can check them out on my [YouTube channel](https://www.youtube.com/@alexandresajus/videos)).
Language-based AI was getting more and more popular: ChatGPT had been released in 2022, and multiple companies were releasing APIs for text transcription and voice synthesis.
Despite this, voice-based conversational AI was still in its early stages and voice assistants like the one in ChatGPT were just starting to be released in beta.
That's why I decided to create my own! Back then, APIs were limited in functionality so I had to build my own pipeline with one tool per task:
- 🎤 PyAudio to record the user's voice
- ⌨️ Deepgram to transcribe the voice to text
- 🤖 OpenAI GPT-3 to generate a response
- 📈 ElevenLabs to convert the response to speech
- 🔊 Pygame to play the response
- 💻 Taipy to display the conversation
- 🤝 And a lot of Python code to glue everything together
It was a lot of work, but it was worth it! I released it on [GitHub](https://github.com/AlexandreSajus/JARVIS) and on [YouTube](https://www.youtube.com/@alexandresajus/videos), and it got 24k views, 485 stars, and 88 forks! Here's me talking to my chatbot after spending 24 hours without sleep creating it:
Now, in 2025, I'm excited to revisit this project since APIs have evolved a lot since then. What took 6 different libraries to work in 2023 can now be done with just one API. In this tutorial, we will use the ElevenLabs API to record our voice and play the assistant's response in real time:
5. Create a ".env" file at the root of your project folder. We will use this file to store our API credentials securely. This way they won't be hardcoded in the script. In this ".env" file, add your Agent ID:
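For example, the `.env` file could look like the following (the exact variable names are up to you, as long as they match what the Python script reads later; the `API_KEY` entry is shown here too because the script will also need your ElevenLabs API key):

```shell
AGENT_ID=your_agent_id_here
API_KEY=your_elevenlabs_api_key_here
```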
6. Go to the "Security" tab, enable the "First message" and "System prompt" overrides, and save. This will allow us to customize the assistant's first message and system prompt using Python code.
ElevenLabs is now set up and ready to be used in our Python script!
**Note:** ElevenLabs works with a credit system. When you sign up, you get 10,000 free credits which amount to 15 minutes of conversation. You can buy more credits if needed.
## Building the Voice Assistant
### Load Environment Variables
Create a Python file (e.g., `voice_assistant.py`) and load your API credentials:
```py
import os
from dotenv import load_dotenv

# Load the credentials stored in the .env file
load_dotenv()
AGENT_ID = os.getenv("AGENT_ID")
API_KEY = os.getenv("API_KEY")
```
We will set up the ElevenLabs client and configure a conversation instance.
We'll start by importing the necessary modules:
```py
from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface
from elevenlabs.types import ConversationConfig
```
We will then configure the conversation with the agent's first message and system prompt.

We are going to inform the assistant that the user has a schedule and prompt it to help the user. In this part you can customize:

- The **user's name**: what the assistant will call the user
- The **schedule**: the user's schedule that the assistant will use to provide help
- The **prompt**: the message the assistant receives when the conversation starts, giving it the context of the conversation
- The **first message**: the first message the assistant will say to the user
**Prompts** are used to provide context to the assistant and help it understand the user's needs.
Here's my example:
```py
user_name = "Alex"
schedule = "Sales Meeting with Taipy at 10:00; Gym with Sophie at 17:00"
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
first_message = f"Hello {user_name}, how can I help you today?"
```
Underneath in the same file, we are then going to set this configuration to our ElevenLabs agent:

```py
conversation_override = {
    "agent": {
        "prompt": {
            "prompt": prompt,
        },
        "first_message": first_message,
    },
}

config = ConversationConfig(
    conversation_config_override=conversation_override,
    extra_body={},
    dynamic_variables={},
)

client = ElevenLabs(api_key=API_KEY)
conversation = Conversation(
    client,
    AGENT_ID,
    config=config,
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
)
```
### Implement Callbacks for Responses
We'll also need to handle assistant responses by printing the assistant's responses and user transcripts, as well as handling the situation where the user interrupts the assistant. We can do so by implementing a few callback functions underneath our configuration.
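Here is a minimal sketch of what these callback functions could look like. The function names are my own choice, not fixed by the SDK; each one simply prints what it receives:

```python
def print_agent_response(response):
    # Called whenever the assistant produces a response
    print(f"Agent: {response}")

def print_interrupted_response(original, corrected):
    # Called when the user interrupts the assistant; "corrected" is
    # the truncated version of what the assistant actually got to say
    print(f"Agent interrupted, truncated response: {corrected}")

def print_user_transcript(transcript):
    # Called when the user's speech has been transcribed to text
    print(f"User: {transcript}")
```

These get wired into the `Conversation` constructor through its `callback_agent_response`, `callback_agent_response_correction`, and `callback_user_transcript` parameters.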
Finally, initiate the conversation session in the same file:

```py
conversation = Conversation(
    client,
    AGENT_ID,
    config=config,
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
    # These are the callback functions from the previous step
    # (the names may differ in your script)
    callback_agent_response=print_agent_response,
    callback_agent_response_correction=print_interrupted_response,
    callback_user_transcript=print_user_transcript,
)

conversation.start_session()
```
## Running the Assistant
**Please make sure your audio devices are correctly set up in your system settings before running the code.**
Execute the script:
```bash
python voice_assistant.py
```
The assistant will start listening for input and responding in real time!
You can stop the assistant at any time by closing the terminal.
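If you'd rather stop the assistant gracefully than close the terminal, one option is to catch Ctrl+C and end the session cleanly. This is a sketch assuming the SDK's `Conversation.end_session()` method, wrapped in a small helper:

```python
import signal

def register_shutdown(conversation):
    # On Ctrl+C (SIGINT), ask the SDK to end the session gracefully
    # instead of killing the process mid-conversation.
    signal.signal(signal.SIGINT, lambda sig, frame: conversation.end_session())
```

Call `register_shutdown(conversation)` just before `conversation.start_session()`.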
## Conclusion
Congratulations! 🎉
You've successfully built a Voice Virtual Assistant using ElevenLabs API. You can extend its capabilities by integrating it with home automation, calendars, or other APIs to make it even more useful.
Stay creative and keep experimenting with AI-powered assistants!
## More Resources
- [Source Code](TO DO)
- ElevenLabs Conversational AI [Overview](https://elevenlabs.io/docs/conversational-ai/overview)
- Enable your assistant to execute Python functions with [Client Tools](https://elevenlabs.io/docs/conversational-ai/customization/tools-events/client-tools)
- Provide documents as context to your assistant with [RAG](https://elevenlabs.io/docs/conversational-ai/customization/knowledge-base/rag)