---
title: Create a Voice Virtual Assistant
author: Alexandre Sajus
uid:
datePublished:
published: false
description: Learn how to build a Voice Virtual Assistant using the ElevenLabs API and Python for seamless AI-powered interactions.
header:
tags:
  - Intermediate
  - Python
  - AI
---

<BannerImage
  link=""
  description="Title Image"
  uid={true}
  cl="for-sidebar"
/>

# Create a Voice Virtual Assistant

<AuthorAvatar
  author_name="Alexandre Sajus"
  author_avatar="https://i.imgur.com/fhkMdV4.jpeg"
  username=""
  uid={true}
/>

<BannerImage
  link=""
  description="Banner"
  uid={true}
/>

**Prerequisites:** Python
**Versions:** Python 3.11, python-dotenv 1.0.1, elevenlabs 1.54.0, elevenlabs[pyaudio]
**Read Time:** 60 minutes

## Introduction

Voice assistants like Siri, Google Assistant, and Alexa have revolutionized the way we interact with technology. In this tutorial, we'll learn how to create a Voice Virtual Assistant using the [ElevenLabs](https://elevenlabs.io/) API and Python. This assistant will be able to engage in natural conversations and provide helpful responses in real time.

The final assistant will:

- Process user voice and text input
- Use the ElevenLabs API for voice synthesis
- Provide a seamless conversational experience

Here is a sneak peek of the final assistant in action:

<iframe
  width="1280"
  align="center"
  height="480"
  src="https://i.imgur.com/83XuIPB.mp4"
  title=""
  allowFullScreen
></iframe>

Let's dive in!
## Setting Up the Environment

### 1. Install Required Packages

Before we start, make sure you have Python installed. Then, install the required dependencies (the extras specifier is quoted so it also works in shells like zsh):

```sh
pip install "elevenlabs[pyaudio]" python-dotenv
```

### 2. Setting up ElevenLabs

ElevenLabs provides a Conversational AI API that we will use to create our Voice Assistant:

- The API records the user's voice through the microphone
- It processes the audio to detect when the user has finished speaking or is interrupting the assistant
- It calls an LLM to generate a response
- It synthesizes the response into speech
- It plays the synthesized speech through the speakers

<p align="center">
  <img src="https://i.imgur.com/TZpJsMK.png" alt="ElevenLabs Function Diagram" width="80%"/>
</p>

1. Sign up at [ElevenLabs](https://elevenlabs.io/app/sign-up) and follow the instructions to create an account.

2. Once signed in, go to "Conversational AI".

<p align="center">
  <img src="https://i.imgur.com/bXHHn0E.png" alt="Conversational AI button on dashboard" width="80%"/>
</p>

3. Go to "Agents".

<p align="center">
  <img src="https://i.imgur.com/Hdzof0l.png" alt="Agents button on dashboard" width="80%"/>
</p>

4. Click on "Start from blank".

<p align="center">
  <img src="https://i.imgur.com/uAMmwqB.png" alt="Start from blank button" width="80%"/>
</p>

5. Create a ".env" file at the root of your project folder. We will use this file to store our API credentials securely, so they aren't hardcoded in the script. In this ".env" file, add your Agent ID:

<p align="center">
  <img src="https://i.imgur.com/WEfcWZK.png" alt="Agent ID" width="80%"/>
</p>

```bash
AGENT_ID=your_agent_id
```

6. Go to the "Security" tab, enable the "First message" and "System prompt" overrides, and save. This will allow us to customize the assistant's first message and system prompt from Python code.

<p align="center">
  <img src="https://i.imgur.com/ugDFbs1.png" alt="Security tab" width="80%"/>
</p>

7. Click on your profile and go to "API keys". Create a new API key and copy it to your ".env" file:

```bash
API_KEY="sk_XXX...XXX"
```

<p align="center">
  <img src="https://i.imgur.com/lfPxnL8.png" alt="API keys" width="80%"/>
</p>

ElevenLabs is now set up and ready to be used in our Python script!

Note: ElevenLabs works with a credit system. When you sign up, you get 10,000 free credits, which amount to 15 minutes of conversation. You can buy more credits if needed.


## Building the Voice Assistant

### 1. Load Environment Variables

Create a Python file (e.g., `voice_assistant.py`) and load your API credentials:

```python
import os
from dotenv import load_dotenv

load_dotenv()

AGENT_ID = os.getenv("AGENT_ID")
API_KEY = os.getenv("API_KEY")
```
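
If either variable is missing, the script will fail later with a less obvious error. As an optional safeguard (plain Python, not part of the ElevenLabs SDK), you can fail fast with a clear message:

```python
import os

def require_env(name: str) -> str:
    """Return an environment variable's value, or raise a clear error if unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# For example, after load_dotenv():
# AGENT_ID = require_env("AGENT_ID")
# API_KEY = require_env("API_KEY")
```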

### 2. Configure ElevenLabs Conversation API

We will set up the ElevenLabs client and configure a conversation instance.

We'll start by importing the necessary modules:

```python
from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface
from elevenlabs.types import ConversationConfig
```

We will then configure the conversation with the agent's first message and system prompt. We are going to tell the assistant that the user has a schedule and prompt it to help the user. In this part you can customize:

- The user's name: what the assistant will call the user
- The schedule: the user's schedule that the assistant will use to provide help
- The prompt: the message the assistant receives when the conversation starts, giving it the context of the conversation
- The first message: the first message the assistant will say to the user

```python
user_name = "Alex"
schedule = "Sales Meeting with Taipy at 10:00; Gym with Sophie at 17:00"
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
first_message = f"Hello {user_name}, how can I help you today?"
```
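
The schedule here is a single hardcoded string. If you prefer to keep events as structured data, one option (a sketch, not something the API requires) is to build the string from a list of pairs:

```python
# Each event is a (description, time) pair; the entries below are illustrative.
events = [
    ("Sales Meeting with Taipy", "10:00"),
    ("Gym with Sophie", "17:00"),
]

# Join the events into the same "X at HH:MM; Y at HH:MM" format used above.
schedule = "; ".join(f"{name} at {time}" for name, time in events)
prompt = f"You are a helpful assistant. Your interlocutor has the following schedule: {schedule}."
```

This makes it easy to load the schedule from a file or a calendar API later without touching the prompt template.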

We will then apply this configuration to our ElevenLabs agent:

```python
conversation_override = {
    "agent": {
        "prompt": {
            "prompt": prompt,
        },
        "first_message": first_message,
    },
}

config = ConversationConfig(
    conversation_config_override=conversation_override,
    extra_body={},
    dynamic_variables={},
)

client = ElevenLabs(api_key=API_KEY, timeout=15)
conversation = Conversation(
    client,
    AGENT_ID,
    config=config,
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
)
```

### 3. Implement Callbacks for Responses

To improve the user experience, define callback functions that print the assistant's responses and the user's transcripts. We also define a function to handle the case where the user interrupts the assistant:

```python
def print_agent_response(response):
    print(f"Agent: {response}")

def print_interrupted_response(original, corrected):
    print(f"Agent interrupted, truncated response: {corrected}")

def print_user_transcript(transcript):
    print(f"User: {transcript}")
```
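
The callbacks above only print to the console. Since they are ordinary Python functions, you could also persist the exchange, for example by appending each line to a transcript file (a sketch; the file path and format are arbitrary choices, not part of the SDK):

```python
from datetime import datetime

def make_transcript_logger(path):
    """Build agent/user callbacks that print and also append timestamped lines to a file."""
    def log(speaker, text):
        with open(path, "a", encoding="utf-8") as f:
            f.write(f"[{datetime.now():%H:%M:%S}] {speaker}: {text}\n")

    def on_agent_response(response):
        print(f"Agent: {response}")
        log("Agent", response)

    def on_user_transcript(transcript):
        print(f"User: {transcript}")
        log("User", transcript)

    return on_agent_response, on_user_transcript
```

You would then pass `on_agent_response` and `on_user_transcript` to the `Conversation` in place of the print-only callbacks.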

### 4. Start the Voice Assistant Session

Finally, recreate the conversation with the callbacks attached and start the session:

```python
conversation = Conversation(
    client,
    AGENT_ID,
    config=config,
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
    callback_agent_response=print_agent_response,
    callback_agent_response_correction=print_interrupted_response,
    callback_user_transcript=print_user_transcript,
)

conversation.start_session()
```
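
By default, pressing Ctrl+C kills the process abruptly, possibly mid-audio. A common pattern is to register a `SIGINT` handler that ends the session cleanly instead. The snippet below demonstrates the idea with a stand-in object; with the real assistant you would register `conversation.end_session` (available on the SDK's `Conversation` object) before calling `start_session()`:

```python
import signal

class FakeConversation:
    """Stand-in for the tutorial's Conversation object, for illustration only."""
    def __init__(self):
        self.ended = False

    def end_session(self):
        self.ended = True

conversation = FakeConversation()

# On Ctrl+C, request a clean session shutdown instead of killing the process.
signal.signal(signal.SIGINT, lambda sig, frame: conversation.end_session())
```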

## Running the Assistant

Execute the script:

```bash
python voice_assistant.py
```

The assistant will start listening for input and responding in real time!

## Conclusion

Congratulations! 🎉 You've successfully built a Voice Virtual Assistant using the ElevenLabs API. You can extend its capabilities by integrating it with home automation, calendars, or other APIs to make it even more useful.

Stay creative and keep experimenting with AI-powered assistants!

## More Resources

- ElevenLabs Conversational AI [Overview](https://elevenlabs.io/docs/conversational-ai/overview)
- ElevenLabs Python SDK [Documentation](https://elevenlabs.io/docs/conversational-ai/libraries/python)
- Enable your assistant to execute Python functions with [Client Tools](https://elevenlabs.io/docs/conversational-ai/customization/tools-events/client-tools)
- Provide documents as context to your assistant with [RAG](https://elevenlabs.io/docs/conversational-ai/customization/knowledge-base/rag)