# Sesame AI Python Client

An unofficial Python client library for interacting with the [Sesame](https://www.sesame.com) voice conversation API. This package provides easy access to Sesame's voice-based AI characters, allowing developers to create applications with natural voice conversations.

## About Sesame

Sesame is developing conversational AI with "voice presence" - the quality that makes spoken interactions feel real, understood, and valued. Their technology enables voice conversations with AI characters like Miles and Maya that feature emotional intelligence, natural conversational dynamics, and contextual awareness.

## Support

If you find this project helpful, consider buying me a coffee!

[Buy Me A Coffee](https://buymeacoffee.com/ijub)

## Installation

```bash
# From GitHub
pip install git+https://github.com/ijub/sesame_ai.git

# For development
git clone https://github.com/ijub/sesame_ai.git
cd sesame_ai
pip install -e .
```

## Features

- Authentication and account management
- WebSocket-based real-time voice conversations
- Token management and refresh
- Support for multiple AI characters (Miles, Maya)
- Voice activity detection
- Simple and intuitive API

## Available Characters

The API supports multiple AI characters:

- **Miles**: A male character (default)
- **Maya**: A female character

## Quick Start

### Authentication

```python
from sesame_ai import SesameAI, TokenManager

# Create API client
client = SesameAI()

# Create an anonymous account
signup_response = client.create_anonymous_account()
print(f"ID Token: {signup_response.id_token}")

# Look up account information
lookup_response = client.get_account_info(signup_response.id_token)
print(f"User ID: {lookup_response.local_id}")

# For easier token management, use TokenManager
token_manager = TokenManager(client, token_file="token.json")
id_token = token_manager.get_valid_token()
```

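The README does not describe how `TokenManager` stores or refreshes tokens internally. As a rough illustration of the general pattern behind a call like `get_valid_token()`, the sketch below caches a token in a JSON file and fetches a new one once the cached token is close to expiry. Everything in it (`FileTokenCache`, the `fetch` callable, the JSON field names, the 60-second margin) is hypothetical and not part of `sesame_ai`.

```python
import json
import time
from pathlib import Path
from typing import Callable, Tuple


class FileTokenCache:
    """Hypothetical file-backed token cache (not the sesame_ai implementation)."""

    def __init__(self, path: str, fetch: Callable[[], Tuple[str, float]],
                 margin: float = 60.0):
        self.path = Path(path)
        self.fetch = fetch      # returns (token, lifetime_in_seconds)
        self.margin = margin    # renew this many seconds before expiry

    def get_valid_token(self, force_new: bool = False) -> str:
        if not force_new and self.path.exists():
            saved = json.loads(self.path.read_text())
            if saved["expires_at"] - self.margin > time.time():
                return saved["token"]  # cached token is still fresh
        token, lifetime = self.fetch()
        record = {"token": token, "expires_at": time.time() + lifetime}
        self.path.write_text(json.dumps(record))
        return token

    def clear_tokens(self) -> None:
        # Remove the cache file; a missing file is not an error.
        self.path.unlink(missing_ok=True)
```

In a real application, `fetch` would presumably wrap calls such as `create_anonymous_account()` or `refresh_authentication_token()`; how the library combines them is not stated here.
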
### Voice Chat Example

```python
from sesame_ai import SesameAI, SesameWebSocket, TokenManager
import pyaudio
import threading
import time

# Get authentication token using TokenManager
api_client = SesameAI()
token_manager = TokenManager(api_client, token_file="token.json")
id_token = token_manager.get_valid_token()

# Connect to WebSocket (choose character: "Miles" or "Maya")
ws = SesameWebSocket(id_token=id_token, character="Maya")

# Set up connection callbacks
def on_connect():
    print("Connected to SesameAI!")

def on_disconnect():
    print("Disconnected from SesameAI")

ws.set_connect_callback(on_connect)
ws.set_disconnect_callback(on_disconnect)

# Connect to the server
ws.connect()

# Audio settings
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000

# Initialize PyAudio
p = pyaudio.PyAudio()

# Open microphone stream
mic_stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

# Open speaker stream (using server's sample rate)
speaker_stream = p.open(format=FORMAT,
                        channels=CHANNELS,
                        rate=ws.server_sample_rate,
                        output=True)

# Function to capture and send microphone audio
def capture_microphone():
    print("Microphone capture started...")
    try:
        while True:
            if ws.is_connected():
                data = mic_stream.read(CHUNK, exception_on_overflow=False)
                ws.send_audio_data(data)
            else:
                time.sleep(0.1)
    except KeyboardInterrupt:
        print("Microphone capture stopped")

# Function to play received audio
def play_audio():
    print("Audio playback started...")
    try:
        while True:
            audio_chunk = ws.get_next_audio_chunk(timeout=0.01)
            if audio_chunk:
                speaker_stream.write(audio_chunk)
    except KeyboardInterrupt:
        print("Audio playback stopped")

# Start audio threads
mic_thread = threading.Thread(target=capture_microphone)
mic_thread.daemon = True
mic_thread.start()

playback_thread = threading.Thread(target=play_audio)
playback_thread.daemon = True
playback_thread.start()

# Keep the main thread alive
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("Disconnecting...")
    ws.disconnect()
    mic_stream.stop_stream()
    mic_stream.close()
    speaker_stream.stop_stream()
    speaker_stream.close()
    p.terminate()
```

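In CPython, `KeyboardInterrupt` is raised in the main thread only, so a worker loop needs some other signal if it should stop cleanly before the streams are closed. One common option is a shared `threading.Event`. The sketch below shows the idea for the playback loop; `FakeSocket` is a hypothetical stand-in for `SesameWebSocket` so the code runs without a network connection or audio hardware, and it assumes `get_next_audio_chunk()` returns `None` on timeout.

```python
import queue
import threading


class FakeSocket:
    """Stand-in for SesameWebSocket so the sketch runs offline (hypothetical)."""

    def __init__(self, chunks):
        self._queue = queue.Queue()
        for chunk in chunks:
            self._queue.put(chunk)

    def get_next_audio_chunk(self, timeout=None):
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None


def play_audio(ws, write, stop_event):
    """Forward received chunks to `write` until `stop_event` is set."""
    while not stop_event.is_set():
        chunk = ws.get_next_audio_chunk(timeout=0.01)
        if chunk:
            write(chunk)


stop_event = threading.Event()
played = []
ws = FakeSocket([b"\x00\x01", b"\x02\x03"])
worker = threading.Thread(target=play_audio, args=(ws, played.append, stop_event))
worker.start()

# In a real script this would happen in the KeyboardInterrupt handler.
while len(played) < 2:
    stop_event.wait(0.01)
stop_event.set()
worker.join(timeout=1.0)
```

With this pattern the main thread can set the event, join the workers, and only then close the PyAudio streams, which avoids writing to a stream that has already been closed.
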
The package also includes a full-featured voice chat example that you can run:

```bash
# Chat with Miles (default)
python examples/voice_chat.py

# Chat with Maya
python examples/voice_chat.py --character Maya
```

Command-line options:
- `--character`: Character to chat with (default: Miles, options: Miles, Maya)
- `--input-device`: Input device index
- `--output-device`: Output device index
- `--list-devices`: List audio devices and exit
- `--token-file`: Path to token storage file
- `--debug`: Enable debug logging

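The options above map naturally onto `argparse`. The parser below is only a guess at how such a command line could be declared; the real `examples/voice_chat.py` may differ, and the argument types and the `None` defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Voice chat with a Sesame character")
    parser.add_argument("--character", choices=["Miles", "Maya"], default="Miles",
                        help="Character to chat with")
    parser.add_argument("--input-device", type=int, default=None,
                        help="Input device index")
    parser.add_argument("--output-device", type=int, default=None,
                        help="Output device index")
    parser.add_argument("--list-devices", action="store_true",
                        help="List audio devices and exit")
    parser.add_argument("--token-file", default=None,
                        help="Path to token storage file")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--character", "Maya", "--input-device", "2"])
print(args.character, args.input_device, args.debug)
```
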
## API Reference

### SesameAI

The main API client for authentication.

- `SesameAI(api_key=None)` - Create a new API client
- `create_anonymous_account()` - Create an anonymous account
- `get_account_info(id_token)` - Look up account information
- `refresh_authentication_token(refresh_token)` - Refresh an ID token

### TokenManager

Manages authentication tokens with automatic refresh and persistence.

- `TokenManager(api_client=None, token_file=None)` - Create a token manager
- `get_valid_token(force_new=False)` - Get a valid token, refreshing if needed
- `clear_tokens()` - Clear stored tokens

### SesameWebSocket

WebSocket client for real-time voice conversation.

- `SesameWebSocket(id_token, character="Miles", client_name="RP-Web")` - Create a new WebSocket client
- `connect(blocking=True)` - Connect to the server
- `send_audio_data(raw_audio_bytes)` - Send raw audio data
- `get_next_audio_chunk(timeout=None)` - Get the next audio chunk
- `disconnect()` - Disconnect from the server
- `is_connected()` - Check if connected

## Error Handling

The library provides several exception classes for error handling:

- `SesameAIError` - Base exception class
- `InvalidTokenError` - Invalid token errors
- `APIError` - API errors with code and message
- `NetworkError` - Network communication errors

Example:

```python
from sesame_ai import SesameAI, InvalidTokenError, APIError, NetworkError

client = SesameAI()

try:
    # Try to use an invalid token
    client.get_account_info("invalid_token")
except InvalidTokenError:
    print("The token is invalid or expired")
except APIError as e:
    print(f"API error: {e.code} - {e.message}")
except NetworkError as e:
    print(f"Network error: {e}")
```

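A network failure is often transient, while an invalid token is not, so it can make sense to retry only on `NetworkError`. The helper below is a minimal sketch of retrying with exponential backoff. `with_retries` is a hypothetical name, and the two exception classes are local stand-ins so the snippet runs without the package installed; it assumes `NetworkError` derives from `SesameAIError`. In real code you would import the exceptions from `sesame_ai` instead.

```python
import time


class SesameAIError(Exception):
    """Local stand-in for sesame_ai.SesameAIError."""


class NetworkError(SesameAIError):
    """Local stand-in for sesame_ai.NetworkError."""


def with_retries(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `call`, retrying on NetworkError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except NetworkError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
```

A call such as `with_retries(lambda: client.get_account_info(id_token))` would then retry up to three times before re-raising the last `NetworkError`. Other exceptions propagate immediately.
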
## Troubleshooting

### Audio Device Problems

If you encounter audio device issues:

1. Use `--list-devices` to see available audio devices
2. Specify input/output devices with `--input-device` and `--output-device`
3. Ensure PyAudio is properly installed with all dependencies

### Audio Feedback Issues

The voice chat example currently does not stop the AI's audio, played through your speakers, from being picked up by your microphone, which can cause feedback loops. For the best experience:

1. Use headphones to prevent the AI from hearing itself
2. Keep speaker volume at a moderate level
3. Position your microphone away from speakers if not using headphones

**Note:** I'm working on adding echo cancellation and audio filtering to the `voice_chat.py` example to address this issue.

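Without headphones, one crude workaround is a half-duplex gate: skip microphone chunks while the AI is speaking, and skip chunks that are close to silent. This is not echo cancellation, only a simple mitigation. The function names and the threshold of 500 are hypothetical, and the sketch assumes 16-bit mono PCM in native byte order, as produced by `pyaudio.paInt16`.

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square level of 16-bit PCM audio in native byte order."""
    samples = array.array("h", chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_send(chunk: bytes, ai_is_speaking: bool, threshold: float = 500.0) -> bool:
    """Drop mic audio while the AI is speaking, and drop near-silent chunks."""
    if ai_is_speaking:
        return False
    return rms(chunk) >= threshold
```

The trade-off is that muting the microphone during playback also prevents you from interrupting the character mid-sentence, so headphones remain the better option.
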
### Connection Issues

If you have trouble connecting:

1. Check your internet connection
2. Verify your authentication token is valid
3. Ensure the SesameAI service is available

## Legal Disclaimer

This is an unofficial API wrapper and is not affiliated with, maintained, authorized, endorsed, or sponsored by Sesame or any of its affiliates. This wrapper is intended for personal, educational, and non-commercial use only.

Users of this library assume all legal responsibility for its use. The author(s) are not responsible for any violations of Sesame Terms of Service or applicable laws.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.