Commit 2e7678e

Voice Call System: Production STT/TTS with AI Participant Integration
Comprehensive voice call system with configurable Whisper models, automated downloads, VAD improvements, and AI participant integration. See PR description for full details.
1 parent 5a438bb commit 2e7678e

277 files changed: +41636 -822 lines changed

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -130,8 +130,10 @@ ai-venv/
 continuum.log
 continuum_restart.log

-# Generated documentation
+# Generated documentation (but keep docs/ AI files)
 *-AI-*.md
+!src/**/docs/*-AI-*.md
+!docs/*-AI-*.md
 generated-*/
 *.md.backup
 *.md.test-backup
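
The `.gitignore` change relies on negation: `*-AI-*.md` ignores generated files everywhere, and the two `!` patterns re-include the ones under `docs/`. The following is a minimal sketch of that "last matching pattern wins" behaviour, not git's actual matcher. `patternToRegExp` and `isIgnored` are hypothetical helpers that handle only `*`, `**/`, and a leading `!`. They ignore git's other rules, notably that a file cannot be re-included if one of its parent directories is excluded.

```typescript
// Simplified model of .gitignore matching (assumption: only "*", "**/" and "!" are supported).
function patternToRegExp(pattern: string): RegExp {
  // Patterns containing "/" are anchored to the repository root;
  // patterns without "/" match the file name at any depth.
  const anchored = pattern.includes("/");
  let body = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (pattern.startsWith("**/", i)) {
      body += "(?:.*/)?"; // "**/" spans zero or more directories
      i += 2;
    } else if (ch === "*") {
      body += "[^/]*"; // "*" never crosses a directory separator
    } else {
      body += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(anchored ? `^${body}$` : `(?:^|/)${body}$`);
}

// Walks the patterns in order; the last one that matches decides the outcome.
function isIgnored(path: string, patterns: string[]): boolean {
  let ignored = false;
  for (const raw of patterns) {
    const negated = raw.startsWith("!");
    const regex = patternToRegExp(negated ? raw.slice(1) : raw);
    if (regex.test(path)) {
      ignored = !negated;
    }
  }
  return ignored;
}

// The three patterns from the hunk above, in file order.
const rules = ["*-AI-*.md", "!src/**/docs/*-AI-*.md", "!docs/*-AI-*.md"];

console.log(isIgnored("notes/plan-AI-draft.md", rules));
console.log(isIgnored("docs/plan-AI-draft.md", rules));
console.log(isIgnored("src/debug/jtag/docs/VOICE-AI-NOTES.md", rules));
```

The example paths are made up for illustration. Order matters in this model: moving the `!` lines above `*-AI-*.md` would leave every matching file ignored.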

CLAUDE.md

Lines changed: 93 additions & 0 deletions
@@ -130,6 +130,99 @@ When you touch any code, improve it. Don't just add your feature and leave the m

 ---

+## 🧵 OFF-MAIN-THREAD PRINCIPLE (Non-Negotiable)
+
+**NEVER put CPU-intensive work on the main thread. No exceptions.**
+
+This has been standard practice for decades: **pthreads** (POSIX, 1995), then **Grand Central Dispatch (2009)**, then **Web Workers**. Every modern SDK does all heavy work off the main thread. This is not optional.
+
+### The Rule
+
+| Work Type | Where It Goes | NOT Main Thread |
+|-----------|---------------|-----------------|
+| Audio processing | `AudioWorklet` (Web) or Rust worker | ❌ ScriptProcessorNode |
+| Video processing | Web Worker with transferable buffers | ❌ Canvas on main thread |
+| AI inference | Rust worker via Unix socket | ❌ WASM on main thread |
+| Image processing | Rust worker or Web Worker | ❌ Direct manipulation |
+| File I/O | Rust worker | ❌ Synchronous reads |
+| Crypto | Web Crypto API (already off-thread) | ❌ JS crypto libs |
+| Search/indexing | Rust worker | ❌ JS array operations |
+
+### Browser: Use AudioWorklet and Web Workers
+
+```typescript
+// ✅ CORRECT - AudioWorklet runs on audio rendering thread
+const workletUrl = new URL('./audio-worklet-processor.js', import.meta.url).href;
+await audioContext.audioWorklet.addModule(workletUrl);
+const workletNode = new AudioWorkletNode(audioContext, 'microphone-processor');
+
+// ✅ CORRECT - Transfer buffers (zero-copy)
+workletNode.port.onmessage = (event) => {
+  // event.data is the Float32Array, transferred not copied
+  sendToServer(event.data);
+};
+
+// In the worklet processor:
+this.port.postMessage(frame, [frame.buffer]); // Transfer ownership
+
+// ❌ WRONG - ScriptProcessorNode (deprecated, runs on main thread)
+const scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
+scriptNode.onaudioprocess = (e) => { /* BLOCKS MAIN THREAD */ };
+```
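
The hunk above registers a `microphone-processor` worklet and posts a `frame` from it, but does not show how frames are assembled. The Web Audio API hands an `AudioWorkletProcessor` its input in 128-sample render quanta, so a processor like this presumably batches several quanta into one larger frame before posting. Here is a minimal sketch of that batching step, written as a plain class so it runs outside a browser. `FrameAccumulator` and the 960-sample frame size are assumptions, not taken from the commit.

```typescript
// Hypothetical helper: batches small audio blocks into fixed-size frames.
class FrameAccumulator {
  private readonly frameSize: number;
  private pending: Float32Array;
  private filled = 0;

  constructor(frameSize: number) {
    this.frameSize = frameSize;
    this.pending = new Float32Array(frameSize);
  }

  // Appends one block and returns every frame that the block completed.
  push(block: Float32Array): Float32Array[] {
    const frames: Float32Array[] = [];
    let offset = 0;
    while (offset < block.length) {
      const take = Math.min(this.frameSize - this.filled, block.length - offset);
      this.pending.set(block.subarray(offset, offset + take), this.filled);
      this.filled += take;
      offset += take;
      if (this.filled === this.frameSize) {
        frames.push(this.pending);
        // Allocate a fresh buffer so the finished frame can be transferred
        // without detaching memory that is still being written to.
        this.pending = new Float32Array(this.frameSize);
        this.filled = 0;
      }
    }
    return frames;
  }
}

// Assumed frame size: 960 samples (20 ms at 48 kHz).
const accumulator = new FrameAccumulator(960);
let completedFrames = 0;
for (let i = 0; i < 15; i++) {
  const quantum = new Float32Array(128).fill(i); // stands in for one render quantum
  completedFrames += accumulator.push(quantum).length;
}
console.log(completedFrames); // 15 * 128 = 1920 samples, i.e. 2 full frames
```

Inside a real processor, each returned frame could be sent with `this.port.postMessage(frame, [frame.buffer])`, as the hunk shows. Because every frame owns its own `ArrayBuffer`, the transfer leaves the accumulator's working buffer intact.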
+
+### Server: Use Rust Workers
+
+```typescript
+// ✅ CORRECT - Heavy compute in Rust via Unix socket
+const result = await Commands.execute('ai/embedding/generate', { text });
+// Rust worker does the work, main thread stays responsive
+
+// ❌ WRONG - Heavy compute in Node.js main thread
+const embedding = computeEmbedding(text); // BLOCKS EVENT LOOP
+```
+
+### Transferable Objects (Zero-Copy)
+
+Audio and video buffers can be **transferred** between threads without copying:
+
+```typescript
+// ✅ CORRECT - Transfer the ArrayBuffer (zero-copy)
+worker.postMessage(audioBuffer, [audioBuffer.buffer]);
+
+// ❌ WRONG - Copy the data (slow, wastes memory)
+worker.postMessage(audioBuffer); // Copies entire buffer
+```
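
The difference between copying and transferring can be observed without a worker. `structuredClone` applies the same structured-clone algorithm and transfer list that `postMessage` uses. A small sketch, assuming a runtime that provides `structuredClone` (current browsers, Node.js 17+); the sample values are arbitrary:

```typescript
// Back the samples with an explicit ArrayBuffer so it can be listed for transfer.
const backing = new ArrayBuffer(4 * Float32Array.BYTES_PER_ELEMENT);
const samples = new Float32Array(backing);
samples.set([0.5, -0.5, 0.25, 1]);

// Copy: the receiver gets a duplicate and the source keeps its data.
const copied = structuredClone(samples);
const lengthAfterCopy = samples.length;

// Transfer: ownership of the underlying ArrayBuffer moves to the receiver,
// and the source buffer is detached.
const moved = structuredClone(samples, { transfer: [backing] });
const lengthAfterTransfer = samples.length;

console.log(lengthAfterCopy, copied.length); // source and copy both hold 4 samples
console.log(lengthAfterTransfer, moved.length); // source is now empty, receiver holds 4
```

The detached source is the practical consequence to plan for: after a transfer, the sender must not touch the buffer again, which is why a producer typically allocates a fresh buffer for the next frame.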
+
+### Why This Matters
+
+- **60fps requires <16ms per frame** - ANY blocking kills animations
+- **Audio glitches at 48kHz** - Processing must complete in <20ms
+- **User perceives lag at 100ms** - Main thread blocking = bad UX
+- **The whole system locks up** - One blocking operation cascades
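
The budgets in this list follow from simple arithmetic, sketched below. The 128-sample figure is the Web Audio render quantum. The 960-sample frame is an assumption about where the "<20ms" figure comes from, since the commit does not state its frame size.

```typescript
// Time available to produce one rendered frame, in milliseconds.
function frameBudgetMs(framesPerSecond: number): number {
  return 1000 / framesPerSecond;
}

// Duration of an audio block of `sampleCount` samples, in milliseconds.
// Processing a block must finish within its own duration to avoid glitches.
function audioBlockMs(sampleCount: number, sampleRateHz: number): number {
  return (sampleCount * 1000) / sampleRateHz;
}

const uiBudget = frameBudgetMs(60); // roughly 16.67 ms per frame at 60fps
const quantumBudget = audioBlockMs(128, 48000); // roughly 2.67 ms per render quantum
const frameBudget = audioBlockMs(960, 48000); // 20 ms for the assumed 960-sample frame

console.log(uiBudget.toFixed(2), quantumBudget.toFixed(2), frameBudget.toFixed(2));
```

The audio thread's own deadline is considerably tighter than 20 ms, which is one reason to keep the worklet's `process()` callback minimal and push heavier work to a separate worker.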
+
+### Detection: Main Thread Violations
+
+Chrome DevTools shows these warnings:
+```
+[Violation] 'requestIdleCallback' handler took 345ms
+[Violation] 'click' handler took 349ms
+[Violation] Added non-passive event listener to a scroll-blocking event
+```
+
+**If you see these, something is wrong with the architecture.**
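
Slow handlers can also be caught in code rather than by reading the console. The following is a minimal sketch of a timing wrapper. `withBudget` is a hypothetical helper, the budget is whatever the caller chooses, and the message only imitates the DevTools wording. In Chromium-based browsers, a `PerformanceObserver` watching `longtask` entries is a native alternative.

```typescript
// Hypothetical helper: wraps a handler and reports when it exceeds its time budget.
function withBudget<Args extends unknown[]>(
  label: string,
  budgetMs: number,
  handler: (...args: Args) => void,
  report: (message: string) => void = console.warn,
): (...args: Args) => void {
  return (...args: Args) => {
    const start = performance.now();
    try {
      handler(...args);
    } finally {
      // Measure even if the handler throws, so failures are still timed.
      const elapsed = performance.now() - start;
      if (elapsed > budgetMs) {
        report(`[Violation] '${label}' handler took ${Math.round(elapsed)}ms`);
      }
    }
  };
}

// Usage: a deliberately slow handler that busy-waits for about 30 ms.
const reported: string[] = [];
const slowClick = withBudget(
  "click",
  10,
  () => {
    const end = performance.now() + 30;
    while (performance.now() < end) {
      // simulate CPU-bound work on the calling thread
    }
  },
  (message) => reported.push(message),
);
slowClick();
console.log(reported.length); // one report, because 30 ms exceeds the 10 ms budget
```

A wrapper like this only detects the problem. Per the rule above, the fix is to move the work off the main thread, not to raise the budget.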
+
+### The History (Why This Is Non-Negotiable)
+
+- **1995**: pthreads standardized in POSIX.1c - portable threading for C/C++
+- **2009**: Grand Central Dispatch (GCD) - Apple's answer to multicore
+- **2013**: Web Workers standardized for browser background tasks
+- **2017**: AudioWorklet replaced ScriptProcessorNode (deprecated)
+- **Today**: EVERY professional SDK does heavy work off main thread
+
+**You cannot code like it's 2005.** Modern systems require concurrent architecture.
+
+---
+
 ## 🔌 POLYMORPHISM PATTERN (OpenCV-style)

 **Why polymorphism over templates/generics for compute-heavy work:**

README.md

Lines changed: 8 additions & 0 deletions
@@ -60,6 +60,14 @@ This project is in **active pre-alpha development** and is **NOT ready for gener
 <p align="center"><em>Theme — Cyberpunk aesthetic customization</em></p>
 </td>
 </tr>
+<tr>
+<td width="50%">
+<img src="src/debug/jtag/docs/screenshots/livewidget-voice-call.png" alt="Voice Calls"/>
+<p align="center"><em>Live — Voice calls with AI personas and live transcription</em></p>
+</td>
+<td width="50%">
+</td>
+</tr>
 </table>

 ---