Skip to content

feat: Replace custom voice pipeline with ElevenLabs Conversational AI#6

Merged
amethystani merged 1 commit intomainfrom
claude/vigilant-austin
Mar 9, 2026
Merged

feat: Replace custom voice pipeline with ElevenLabs Conversational AI#6
amethystani merged 1 commit intomainfrom
claude/vigilant-austin

Conversation

@amethystani
Copy link
Contributor

Summary

  • Eliminates 2-6 second voice agent latency by replacing the custom 3-hop pipeline (Browser Speech Recognition → Groq LLM → TTS Service) with ElevenLabs Conversational AI
  • Uses @elevenlabs/react SDK's useConversation hook which handles STT + LLM + TTS in a single real-time WebSocket connection with sub-second response times
  • Removes all custom speech recognition, AI service routing, and multi-provider TTS logic from the call interface — zero custom code, just the real ElevenLabs endpoint

What changed

  • Installed @elevenlabs/react (v0.14.1), replacing the deprecated @11labs/client (v0.0.4)
  • Rewrote UserPhoneInterface.tsx to use useConversation hook instead of the manual STT → LLM → TTS pipeline
  • Kept the existing UI (call controls, visual feedback, transcript captions, avatar animations)
  • Kept the signed URL Netlify function (/api/elevenlabs-signed-url) for secure API key handling

Before vs After

Before After
Architecture Browser Speech API → Groq LLM → TTS (Groq/ElevenLabs/DeAPI) ElevenLabs WebSocket (all-in-one)
Latency 2-6 seconds Sub-second
Custom code ~500 lines of speech recognition, AI routing, TTS orchestration ~50 lines wiring the SDK hook to UI

Test plan

  • Navigate to /user/call and start a voice call
  • Verify microphone permission prompt appears
  • Speak and confirm sub-second AI responses
  • Verify transcripts appear for both user and agent
  • Test mute/unmute speaker toggle
  • Test end call button
  • Test back button navigation
  • Verify no console errors during conversation

🤖 Generated with Claude Code

…nal AI

Eliminates 2-6 second voice agent latency by switching from a custom
3-hop pipeline (Browser Speech Recognition -> Groq LLM -> TTS Service)
to ElevenLabs Conversational AI which handles STT + LLM + TTS in a
single real-time WebSocket connection with sub-second response times.

- Install @elevenlabs/react SDK (replaces deprecated @11labs/client)
- Rewrite UserPhoneInterface to use useConversation hook
- Remove custom speech recognition, AI service, and TTS service calls
- Keep existing UI (call controls, visual feedback, transcript display)
- Signed URL flow via existing Netlify function stays unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@netlify
Copy link

netlify bot commented Mar 9, 2026

Deploy Preview for gonnaai ready!

Name Link
🔨 Latest commit 1243316
🔍 Latest deploy log https://app.netlify.com/projects/gonnaai/deploys/69aee8811d8e140008789b4d
😎 Deploy Preview https://deploy-preview-6--gonnaai.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
1 paths audited
Performance: 31
Accessibility: 97
Best Practices: 100
SEO: 100
PWA: 80
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions github-actions bot added configuration dependencies Pull requests that update a dependency file size/XL frontend labels Mar 9, 2026
@amethystani amethystani merged commit fc4761e into main Mar 9, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

configuration dependencies Pull requests that update a dependency file frontend size/XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant