Description
I'm using the Deepgram Voice Agent API in my production application (ANARA, an AI personal assistant). The connection is established successfully, settings are applied, and audio frames arrive correctly, but all output plays as full-spectrum static instead of intelligible speech.
Environment
- Browser: Chrome (latest stable) on Windows 11
- SDK: @deepgram/sdk (latest)
- Integration: WebSocket to `wss://agent.deepgram.com/v1/agent/converse` with token subprotocol auth
- Voice Model: `aura-2-helena-en`
- Audio Config: linear16, 16 kHz, container "none"
Steps to Reproduce
- Open browser developer console
- Connect WebSocket to Voice Agent endpoint using token subprotocol auth
- Send Settings payload with configuration below (see attached voice-agent-config.json)
- Send microphone audio stream as Int16 PCM @ 16 kHz
- Receive binary frames and play via Web Audio API (AudioContext, copyToChannel, BufferSource)
Expected: Helena voice responds clearly
Actual: All playback is broadband static/noise
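For reference, the decode step in the playback path can be sketched as follows. This is a minimal sketch, not the exact client code: it assumes the agent's linear16 output is signed 16-bit little-endian mono PCM with no container header, and the `Linear16Decoder` class name is hypothetical. It carries a dangling byte across frames, because a frame with an odd byte length would otherwise shift every later sample by one byte, which is one way to get broadband noise without any decode error.

```typescript
// Hypothetical helper: decodes raw linear16 frames into Float32 samples.
// Assumption: samples are signed 16-bit little-endian, mono, no header.
class Linear16Decoder {
  // Holds the last byte of a frame when the frame length is odd.
  private leftover: number | null = null;

  decode(frame: ArrayBuffer): Float32Array {
    let bytes = new Uint8Array(frame);

    // Prepend the byte left over from the previous frame, if any.
    if (this.leftover !== null) {
      const joined = new Uint8Array(bytes.length + 1);
      joined[0] = this.leftover;
      joined.set(bytes, 1);
      bytes = joined;
      this.leftover = null;
    }

    // Keep a trailing odd byte for the next frame so samples stay aligned.
    const sampleCount = bytes.length >> 1;
    if (bytes.length % 2 === 1) {
      this.leftover = bytes[bytes.length - 1];
    }

    const view = new DataView(bytes.buffer, bytes.byteOffset, sampleCount * 2);
    const out = new Float32Array(sampleCount);
    for (let i = 0; i < sampleCount; i++) {
      // true = little-endian; divide by 32768 to map into [-1, 1).
      out[i] = view.getInt16(i * 2, true) / 32768;
    }
    return out;
  }
}
```

The resulting samples can be copied into a buffer created with `audioContext.createBuffer(1, samples.length, 16000)` so that the browser resamples to the context's 48 kHz rate rather than the samples being played back as if they were already 48 kHz.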
Key Evidence
- Request ID: `fb4f1cce-d56e-49fd-9fd1-941c9f5e916b`
- Control Test Result: Ran Deepgram `/v1/speak` endpoint with the same Helena voice → WAV output is crystal clear (SHA256: `D26DB80EED6C06E10D4D44BD15451926DBB7349382F000B70DA7510193BF731C`)
- Diagnostics Collected:
- AudioContext sample rate (48 kHz, resampled from requested 16 kHz)
- First 32 bytes of binary payload logged
- Playback chain includes DC-block high-pass filter @ 20 Hz
- No decoding errors or exceptions on the client side
- All PCM frames decode without throwing, no NaN/Infinity samples
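A per-frame check along these lines could extend the diagnostics above. It is only a heuristic sketch, and `inspectFrame`, `roughness`, and `FrameReport` are hypothetical names, not part of any SDK. It reports the first 32 bytes as hex, flags odd-length frames and an unexpected `RIFF` header, and compares how "rough" the samples look when read as little-endian versus big-endian. Speech read with the correct byte order usually changes smoothly from sample to sample, so a much lower value for one byte order is a hint (not proof) about which interpretation is right.

```typescript
interface FrameReport {
  byteLength: number;
  oddLength: boolean;     // odd length means samples straddle frame boundaries
  looksLikeWav: boolean;  // true if the frame starts with the ASCII bytes "RIFF"
  headHex: string;        // first 32 bytes (or fewer) as space-separated hex
  roughnessLE: number;    // mean |delta| between samples, little-endian reading
  roughnessBE: number;    // same measure, big-endian reading
}

// Mean absolute difference between consecutive Int16 samples,
// normalized by the full 16-bit range. Lower means smoother.
function roughness(bytes: Uint8Array, littleEndian: boolean): number {
  const n = bytes.length >> 1;
  if (n < 2) return 0;
  const view = new DataView(bytes.buffer, bytes.byteOffset, n * 2);
  let sum = 0;
  let prev = view.getInt16(0, littleEndian);
  for (let i = 1; i < n; i++) {
    const cur = view.getInt16(i * 2, littleEndian);
    sum += Math.abs(cur - prev);
    prev = cur;
  }
  return sum / (n - 1) / 65535;
}

function inspectFrame(frame: ArrayBuffer): FrameReport {
  const bytes = new Uint8Array(frame);
  const head = bytes.subarray(0, 32);
  const headHex = Array.from(head, (b) => b.toString(16).padStart(2, "0")).join(" ");
  const looksLikeWav =
    bytes.length >= 4 &&
    String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]) === "RIFF";
  return {
    byteLength: bytes.length,
    oddLength: bytes.length % 2 === 1,
    looksLikeWav,
    headHex,
    roughnessLE: roughness(bytes, true),
    roughnessBE: roughness(bytes, false),
  };
}
```

Logging this report for the first few binary frames of a session would show whether the payload matches the raw linear16 format requested in the Settings message.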
Root Cause Analysis
The static appears to be isolated to the Voice Agent streaming pipeline. It does not appear to come from:
- The Helena TTS model itself (proven by clean `/v1/speak` output)
- Web Audio API playback (no errors, correct chain)
- Browser configuration (sample rate resampling handled)
- Network or frame delivery (all frames arrive on schedule)
What I Need
- Internal trace for request `fb4f1cce-d56e-49fd-9fd1-941c9f5e916b`
- Confirmation that Voice Agent + Helena voice configuration is valid
- Guidance on next debugging steps or notification if this is a known issue
Urgency
This is blocking production launch. Any guidance would be greatly appreciated.
Attachments
Please attach these files when posting this issue:
- `voice-agent-session-log.txt`: timeline, environment, session details
- `browser-audio-diagnostics.txt`: AudioContext stats, console logs, playback observations
- `voice-agent-config.json`: exact Settings payload sent (credentials removed)
- `tts-control-test.txt`: proof that TTS endpoint outputs clean audio
- `client-instrumentation-snippet.txt`: code snippet showing how diagnostics were captured (no proprietary logic)
- `helena.wav`: reference clean TTS audio for comparison