[livekit-plugins-aws] No transcripts generated when End-to-End Encryption (E2EE) is enabled on the client #5231
Bug Description
When End-to-End Encryption (E2EE) is enabled on the LiveKit client, the server-side STT agent receives audio frames but produces no transcripts. The agent successfully connects to the room, subscribes to the audio track, and streams audio to Amazon Transcribe (status 200), yet no interim or final transcript events are ever returned. The audio frames reaching the agent appear to be encrypted, so Amazon Transcribe receives unintelligible data and cannot detect any speech. Disabling E2EE on the same setup immediately restores normal transcription.
Expected Behavior
When E2EE is enabled, the agent SDK should either provide a mechanism for server-side agents to participate in the E2EE key exchange and decrypt audio tracks before processing, or at minimum detect that incoming audio is encrypted and surface a clear warning or error rather than silently forwarding encrypted bytes to the STT service with no output.
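The "detect and warn" fallback described above could look roughly like the following. This is a minimal sketch, not the SDK's actual behavior: it assumes the track publication object exposes an `encryption_type` attribute and that the value 0 means "no encryption". Both should be verified against the installed livekit version. The helper names `is_encrypted` and `warn_if_encrypted` are hypothetical.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("e2ee-guard")

# Assumption: 0 stands for "no encryption" (the NONE value of the encryption
# type enum); check this against the SDK version you are running.
ENCRYPTION_NONE = 0


def is_encrypted(publication) -> bool:
    """Return True if the publication reports a non-NONE encryption type."""
    # Assumption: the publication exposes an `encryption_type` attribute.
    # If it is missing, the track is treated as unencrypted.
    enc = getattr(publication, "encryption_type", ENCRYPTION_NONE)
    return int(enc) != ENCRYPTION_NONE


def warn_if_encrypted(publication, log: logging.Logger = logger) -> bool:
    """Log a warning for an encrypted track; return True if one was logged."""
    if not is_encrypted(publication):
        return False
    log.warning(
        "Audio track %s is E2EE-encrypted: STT transcription will not work "
        "without decryption",
        getattr(publication, "sid", "<unknown>"),
    )
    return True


# Stand-in objects for illustration only; a real agent would pass the
# publication it receives when subscribing to the track.
plain_pub = SimpleNamespace(sid="TR_plain", encryption_type=0)
e2ee_pub = SimpleNamespace(sid="TR_e2ee", encryption_type=1)
warn_if_encrypted(plain_pub)  # no log output
warn_if_encrypted(e2ee_pub)   # logs the warning
```

Calling such a check once per subscribed track, before any audio is forwarded to the STT service, would turn the silent failure into a single actionable log line.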
Reproduction Steps
1. Start a local LiveKit server with livekit-server --dev.
2. Launch an STT agent that uses livekit-plugins-aws with python3 test_auto_lang.py dev.
3. Generate a room token and connect via https://meet.livekit.io/?tab=custom using ws://localhost:7880, making sure to enable the E2EE toggle in the client settings before joining.
4. Once connected, speak into the microphone for at least 10-15 seconds. The agent logs will show registered worker, Subscribed to audio track, and incrementing Processed X audio frames messages, confirming that audio is flowing, but no [Interim], [FINAL], or [Speech started] transcript events will appear. Disconnect, disable E2EE, reconnect with a new token, and speak again: transcripts appear immediately, confirming that E2EE is the cause.
Operating System
macOS Tahoe
Models Used
Amazon Transcribe
Package Versions
Package Version
Python 3.14.2
livekit-server 1.9.11
livekit-agents 1.4.4
livekit-plugins-aws 1.4.4
livekit (rtc) 1.1.2
livekit-api 1.1.0
livekit-protocol 1.1.2
aws-sdk-transcribe-streaming 0.4.0
smithy-aws-core 0
Session/Room/Call IDs
No response
Proposed Solution
The LiveKit agent SDK should provide a mechanism for server-side agents to participate in the E2EE key exchange so they can decrypt audio tracks before forwarding them to external STT services like Amazon Transcribe. If full E2EE participation is not feasible due to architectural constraints, an alternative approach would be to support a "trusted agent" mode where the server provisions the shared encryption key to registered agents, allowing them to decrypt media server-side while maintaining E2EE between all other participants. At a minimum, the SDK should detect when incoming audio tracks are E2EE-encrypted and emit a clear warning log such as "Audio track is E2EE-encrypted: STT transcription will not work without decryption" rather than silently processing encrypted bytes that produce no output, which makes debugging extremely difficult.
Additional Context
No response
Screenshots and Recordings
No response