fix(inference): set STT capabilities.diarization from extra_kwargs by russellmartin-livekit · Pull Request #5283 · livekit/agents

russellmartin-livekit · 2026-03-30T23:29:32Z

The inference STT capabilities.diarization was hardcoded to False, which caused MultiSpeakerAdapter to not work since it checks capabilities.diarization before enabling diarization.

This change:

- Adds diarize option to DeepgramOptions TypedDict
- Adds speaker_labels option to AssemblyaiOptions TypedDict
- Detects diarization params in extra_kwargs and sets capabilities
- Updates capabilities when update_options() is called with diarization
- Adds comprehensive tests for diarization capability detection
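
The detection described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SketchCapabilities`, `SketchSTT`, and `wants_diarization` are hypothetical names, and only the option keys `diarize` and `speaker_labels` come from the description.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Option keys named in the PR description (Deepgram: diarize, AssemblyAI: speaker_labels).
DIARIZATION_KEYS = ("diarize", "speaker_labels")


@dataclass
class SketchCapabilities:
    """Hypothetical stand-in for the STT capabilities object."""

    diarization: bool = False


def wants_diarization(extra_kwargs: Mapping[str, Any]) -> bool:
    """Return True if any known diarization option is set to a truthy value."""
    return any(bool(extra_kwargs.get(key)) for key in DIARIZATION_KEYS)


class SketchSTT:
    """Hypothetical STT wrapper that derives capabilities from extra_kwargs."""

    def __init__(self, extra_kwargs: Optional[Mapping[str, Any]] = None) -> None:
        self._extra_kwargs = dict(extra_kwargs or {})
        self.capabilities = SketchCapabilities(
            diarization=wants_diarization(self._extra_kwargs)
        )

    def update_options(self, extra_kwargs: Mapping[str, Any]) -> None:
        # Merge the new options, then recompute the capability from the merged
        # view so that turning diarization off is reflected as well.
        self._extra_kwargs.update(extra_kwargs)
        self.capabilities.diarization = wants_diarization(self._extra_kwargs)
```

Recomputing from the merged options on every update is one possible design; it keeps the flag consistent if a caller later passes `diarize=False`.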

Fixes AGT-2608

Slack thread: https://live-kit.slack.com/archives/C06TN33TV44/p1772573869144129?thread_ts=1771977322.899519&cid=C06TN33TV44

https://claude.ai/code/session_01VRKQuBXiq8BHKr9AiJ6uEw


CLAassistant · 2026-03-30T23:29:41Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

devin-ai-integration

Devin Review found 1 potential issue.


devin-ai-integration · 2026-03-30T23:32:22Z

livekit-agents/livekit/agents/inference/stt.py

🔴 Diarization capability declared but speaker_id never populated in transcripts

The PR sets capabilities.diarization = True when diarize or speaker_labels is in extra_kwargs, but _process_transcript (lines 665-681) never extracts speaker_id from the server response data and never passes it to SpeechData, so the speaker_id field keeps its default of None.

This breaks MultiSpeakerAdapter, which checks stt.capabilities.diarization at livekit-agents/livekit/agents/stt/multi_speaker_adapter.py:47 and will accept this STT instance. However, when processing events, _PrimarySpeakerDetector.on_stt_event at livekit-agents/livekit/agents/stt/multi_speaker_adapter.py:244 checks if sd.speaker_id is None and short-circuits, so speaker detection/suppression never works. Compare with the Deepgram plugin at livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py:742 which correctly populates speaker_id=f"S{speaker}" from the response.

(Refers to lines 665-681)
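
The failure mode described above can be illustrated with a small stand-alone sketch. The classes here are hypothetical stand-ins, not the real `SpeechData` or `_PrimarySpeakerDetector`; the sketch only assumes the behavior stated in this comment, namely that events without a `speaker_id` are skipped.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SketchSpeechData:
    """Hypothetical stand-in for a transcript alternative."""

    text: str
    speaker_id: Optional[str] = None  # defaults to None, as noted in the review


class SketchDetector:
    """Hypothetical detector that counts transcripts per speaker."""

    def __init__(self) -> None:
        self.seen: Dict[str, int] = {}

    def on_stt_event(self, sd: SketchSpeechData) -> bool:
        # Short-circuit when the STT did not attribute the transcript to a speaker.
        if sd.speaker_id is None:
            return False
        self.seen[sd.speaker_id] = self.seen.get(sd.speaker_id, 0) + 1
        return True


detector = SketchDetector()
# The capability flag may be True, but without speaker_id nothing is tracked.
handled_without_id = detector.on_stt_event(SketchSpeechData("hello"))
handled_with_id = detector.on_stt_event(SketchSpeechData("hello", speaker_id="S0"))
```

Under these assumptions, declaring the capability alone changes nothing downstream: every event takes the early return until the transcript path also fills in `speaker_id`.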

Prompt for agents

In livekit-agents/livekit/agents/inference/stt.py, the _process_transcript method (lines 651-681) needs to extract speaker_id from the server response data and pass it to the SpeechData constructor. The exact field name in the server response depends on the gateway's response format (likely "speaker" or "speaker_id" in the data dict, or possibly in individual word entries, similar to how Deepgram returns it in word["speaker"]). Add speaker_id extraction logic similar to what livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py:730-734 does, and pass it as the speaker_id parameter to stt.SpeechData() at line 665. For example, extract speaker = data.get("speaker") or derive it from the words list, then set speaker_id=f"S{speaker}" if speaker is not None else None.
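
A minimal sketch of the extraction suggested above, under stated assumptions: the gateway's response format is not confirmed in this thread, so the `"speaker"` and `"words"` field names are guesses carried over from the prompt, and `extract_speaker_id` is a hypothetical helper rather than an existing function in the codebase.

```python
from collections import Counter
from typing import Any, Mapping, Optional


def extract_speaker_id(data: Mapping[str, Any]) -> Optional[str]:
    """Derive an "S<n>" speaker label from a transcript payload, if present.

    Assumed payload shape (not confirmed by the source): either a top-level
    "speaker" field, or per-word dicts under "words" that carry "speaker".
    """
    speaker = data.get("speaker")
    if speaker is None:
        # Fall back to the most frequent per-word speaker in this transcript.
        word_speakers = [
            word["speaker"]
            for word in data.get("words", [])
            if isinstance(word, Mapping) and word.get("speaker") is not None
        ]
        if word_speakers:
            speaker = Counter(word_speakers).most_common(1)[0][0]
    # Compare against None explicitly: speaker 0 is a valid label.
    return f"S{speaker}" if speaker is not None else None
```

The result would then be passed as the `speaker_id` argument when the transcript data is built. The explicit `is not None` checks matter because a speaker index of 0 is falsy in Python.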


russellmartin-livekit requested review from a team, adrian-cowham and theomonnom March 30, 2026 23:29

russellmartin-livekit self-assigned this Mar 30, 2026

devin-ai-integration bot reviewed Mar 30, 2026

russellmartin-livekit wants to merge 1 commit into main from claude/slack-support-diarization-stt-providers-cWpcE
