As an admin, I can configure the diarization clustering threshold so that speaker merging is reduced #14
Open
Labels: feature (New feature or enhancement)
Description
User Story
As an administrator, I can configure the diarization clustering threshold via environment variable so that the system can be tuned to favor splitting over merging speakers.
Spec: docs/specs/transcription-quality-improvements.md — US-5
Size: L
Acceptance Criteria
- `DIARIZATION_THRESHOLD` env var controls the clustering sensitivity
- Default value (`0.715`) preserves current diarization behavior
- Lower values reduce speaker merging (at the cost of potential over-splitting)
- Given the threshold is set too low, then speakers may be over-split — this is documented with a recommended range (0.4–0.8)
- Requires replacing WhisperX's built-in diarization wrapper with direct pyannote Pipeline usage
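A minimal sketch of how the env var could be read, assuming a plain float is expected. The helper name `read_diarization_threshold`, the injectable `env` mapping, and the rejection of values outside the open interval (0, 1) are illustrative assumptions, not part of the spec.

```python
import os

# Default taken from the acceptance criteria above.
DEFAULT_DIARIZATION_THRESHOLD = 0.715


def read_diarization_threshold(env=None):
    """Return the clustering threshold from DIARIZATION_THRESHOLD, or the default.

    `env` is any mapping; it defaults to os.environ and exists so the
    function can be tested without touching the real environment.
    """
    env = os.environ if env is None else env
    raw = env.get("DIARIZATION_THRESHOLD", "").strip()
    if not raw:
        # Unset or empty: keep current diarization behavior.
        return DEFAULT_DIARIZATION_THRESHOLD
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(
            f"DIARIZATION_THRESHOLD must be a number, got {raw!r}"
        ) from None
    # Assumption: a usable threshold lies strictly between 0 and 1.
    if not 0.0 < value < 1.0:
        raise ValueError(
            f"DIARIZATION_THRESHOLD must be between 0 and 1, got {value}"
        )
    return value
```

Failing fast on a malformed value at startup is one possible design choice; silently falling back to the default would also satisfy the criteria but could hide a misconfiguration.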
Truths
- T1: This is an admin-only setting (env var), not exposed in the upload form
- T2: This story involves the largest refactor — replacing the WhisperX diarization wrapper with direct pyannote pipeline calls
- T3: All existing diarization functionality must continue to work identically at the default threshold
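One way the refactor in T2 could apply the setting is to take the pipeline's current hyperparameters and override only the clustering threshold, which leaves every other value untouched as T3 requires. The sketch below shows just that override step on a plain nested dict. The helper name `with_clustering_threshold` is hypothetical, and the `clustering.threshold` key layout is an assumption to verify against the installed pyannote.audio version.

```python
import copy


def with_clustering_threshold(params, threshold):
    """Return a copy of a nested hyperparameter dict with
    params["clustering"]["threshold"] replaced; the input is not mutated."""
    updated = copy.deepcopy(params)
    updated.setdefault("clustering", {})["threshold"] = threshold
    return updated


# Hypothetical wiring (not run here). It assumes pyannote.audio is installed,
# MODEL_NAME names a diarization pipeline, and that pipeline exposes its
# hyperparameters as a nested dict containing "clustering" -> "threshold":
#
#   from pyannote.audio import Pipeline
#   pipeline = Pipeline.from_pretrained(MODEL_NAME)  # plus auth as required
#   params = pipeline.parameters(instantiated=True)
#   pipeline.instantiate(with_clustering_threshold(params, threshold))
```

Copying the dict rather than editing it in place keeps the original defaults available for comparison when checking that the default threshold reproduces current output.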
Business Rules
None directly. Addresses Goal #2 (reduce speaker merging errors).
Edge Cases
- `DIARIZATION_THRESHOLD` set too low → over-splitting (one speaker appears as multiple). Documented with recommended range (0.4–0.8).
- Default threshold (`0.715`) → behavior identical to current
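The recommended range could also be surfaced at startup rather than only in documentation. The following is a sketch under that assumption; the function name `threshold_warning` and the message wording are illustrative, and values outside the range are reported, not rejected.

```python
# Recommended range from the edge cases above.
RECOMMENDED_RANGE = (0.4, 0.8)


def threshold_warning(threshold, recommended=RECOMMENDED_RANGE):
    """Return a warning string if the threshold is outside the recommended
    range (bounds inclusive), or None if it is within it."""
    low, high = recommended
    if threshold < low:
        return (
            f"DIARIZATION_THRESHOLD={threshold} is below the recommended "
            f"minimum {low}; speakers may be over-split"
        )
    if threshold > high:
        return (
            f"DIARIZATION_THRESHOLD={threshold} is above the recommended "
            f"maximum {high}; speakers may be merged"
        )
    return None
```

A caller could pass the returned string to its logger and continue, so an admin can still experiment outside the range deliberately.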
Dependencies
None strictly, but recommended to implement last due to refactor scope (replaces WhisperX diarization wrapper).