As an admin, I can configure the diarization clustering threshold so that speaker merging is reduced #14

## User Story

As an administrator, I can configure the diarization clustering threshold via environment variable so that the system can be tuned to favor splitting over merging speakers.

**Spec:** `docs/specs/transcription-quality-improvements.md` — US-5

**Size:** L

## Acceptance Criteria

- `DIARIZATION_THRESHOLD` env var controls the clustering sensitivity
- Default value (`0.715`) preserves current diarization behavior
- Lower values reduce speaker merging (at the cost of potential over-splitting)
- Given the threshold is set too low, speakers may be over-split; the documentation states a recommended range (0.4–0.8)
- Requires replacing WhisperX's built-in diarization wrapper with direct pyannote Pipeline usage
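
A minimal sketch of how the env var could be read, assuming a plain float parse with the documented default. The function name `load_diarization_threshold` and the fail-fast handling of malformed values are illustrative assumptions, not something the spec prescribes.

```python
import os

DEFAULT_DIARIZATION_THRESHOLD = 0.715  # default named in the acceptance criteria


def load_diarization_threshold(env=None):
    """Return the threshold from DIARIZATION_THRESHOLD, or the default if unset."""
    env = os.environ if env is None else env
    raw = env.get("DIARIZATION_THRESHOLD", "").strip()
    if not raw:
        # Unset or empty: keep current diarization behavior.
        return DEFAULT_DIARIZATION_THRESHOLD
    try:
        return float(raw)
    except ValueError:
        # Assumption: reject a malformed value instead of silently falling back.
        raise ValueError(
            f"DIARIZATION_THRESHOLD must be a number, got {raw!r}"
        ) from None
```

Passing a dict as `env` keeps the function easy to test without touching the real process environment.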

## Truths

- T1: This is an admin-only setting (env var), not exposed in the upload form
- T2: This story involves the largest refactor — replacing the WhisperX diarization wrapper with direct pyannote pipeline calls
- T3: All existing diarization functionality must continue to work identically at the default threshold
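
One possible shape for the override in T2, sketched under the assumption that the pyannote pipeline object exposes `parameters(instantiated=True)` and `instantiate(params)` with a nested `clustering.threshold` hyperparameter. The helper name `apply_clustering_threshold` is hypothetical, and the exact parameter layout should be checked against the pyannote version in use.

```python
def apply_clustering_threshold(pipeline, threshold):
    """Re-instantiate a diarization pipeline with a new clustering threshold.

    `pipeline` is assumed to behave like a pyannote pipeline: it returns its
    current hyperparameters as a nested dict and accepts the same structure
    back through `instantiate`.
    """
    # Copy the current hyperparameters so everything except the threshold
    # stays exactly as the pretrained configuration defined it (see T3).
    params = dict(pipeline.parameters(instantiated=True))
    clustering = dict(params.get("clustering", {}))
    clustering["threshold"] = threshold
    params["clustering"] = clustering

    pipeline.instantiate(params)
    return pipeline
```

In the refactored code this would be called on the object returned by pyannote's `Pipeline.from_pretrained`, before running diarization. Leaving every other hyperparameter untouched is what keeps behavior unchanged at the default threshold.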

## Business Rules

None directly. Addresses Goal #2 (reduce speaker merging errors).

## Edge Cases

- `DIARIZATION_THRESHOLD` set too low → over-splitting (one speaker appears as multiple). Documented with recommended range (0.4–0.8).
- Default threshold (`0.715`) → behavior identical to current
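
The recommended range could also be surfaced at startup. The sketch below logs a warning when the configured value falls outside 0.4–0.8. The function name and the choice to warn and continue (instead of rejecting the value) are assumptions for illustration.

```python
import logging

RECOMMENDED_MIN = 0.4  # lower bound of the documented recommended range
RECOMMENDED_MAX = 0.8  # upper bound of the documented recommended range


def warn_if_outside_recommended_range(threshold, logger=None):
    """Log a warning and return True when the threshold is outside 0.4-0.8."""
    logger = logger or logging.getLogger(__name__)
    if RECOMMENDED_MIN <= threshold <= RECOMMENDED_MAX:
        return False
    logger.warning(
        "DIARIZATION_THRESHOLD=%s is outside the recommended range %s-%s; "
        "values that are too low may split one speaker into several.",
        threshold,
        RECOMMENDED_MIN,
        RECOMMENDED_MAX,
    )
    return True
```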

## Dependencies

None strictly. Implementing this story last is recommended because of the refactor scope (it replaces the WhisperX diarization wrapper).
