Setting Up Speaker Diarization

Speaker diarization requires a HuggingFace token to download the pyannote models.

Steps:

  1. Get a HuggingFace Token:

  2. Accept the pyannote model license:

  3. Set the token:

    export HF_TOKEN="your_token_here"

  4. Re-run the transcription:

    source .venv/bin/activate
    python examples/transcribe_only.py ~/Movies/"2025-11-04 14-36-25.mp4" \
      --model tiny \
      --output-dir output
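If the re-run still produces no speaker labels, one thing worth ruling out is that `HF_TOKEN` never reached the Python process (for example, it was exported in a different shell). The sketch below is a hypothetical helper, not part of this project; it only assumes the token is read from the `HF_TOKEN` environment variable, as in step 3.

```python
import os


def hf_token_status(env=os.environ):
    """Return a short status message about HF_TOKEN without revealing the token.

    `env` is any mapping of environment variables; it defaults to the
    current process environment.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        return 'HF_TOKEN is not set; run: export HF_TOKEN="your_token_here"'
    if token == "your_token_here":
        # The placeholder from the instructions was exported verbatim.
        return "HF_TOKEN still holds the placeholder value"
    return "HF_TOKEN is set"


if __name__ == "__main__":
    print(hf_token_status())
```

Run it in the same shell session, after activating the virtual environment, so it sees the same environment as the transcription script.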

The transcription will now include speaker labels such as SPEAKER_00 and SPEAKER_01.
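A quick way to confirm that diarization ran is to count the distinct speaker labels in the transcript text. The sketch below assumes only that labels appear literally as `SPEAKER_` followed by digits; the sample transcript layout is made up for illustration, and the actual output format of `transcribe_only.py` may differ.

```python
import re
from collections import Counter


def count_speaker_labels(text):
    """Count occurrences of each SPEAKER_NN label found in the text."""
    return Counter(re.findall(r"SPEAKER_\d+", text))


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of a real transcript file.
    sample = (
        "SPEAKER_00: Hello.\n"
        "SPEAKER_01: Hi.\n"
        "SPEAKER_00: Shall we start?\n"
    )
    for label, count in sorted(count_speaker_labels(sample).items()):
        print(label, count)
```

An empty result suggests the transcript was produced without diarization, in which case the token and license steps are worth rechecking.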