Releases: m-bain/whisperX

v3.8.2

10 Mar 14:47

What's Changed

  • feat: expose avg_logprob per segment from ctranslate2 beam search by @Barabazs in #1350
  • fix: revert #986 wildcard alignment that broke word-level timestamps (#1220) by @Barabazs in #1367
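One way a per-segment avg_logprob could be consumed downstream is as a rough confidence filter. The sketch below is illustrative only: the segment layout (a list of dicts with "text" and "avg_logprob" keys), the helper name filter_segments, and the -1.0 threshold are assumptions, not the documented whisperX output schema.

```python
import math


def filter_segments(segments, min_avg_logprob=-1.0):
    """Keep segments whose avg_logprob is at or above the threshold.

    Hypothetical helper: segments without an "avg_logprob" key are kept,
    since nothing is known about their confidence.
    """
    kept = []
    for seg in segments:
        score = seg.get("avg_logprob")
        if score is None or score >= min_avg_logprob:
            kept.append(seg)
    return kept


# Made-up data in the assumed layout.
segments = [
    {"text": "hello world", "avg_logprob": -0.2},
    {"text": "(unclear mumbling)", "avg_logprob": -1.7},
    {"text": "segment without a score"},
]

kept = filter_segments(segments, min_avg_logprob=-1.0)

# exp(avg_logprob) is the geometric-mean token probability of a segment.
confidence = math.exp(segments[0]["avg_logprob"])
```

A threshold like this is a judgment call; it is worth checking against a few known-good and known-bad transcripts before relying on it.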

Full Changelog: v3.8.1...v3.8.2

v3.7.8

10 Mar 14:59

Backport of word-level timestamp fixes from v3.8.2.

Bug Fixes

  • Restore original CTC forced-alignment (f2609a6): PR #986 caused all words to anchor to the start of the segment window (silence) instead of actual speech. Reverts get_trellis/backtrack to the original PyTorch tutorial implementation. Fixes #1220.
  • Fix blank_id hardcoded to 0 (636f298): Broke alignment for HuggingFace models where blank is [pad], not index 0.
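The two fixes above can be illustrated with a simplified, pure-Python sketch of CTC forced alignment. It is not the whisperX code: it borrows the get_trellis/backtrack names from the note, but the bodies are a reduced version of the trellis idea from the PyTorch forced-alignment tutorial, working on plain lists of log-probabilities. The find_blank_id helper, the toy vocabulary, and the emission values are all assumptions made for illustration.

```python
import math


def find_blank_id(vocab, pad_tokens=("[pad]", "<pad>"), default=0):
    """Look up the CTC blank index from the vocabulary instead of assuming 0."""
    for token in pad_tokens:
        if token in vocab:
            return vocab[token]
    return default


def get_trellis(emission, tokens, blank_id):
    """trellis[t][j]: best log-score after t frames with the first j tokens emitted."""
    num_frames, num_tokens = len(emission), len(tokens)
    trellis = [[-math.inf] * (num_tokens + 1) for _ in range(num_frames + 1)]
    trellis[0][0] = 0.0
    for t in range(num_frames):
        # Column 0: no token emitted yet, so only blanks so far.
        trellis[t + 1][0] = trellis[t][0] + emission[t][blank_id]
        for j in range(1, num_tokens + 1):
            stay = trellis[t][j] + emission[t][blank_id]
            advance = trellis[t][j - 1] + emission[t][tokens[j - 1]]
            trellis[t + 1][j] = max(stay, advance)
    return trellis


def backtrack(trellis, emission, tokens, blank_id):
    """Return [(token_position, frame_index), ...] along the best path."""
    t, j = len(emission), len(tokens)
    path = []
    while t > 0:
        stayed = trellis[t - 1][j] + emission[t - 1][blank_id]
        changed = -math.inf
        if j > 0:
            changed = trellis[t - 1][j - 1] + emission[t - 1][tokens[j - 1]]
        if changed > stayed:
            j -= 1
            path.append((j, t - 1))
        t -= 1
    if j != 0:
        raise ValueError("alignment failed: not all tokens were placed")
    return path[::-1]


# Toy vocabulary where the blank is "[pad]" at index 2, not index 0.
vocab = {"a": 0, "b": 1, "[pad]": 2}
blank_id = find_blank_id(vocab)

HIGH, LOW = math.log(0.8), math.log(0.1)
peaks = [2, 0, 2, 1]  # most likely symbol per frame: blank, a, blank, b
emission = [[HIGH if k == p else LOW for k in range(3)] for p in peaks]

tokens = [vocab["a"], vocab["b"]]
trellis = get_trellis(emission, tokens, blank_id)
path = backtrack(trellis, emission, tokens, blank_id)
```

In this toy case the path places "a" at frame 1 and "b" at frame 3, the frames where those symbols peak. Passing a hardcoded 0 as blank_id here would treat "a" as the blank, which is the kind of mismatch the second fix addresses.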

Full Changelog: v3.7.7...v3.7.8

v3.6.1

10 Mar 15:04

Backport of word-level timestamp fixes from v3.8.2.

Bug Fixes

  • Restore original CTC forced-alignment (f2609a6): PR #986 caused all words to anchor to the start of the segment window (silence) instead of actual speech. Reverts get_trellis/backtrack to the original PyTorch tutorial implementation. Fixes #1220.
  • Fix blank_id hardcoded to 0 (636f298): Broke alignment for HuggingFace models where blank is [pad], not index 0.

Full Changelog: v3.6.0...v3.6.1

v3.5.1

10 Mar 15:06

Backport of word-level timestamp fixes from v3.8.2.

Bug Fixes

  • Restore original CTC forced-alignment (f2609a6): PR #986 caused all words to anchor to the start of the segment window (silence) instead of actual speech. Reverts get_trellis/backtrack to the original PyTorch tutorial implementation. Fixes #1220.
  • Fix blank_id hardcoded to 0 (636f298): Broke alignment for HuggingFace models where blank is [pad], not index 0.

Full Changelog: v3.5.0...v3.5.1

v3.4.4

10 Mar 15:09

Backport of word-level timestamp fixes from v3.8.2.

Bug Fixes

  • Restore original CTC forced-alignment (f2609a6): PR #986 caused all words to anchor to the start of the segment window (silence) instead of actual speech. Reverts get_trellis/backtrack to the original PyTorch tutorial implementation. Fixes #1220.
  • Fix blank_id hardcoded to 0 (636f298): Broke alignment for HuggingFace models where blank is [pad], not index 0.

Full Changelog: v3.4.3...v3.4.4

v3.3.5

10 Mar 15:10

Backport of word-level timestamp fixes from v3.8.2.

Bug Fixes

  • Restore original CTC forced-alignment (f2609a6): PR #986 caused all words to anchor to the start of the segment window (silence) instead of actual speech. Reverts get_trellis/backtrack to the original PyTorch tutorial implementation. Fixes #1220.
  • Fix blank_id hardcoded to 0 (636f298): Broke alignment for HuggingFace models where blank is [pad], not index 0.

Full Changelog: v3.3.4...v3.3.5

v3.8.1

14 Feb 14:01

What's Changed

  • Fix: Respect --model_dir and --model_cache_only during alignment by @MrPrayer in #1285
  • feat: forward --hf_token to WhisperModel for gated/private model support by @Barabazs in #1351

Full Changelog: v3.8.0...v3.8.1

v3.8.0

13 Feb 20:53
6187d25

What's Changed

  • feat: migrate to pyannote-audio v4 with speaker-diarization-community-1 by @Barabazs in #1349

Special thanks to @borgoat for taking the lead.

Full Changelog: v3.7.7...v3.8.0

v3.7.7

13 Feb 11:55
c4c1242

Full Changelog: v3.7.6...v3.7.7

v3.7.6

27 Jan 09:42
6ec4a02

Full Changelog: v3.7.5...v3.7.6