Use faster-whisper for transcription and update transcription dependencies for Python 3.11 compatibility #124

Open
SoldierSacha wants to merge 17 commits into ManimCommunity:main from You-Learn-Org:main

@SoldierSacha
Summary

This PR migrates the transcription workflow from openai-whisper to faster-whisper.

The migration also resolves an installation failure on Python 3.11 caused by the outdated openai-whisper and llvmlite version requirements.

Background / Motivation

The current transcribe extra depends on:

  • openai-whisper ^20230314

This version of whisper pulls in old numba/llvmlite versions that only support Python < 3.10. As a result, manim-voiceover[transcribe] fails to install on Python 3.11+ (especially on macOS and CI/Linux environments). I repeatedly ran into dependency resolution and build failures because of this, which is what motivated this fix.

Benefits

  • Restores compatibility with Python 3.11+.
  • Reduces dependency conflicts (Torch / Triton / llvmlite).
  • Provides a lighter-weight transcription option that is also significantly faster.
  • Improves installation reliability on macOS and GitHub Actions.

Notes:

  • Existing non-transcription features remain unaffected.
  • This PR introduces no breaking changes.

SoldierSacha and others added 7 commits January 27, 2026 00:58
- Replace openai-whisper and stable-ts dependencies with faster-whisper
- Update transcription implementation in base.py to use faster-whisper API
- Update timestamps_to_word_boundaries to handle faster-whisper segments
- Update documentation references to point to faster-whisper repository
- Update error messages and docstrings

faster-whisper provides significant performance improvements through
CTranslate2 optimization while maintaining compatibility with OpenAI
Whisper models.
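
To illustrate the segment handling described in this commit, here is a minimal sketch of how word-level timestamps from faster-whisper could be flattened into word boundaries. It is not the code from this PR: the `Word` and `Segment` dataclasses are hypothetical stand-ins for the objects faster-whisper yields, and the boundary dictionary keys and the `AUDIO_OFFSET_RESOLUTION` value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Assumed tick resolution (ticks per second); hypothetical value for this sketch.
AUDIO_OFFSET_RESOLUTION = 10_000_000


@dataclass
class Word:
    """Stand-in for a faster-whisper word (start/end in seconds)."""
    start: float
    end: float
    word: str


@dataclass
class Segment:
    """Stand-in for a faster-whisper segment; words is None without word timestamps."""
    words: Optional[List[Word]]


def timestamps_to_word_boundaries(segments: Iterable[Segment]) -> list:
    """Flatten segments into a list of word-boundary dicts (assumed format)."""
    boundaries = []
    text_offset = 0
    for segment in segments:
        # Segments carry no words unless word_timestamps=True was requested.
        for w in segment.words or []:
            boundaries.append(
                {
                    "audio_offset": int(w.start * AUDIO_OFFSET_RESOLUTION),
                    "text_offset": text_offset,
                    "word_length": len(w.word),
                    "text": w.word,
                    "boundary_type": "Word",
                }
            )
            text_offset += len(w.word)
    return boundaries


# Usage with the real library (requires faster-whisper and a model download):
# from faster_whisper import WhisperModel
# model = WhisperModel("base", compute_type="int8")
# segments, _info = model.transcribe("voiceover.mp3", word_timestamps=True)
# boundaries = timestamps_to_word_boundaries(segments)
```

Because `transcribe` returns segments lazily, the helper accepts any iterable and consumes it once.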

https://claude.ai/code/session_01Paz1JXifQu8F3npTJpx2TT
…r-AFaND

Replace OpenAI Whisper with faster-whisper for transcription
SoldierSacha and others added 10 commits January 27, 2026 01:51
Allow passing the OpenAI API key directly to the constructor instead of
requiring it to be set via environment variable. Falls back to OPENAI_API_KEY
env var if not provided.
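
The fallback described above can be sketched roughly as follows. The class name `OpenAIServiceSketch` is hypothetical, and raising `ValueError` when no key is found is an assumption for illustration, not necessarily what this PR does.

```python
import os
from typing import Optional


class OpenAIServiceSketch:
    """Hypothetical stand-in showing only the API key resolution."""

    def __init__(self, api_key: Optional[str] = None):
        # Prefer the explicit argument, then fall back to the environment.
        self.api_key = api_key or os.environ.get("OPENAI_API_KEY")
        if not self.api_key:
            # Assumed behavior: fail early with a clear message.
            raise ValueError(
                "No OpenAI API key: pass api_key=... or set OPENAI_API_KEY."
            )
```

An explicit argument wins over the environment variable, so a key can be overridden per service instance without touching the process environment.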

https://claude.ai/code/session_01WkxRgXzowc15j1FFFmFP2r
…YPY0L

Allow OpenAI API key to be passed as constructor parameter
Add validation in OpenAIService.generate_from_text() to check if the
input text is empty after removing bookmarks. This prevents sending
empty strings to OpenAI's TTS API, which returns a 400 error with
'loc': ('body', 'input').

The empty string issue can occur when:
- User passes whitespace-only text
- User passes text containing only bookmark tags
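
A minimal sketch of this kind of check, assuming bookmarks are written as `<bookmark mark='A'/>` tags. The regular expression and the names `remove_bookmarks` and `validate_tts_input` are illustrative assumptions, not the code added in this PR.

```python
import re

# Assumed bookmark syntax: <bookmark mark='A'/> with single or double quotes.
BOOKMARK_PATTERN = re.compile(r"<bookmark\s+mark\s*=\s*['\"][^'\"]*['\"]\s*/>")


def remove_bookmarks(text: str) -> str:
    """Strip bookmark tags, leaving the spoken text."""
    return BOOKMARK_PATTERN.sub("", text)


def validate_tts_input(text: str) -> str:
    """Return the cleaned text, or raise if nothing is left to synthesize."""
    cleaned = remove_bookmarks(text).strip()
    if not cleaned:
        raise ValueError(
            "Input text is empty after removing bookmarks; nothing to send to the TTS API."
        )
    return cleaned
```

Raising locally gives a clearer error than the 400 response from the API and avoids a wasted request.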

https://claude.ai/code/session_01LH2yMANh1zTUAdXv9hWeu4
…0K01j

Add validation for empty text after bookmark removal in OpenAI service
- Add transcription_model_kwargs parameter to SpeechService.__init__()
  to allow passing kwargs (like compute_type) to WhisperModel
- Add model_kwargs parameter to set_transcription() method
- Suppress ctranslate2 warning logs by default
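
As a rough sketch of how such kwargs might be forwarded to `WhisperModel`: the helper names `build_model_kwargs` and `load_whisper_model` are hypothetical and the defaults are assumptions for illustration. Log suppression is not shown here.

```python
from typing import Any, Dict, Optional


def build_model_kwargs(user_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge user-supplied kwargs over assumed defaults; user values win."""
    defaults = {"device": "auto", "compute_type": "default"}
    return {**defaults, **(user_kwargs or {})}


def load_whisper_model(model_name: str, model_kwargs: Optional[Dict[str, Any]] = None):
    """Construct a WhisperModel, importing faster-whisper only when needed."""
    # Lazy import so the optional dependency is required only for transcription.
    from faster_whisper import WhisperModel

    return WhisperModel(model_name, **build_model_kwargs(model_kwargs))


# Example (requires faster-whisper and a model download):
# model = load_whisper_model("base", {"compute_type": "int8"})
```

Keeping the merge in a separate function makes the kwargs handling testable without downloading a model.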

https://claude.ai/code/session_011zmzHd9JQbjQ8n1vyKkDoH
…zvJE

Add support for WhisperModel constructor kwargs in transcription