A command-line tool that automatically transcribes videos and generates subtitles using OpenAI's Whisper. It can transcribe audio in various languages and translate it to English.
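Under the hood the flow is: extract the audio track with FFmpeg, run Whisper over it, and turn the resulting segments into subtitles for the output video. The sketch below illustrates that flow with the openai-whisper Python API; it is not the project's actual code, and the file names and the `format_timestamp` helper are purely illustrative (the final step of muxing or burning the subtitles back into the video is omitted).

```python
# Illustrative transcribe-and-subtitle flow (not the project's actual implementation).
import subprocess
import whisper


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


# 1. Extract the audio track with FFmpeg as 16 kHz mono WAV.
subprocess.run(
    ["ffmpeg", "-y", "-i", "input_video.mp4", "-ar", "16000", "-ac", "1", "audio.wav"],
    check=True,
)

# 2. Transcribe; task="translate" makes Whisper output English regardless of the source language.
model = whisper.load_model("medium")
result = model.transcribe("audio.wav", task="translate")

# 3. Write the segments out as an SRT subtitle file.
with open("subtitles.srt", "w", encoding="utf-8") as srt:
    for i, seg in enumerate(result["segments"], start=1):
        srt.write(f"{i}\n{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n")
        srt.write(seg["text"].strip() + "\n\n")
```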
- Python > 3.7 and < 3.13
- FFmpeg installed on your system
- `uv` package installer (recommended) or `pip`
Install FFmpeg:
- Ubuntu/Debian:
  ```bash
  sudo apt install ffmpeg
  ```
- macOS:
  ```bash
  brew install ffmpeg
  ```
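If you want to confirm FFmpeg is actually reachable before running the tool, a quick check (illustrative, not part of the project) is:

```python
# Sanity check that the ffmpeg binary is on PATH (illustrative).
import shutil
import subprocess

if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found on PATH; install it first")

subprocess.run(["ffmpeg", "-version"], check=True)  # prints the installed version
```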
Install uv (recommended):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

- Clone this repository or download the files
- Create and activate a virtual environment:
```bash
# Create virtual environment
uv venv --python 3.12
# Or
python -m venv .venv

# Activate virtual environment
source .venv/bin/activate
```
- Install dependencies using uv (recommended) or pip:
```bash
# Using uv (faster)
uv pip install -r requirements.txt

# Or using pip
pip install -r requirements.txt
```
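After installing, you can verify the environment from Python. This assumes requirements.txt pulls in openai-whisper (which in turn installs PyTorch); adjust the check if the dependency list differs:

```python
# Illustrative post-install check: Whisper imports cleanly and reports whether a GPU is visible.
import torch
import whisper

print("Whisper import OK")
print("CUDA available:", torch.cuda.is_available())  # a GPU speeds up the medium/large models considerably
```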
Basic usage:
```bash
python main.py input_video.mp4
```
Options:
```
python main.py input_video.mp4 [options]

Options:
  -o, --output OUTPUT    Path to the output video file (default: input_video_with_subs.mp4)
  -m, --model MODEL      Whisper model size: small, medium, large (default: medium)
  -l, --language LANG    Source language code (optional, will auto-detect if not specified)
  -t, --temp-dir DIR     Directory for temporary files (default: temp)
```
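The options above map naturally onto Python's argparse. The sketch below shows how such a CLI could be declared; it is an assumption about main.py's internals, with defaults taken from the list above, not a copy of the real parser:

```python
# Illustrative argparse declaration mirroring the documented options.
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Transcribe a video and add subtitles using Whisper."
    )
    parser.add_argument("input", type=Path, help="Path to the input video file")
    parser.add_argument("-o", "--output", type=Path, default=None,
                        help="Output video file (default: <input>_with_subs.mp4)")
    parser.add_argument("-m", "--model", choices=["small", "medium", "large"],
                        default="medium", help="Whisper model size")
    parser.add_argument("-l", "--language", default=None,
                        help="Source language code; auto-detected if omitted")
    parser.add_argument("-t", "--temp-dir", type=Path, default=Path("temp"),
                        help="Directory for temporary files")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.output is None:
        args.output = args.input.with_name(args.input.stem + "_with_subs.mp4")
    print(args)
```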
Examples:
```bash
# Specify output file
python main.py input_video.mp4 -o output.mp4

# Use a larger model for better accuracy
python main.py input_video.mp4 -m large

# Process French audio
python main.py input_video.mp4 -l fr

# Specify custom temp directory
python main.py input_video.mp4 -t /path/to/temp
```
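Because everything is driven from the command line, processing several videos is just a loop over the same command. A hedged example (the videos/ folder name is illustrative):

```python
# Run the documented CLI over every .mp4 in a folder (illustrative wrapper).
import subprocess
from pathlib import Path

for video in sorted(Path("videos").glob("*.mp4")):
    subprocess.run(["python", "main.py", str(video), "-m", "medium"], check=True)
```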
Available Whisper models:
- small: Good balance of speed and accuracy
- medium: Better accuracy, slower
- large: Best accuracy, slowest
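The -m flag corresponds directly to the model name handed to Whisper's load_model. A small illustrative snippet showing the library side of that choice:

```python
# The -m flag maps onto the model name passed to whisper.load_model (illustrative).
import whisper

print(whisper.available_models())    # the library offers more sizes (tiny, base, ...) than this tool exposes
model = whisper.load_model("small")  # swap in "medium" or "large" for better accuracy at the cost of speed
```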