mp4-mp3-transcript

Transcribe MP4 (or MKV) meeting recordings to plain text using local Whisper.

Requirements

  • Python 3.9+
  • ffmpeg (bundled in ffmpeg-master-latest-win64-gpl/ or installed on PATH)
  • ~2GB of disk space for the Whisper model (downloaded once, on first run)
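
The requirements above can be checked before a long transcription starts. The following is a minimal sketch, not code from this repository: the helper names `find_ffmpeg` and `python_is_supported` are hypothetical, and the `bin/` subfolder inside the bundled ffmpeg directory is an assumption about how that build is laid out.

```python
import shutil
import sys
from pathlib import Path

# Folder name taken from the requirements list; the bin/ subfolder is an
# assumption about the layout of the bundled build.
BUNDLED_DIR = Path("ffmpeg-master-latest-win64-gpl") / "bin"


def find_ffmpeg(bundled_dir: Path = BUNDLED_DIR):
    """Return a path to an ffmpeg executable, or None if none is found."""
    # Prefer an ffmpeg that is already installed on PATH.
    on_path = shutil.which("ffmpeg")
    if on_path:
        return on_path
    # Otherwise look inside the (assumed) bundled folder.
    for name in ("ffmpeg.exe", "ffmpeg"):
        candidate = bundled_dir / name
        if candidate.is_file():
            return str(candidate)
    return None


def python_is_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter meets the Python 3.9+ requirement."""
    return tuple(version_info[:2]) >= (3, 9)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("ffmpeg:", find_ffmpeg() or "not found")
```

Running the file directly prints whether the interpreter is new enough and where ffmpeg was found, if anywhere.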

Install

pip install -r requirements.txt

Usage

# Auto-detect language
python transcribe.py meeting.mp4

# Force English
python transcribe.py meeting.mp4 --lang en

# Force Portuguese (BR)
python transcribe.py meeting.mp4 --lang pt

# Use larger model for better accuracy
python transcribe.py meeting.mp4 --lang en --model large
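
For readers curious how a command line like the one above could be parsed, here is a hedged sketch using `argparse`. It is an illustration under assumptions, not the actual contents of `transcribe.py`: the function name `build_parser` and the help strings are hypothetical, while the flag names, model names, and the `medium` default come from this README.

```python
import argparse

# Model names as listed in the Models section of this README.
MODELS = ["tiny", "base", "small", "medium", "large"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the commands shown under Usage."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe a meeting recording to plain text.",
    )
    # Positional argument: the recording to transcribe.
    parser.add_argument("input", help="path to an MP4 or MKV recording")
    # Leaving --lang unset stands for auto-detection in this sketch.
    parser.add_argument(
        "--lang",
        default=None,
        help="language code such as en or pt; omit to auto-detect",
    )
    parser.add_argument(
        "--model",
        default="medium",
        choices=MODELS,
        help="Whisper model size",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.lang, args.model)
```

Restricting `--model` with `choices` means a typo such as `--model larg` fails immediately with a usage message instead of after the audio has been extracted.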

Output

Two files are created alongside your input:

  • meeting.txt: plain text, ready to paste into AI tools
  • meeting_timestamps.txt: the same text with [HH:MM:SS] markers for referencing back to the recording
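
The two output files can be illustrated with a short sketch. It assumes each transcript segment is a dict with a `start` time in seconds and a `text` string, which is the shape of the `segments` entries returned by the open-source Whisper package; the helper names and the one-line-per-segment layout are hypothetical and may differ from what `transcribe.py` actually writes.

```python
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an [HH:MM:SS] marker."""
    total = int(seconds)  # drop fractional seconds
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def output_paths(input_path):
    """Derive both output paths, placed alongside the input file."""
    src = Path(input_path)
    plain = src.with_suffix(".txt")
    stamped = src.with_name(src.stem + "_timestamps.txt")
    return plain, stamped


def render(segments):
    """Return (plain_text, timestamped_text) for a list of segments."""
    plain = "\n".join(seg["text"].strip() for seg in segments)
    stamped = "\n".join(
        f"{format_timestamp(seg['start'])} {seg['text'].strip()}"
        for seg in segments
    )
    return plain, stamped
```

With these helpers, writing the files would come down to calling `write_text` on each path returned by `output_paths`, passing an explicit `encoding="utf-8"` so accented Portuguese text survives on Windows.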

Models

Model    Speed     Accuracy   RAM
tiny     fastest   lowest     ~1GB
base     fast      low        ~1GB
small    medium    medium     ~2GB
medium   slow      good       ~5GB
large    slowest   best       ~10GB

The default is medium, which gives the best balance of speed and accuracy for meeting transcription.
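
If memory is tight, the table above can be turned into a small lookup for choosing a model. This is only a sketch: the function name `largest_model_that_fits` is hypothetical, the figures are the approximate values from the table, and real memory use will vary with the machine and the length of the recording.

```python
# Approximate memory per model in GB, copied from the table above.
# The dict is ordered from least to most accurate.
MODEL_RAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}
DEFAULT_MODEL = "medium"


def largest_model_that_fits(available_gb: float):
    """Return the most accurate model whose listed RAM fits, or None."""
    fitting = [name for name, need in MODEL_RAM_GB.items() if need <= available_gb]
    return fitting[-1] if fitting else None
```

For example, a machine with about 4GB free would get `small` from this helper, and the result could then be passed to `--model`.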