Subtitle and Transcript Generation via Deepgram Nova-3
I built this tool to solve a persistent problem in my media library: hundreds of episodes missing subtitles. While Bazarr does an excellent job finding subtitles for most content, there are always gaps like obscure shows, older episodes, or content that doesn't have community-contributed subtitles available.
I looked around for options with free trials, but most offered only a couple of free hours and then required a subscription. Deepgram's $200 free signup credit was the best deal I could find. Their Nova-3 model produces high-quality transcriptions at ~$0.0043/minute, and adding keyterms (character names, locations, and show-specific terminology) dramatically improves accuracy for proper nouns that would otherwise be misrecognized. The result is subtitles that fill the gaps in your library without intensive manual correction. It's not perfect, but it's very useful for jargon-heavy dialogue.
Subgeneratorr is for media enthusiasts who care about complete subtitle coverage, accessibility, and having a polished library experience in Plex, Jellyfin, or Emby.
Disclaimer: This is a free and open-source project. Not affiliated with Deepgram, Anthropic, OpenAI, or any other service providers.
- 🎯 Nova-3 Transcription - Deepgram's flagship model with General and Medical variants
- 🔑 LLM-Enhanced Keyterms - AI-powered generation of character names and terminology (optional)
- 🗣️ Speaker Diarization - Identify speakers and create labeled transcripts
- 🌍 Multi-Language Support - 50+ languages with auto-detect, multilingual code-switching, and regional variants
- 🛡️ Content Control - Redaction (PCI/PII/numbers), profanity filtering, find & replace, dictation mode
- 🧠 Audio Intelligence - Sentiment analysis, summarization, topic/intent/entity detection, term search (English)
- 🐳 Docker-Based - Easy deployment with CLI and optional Web UI
- 📁 Flexible Processing - Batch process directories, specific files, or from lists
- 💰 Cost Tracking - Real-time estimates and detailed logs (~$0.0043/min)
- ⚡ Smart Skipping - Skip files that already have subtitles
- 📺 Media Server Ready - Auto-recognized by Plex, Jellyfin, Emby (.eng.srt format)
50+ languages with regional variants — English, Spanish, French, German, Japanese, Korean, Hindi, Russian, Portuguese, Arabic, and many more. Includes automatic language detection, multilingual code-switching, and keyterm prompting across all supported languages.
See the full language list and configuration guide for all supported languages and regional variants.
- Docker and Docker Compose (Linux | macOS | Windows)
- A Deepgram API key (Get $200 free credits)
- Media files (MKV, MP4, AVI, MOV, MP3, WAV, FLAC, etc.)
# Clone the repository
git clone https://github.com/tylerbcrawford/subgeneratorr.git
cd subgeneratorr
# Configure environment
cp .env.example .env
cp examples/docker-compose.example.yml docker-compose.yml
# Edit .env — set these two required values:
# DEEPGRAM_API_KEY=your_key_here
# MEDIA_PATH=/path/to/your/media
# Build and start
docker compose build
# Start Web UI
docker compose up -d
# Open http://localhost:5000
# OR run CLI directly
docker compose run --profile cli --rm cli

Process entire media library (CLI):
docker compose run --profile cli --rm cli

Process specific show/season:
docker compose run --profile cli --rm -e MEDIA_PATH=/media/tv/ShowName/Season\ 01 cli

Start the Web UI:
docker compose up -d
# Open http://localhost:5000

The Web UI provides a browser-based interface for remote management, batch processing, and AI-powered keyterm generation.
docker compose up -d

Access at http://localhost:5000 (or configure a reverse proxy for remote access)
- 🌐 Remote access from any device
- 📊 Real-time progress tracking with per-file status
- 🤖 AI Keyterm Generation with Claude, GPT, or Gemini (optional)
- 📁 Directory browser with search and file filtering
- 🔄 Bazarr integration for automatic subtitle rescans
- ⚡ Batch processing with parallel workers
- ⚙️ Full Nova-3 feature control — model selection, redaction, dictation, multichannel, Audio Intelligence, and more via collapsible Transcription Settings panel
Improve transcription accuracy by up to 90% for important terms like character names, locations, and show-specific jargon.
Create keyterms CSV:
# For TV shows (at show level)
/media/tv/Breaking Bad/Transcripts/Keyterms/Breaking Bad_keyterms.csv
# Format: one term per line
Walter White
Jesse Pinkman
Heisenberg
Los Pollos Hermanos

Or use AI generation (Web UI only):
- Select video → Click "Generate Keyterms"
- Supports Claude, GPT, and Gemini
- Costs ~$0.002-0.08 per generation depending on model
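
For the manual CSV route, the sketch below shows one way a one-term-per-line keyterms file could be turned into request parameters. It is an illustration only, not Subgeneratorr's actual code: the function names parse_keyterms and build_listen_query are hypothetical, and the query parameters (model, language, smart_format, and a repeated keyterm) follow Deepgram's public pre-recorded API, so they should be checked against the current API reference.

```python
from urllib.parse import urlencode


def parse_keyterms(text):
    """Return keyterms from one-term-per-line text, dropping blanks and duplicates."""
    terms = []
    seen = set()
    for line in text.splitlines():
        # Strip whitespace and a stray byte-order mark from spreadsheet exports
        term = line.strip().lstrip("\ufeff")
        if not term:
            continue
        key = term.lower()
        if key not in seen:  # case-insensitive de-duplication (an assumption)
            seen.add(key)
            terms.append(term)
    return terms


def build_listen_query(terms, language="en"):
    """Build a query string with one keyterm parameter per term (hypothetical helper)."""
    params = [("model", "nova-3"), ("language", language), ("smart_format", "true")]
    params += [("keyterm", term) for term in terms]
    return urlencode(params)


if __name__ == "__main__":
    sample = "Walter White\nJesse Pinkman\nHeisenberg\nLos Pollos Hermanos\n"
    print(build_listen_query(parse_keyterms(sample)))
```

Passing each term as its own keyterm parameter keeps multi-word names such as "Los Pollos Hermanos" intact instead of splitting them into separate words.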
Replace generic "Speaker 0", "Speaker 1" labels with character names in transcripts.
Create speaker map:
# At show level
/media/tv/Breaking Bad/Transcripts/Speakermap/speakers.csv
# CSV format
speaker_id,name
0,Walter White
1,Jesse Pinkman

Auto-detected when you enable transcript generation (ENABLE_TRANSCRIPT=1)
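
To make the relabeling concrete, here is a minimal sketch of how a speaker map in the CSV format above might be applied to a transcript. The helper names are hypothetical, and the assumption that transcript lines begin with labels like "Speaker 0:" is mine rather than something taken from the project's output format.

```python
import csv
import io
import re


def parse_speaker_map(csv_text):
    """Parse 'speaker_id,name' rows into a {speaker_id: name} dictionary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["speaker_id"]): row["name"].strip() for row in reader}


def apply_speaker_map(transcript, speaker_map):
    """Replace 'Speaker N' labels with names; unmapped IDs are left unchanged."""
    def swap(match):
        speaker_id = int(match.group(1))
        return speaker_map.get(speaker_id, match.group(0))

    # \d+ matches the whole number, so 'Speaker 10' is never treated as 'Speaker 1'
    return re.sub(r"\bSpeaker (\d+)\b", swap, transcript)


if __name__ == "__main__":
    mapping = parse_speaker_map("speaker_id,name\n0,Walter White\n1,Jesse Pinkman\n")
    print(apply_speaker_map("Speaker 0: Hello.\nSpeaker 1: Hi.", mapping))
```

Leaving unmapped speaker IDs untouched means a partial speaker map still produces a readable transcript.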
Key settings in .env:
| Variable | Description | Default |
|---|---|---|
| DEEPGRAM_API_KEY | Deepgram API key (required) | - |
| MEDIA_PATH | Media directory to scan | /media |
| LANGUAGE | Language code (see supported languages above) | en |
| ENABLE_TRANSCRIPT | Generate speaker-labeled transcripts | 0 |
| FORCE_REGENERATE | Regenerate existing subtitles | 0 |
| PROFANITY_FILTER | Filter mode: off, tag, or remove | off |
| ANTHROPIC_API_KEY | For AI keyterm generation (optional) | - |
| OPENAI_API_KEY | For AI keyterm generation (optional) | - |
| GEMINI_API_KEY | For AI keyterm generation (optional) | - |
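
As a rough illustration of how these settings and their defaults fit together, the sketch below reads them from an environment mapping. The load_settings function and the returned dictionary keys are hypothetical; only the variable names, defaults, and allowed PROFANITY_FILTER values come from the table above.

```python
import os


def load_settings(env=None):
    """Read the key settings from an environment mapping, applying the table's defaults."""
    env = os.environ if env is None else env

    api_key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not api_key:
        raise ValueError("DEEPGRAM_API_KEY is required")

    profanity = env.get("PROFANITY_FILTER", "off").strip().lower()
    if profanity not in {"off", "tag", "remove"}:
        raise ValueError("PROFANITY_FILTER must be off, tag, or remove")

    return {
        "api_key": api_key,
        "media_path": env.get("MEDIA_PATH", "/media"),
        "language": env.get("LANGUAGE", "en"),
        # Flags are treated as enabled only when set to "1" (an assumption)
        "enable_transcript": env.get("ENABLE_TRANSCRIPT", "0") == "1",
        "force_regenerate": env.get("FORCE_REGENERATE", "0") == "1",
        "profanity_filter": profanity,
    }


if __name__ == "__main__":
    print(load_settings({"DEEPGRAM_API_KEY": "your_key_here"}))
```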
Set MEDIA_PATH in your .env file to point to your media library:
| Platform | Example |
|---|---|
| Linux | MEDIA_PATH=/home/username/media |
| macOS | MEDIA_PATH=/Users/username/Movies |
| Windows | MEDIA_PATH=C:/Users/YourName/Videos |
Deepgram Nova-3 charges ~$0.0043 per minute of audio:
- 10-minute TV episode: ~$0.04
- 45-minute episode: ~$0.19
- 90-minute movie: ~$0.39
- 100 episodes (10 min each): ~$4.30
New users get $200 in free credits - enough for ~46,000 minutes (~767 hours) of transcription.
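
The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the approximate $0.0043/minute rate quoted above; the function names are made up for illustration, and actual billing depends on Deepgram's current pricing and the features enabled.

```python
RATE_PER_MINUTE = 0.0043  # approximate Nova-3 rate quoted above, in USD


def estimate_cost(minutes, rate=RATE_PER_MINUTE):
    """Estimated cost in USD for a given number of audio minutes, rounded to cents."""
    return round(minutes * rate, 2)


def minutes_covered(credit, rate=RATE_PER_MINUTE):
    """Whole minutes of audio that a credit balance covers at the given rate."""
    return int(credit / rate)


if __name__ == "__main__":
    for label, minutes in [("10-minute episode", 10), ("45-minute episode", 45),
                           ("90-minute movie", 90), ("100 x 10-minute episodes", 1000)]:
        print(f"{label}: ~${estimate_cost(minutes):.2f}")
    covered = minutes_covered(200)
    print(f"$200 credit: ~{covered} minutes (~{covered // 60} hours)")
```

At this rate, $200 works out to roughly 46,500 minutes (about 775 hours), which the figure above rounds down to ~46,000 minutes.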
- Technical Documentation - Architecture, API endpoints, advanced configuration
- Language Support Guide - Complete language matrix and multilingual features
- Project Roadmap - Future features and development plans
Generated .eng.srt files are automatically recognized by:
- Plex - Shows as "English (SRT External)"
- Jellyfin - Auto-detected with proper language tags
- Emby - Supports ISO-639-2 language codes
- Bazarr - Use as fallback when online subtitles unavailable
After generation, refresh your media library to detect new subtitles.
- Let Bazarr find subtitles for most content
- Run Subgeneratorr on your media directory (skips files with subtitles)
- Only processes files missing subtitles
- Refresh Plex/Jellyfin library
- Download new season via Sonarr/Radarr
- Run:
docker compose run --profile cli --rm -e MEDIA_PATH=/media/tv/ShowName/Season\ 01 cli
- Subtitles generated automatically
- Bazarr rescan triggers (if Web UI integration enabled)
- Create keyterms CSV with character names
- Create speaker map CSV
- Run with transcripts enabled:
docker compose run --profile cli --rm -e ENABLE_TRANSCRIPT=1 cli
- Get both .eng.srt subtitles and .transcript.speakers.txt files
Files are skipped if .eng.srt already exists. Use FORCE_REGENERATE=1 to reprocess.
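
The skip rule can be sketched as follows. This is a simplified illustration under the assumption that the subtitle sits next to the video and shares its base name; the helper names are hypothetical, and the real tool may apply additional checks.

```python
from pathlib import Path


def subtitle_path(video_path):
    """Sidecar subtitle path, e.g. Episode.mkv -> Episode.eng.srt in the same folder."""
    video = Path(video_path)
    return video.with_name(video.stem + ".eng.srt")


def should_process(video_path, force_regenerate=False):
    """Process a file if regeneration is forced or no .eng.srt sidecar exists yet."""
    if force_regenerate:
        return True
    return not subtitle_path(video_path).exists()


if __name__ == "__main__":
    print(subtitle_path("/media/tv/ShowName/Season 01/ShowName.S01E01.mkv"))
```

Using the file stem rather than replacing the suffix keeps dotted episode names such as ShowName.S01E01 intact.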
Set PUID and PGID in docker-compose.yml to match your user:
id -u # Get your UID
id -g # Get your GID

- Verify API key in .env
- Check account balance at Deepgram Console
- Ensure sufficient credits

- Check file location: {Show}/Transcripts/Keyterms/{ShowName}_keyterms.csv
- Verify UTF-8 encoding
- Ensure filename matches show directory name exactly
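
The keyterm checks above can be scripted. The sketch below is a hypothetical diagnostic, not part of Subgeneratorr: it builds the expected path from the show directory name, following the location pattern given above, and then confirms that the file exists and decodes as UTF-8.

```python
from pathlib import Path


def expected_keyterms_path(show_dir):
    """Expected location: {Show}/Transcripts/Keyterms/{ShowName}_keyterms.csv."""
    show = Path(show_dir)
    return show / "Transcripts" / "Keyterms" / f"{show.name}_keyterms.csv"


def diagnose_keyterms(show_dir):
    """Return 'ok' or a short description of the first problem found."""
    path = expected_keyterms_path(show_dir)
    if not path.is_file():
        return f"missing: {path}"
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return f"not valid UTF-8: {path}"
    return "ok"


if __name__ == "__main__":
    print(diagnose_keyterms("/media/tv/Breaking Bad"))
```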
Contributions welcome! Please feel free to submit issues or pull requests.
See CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see LICENSE for details.
- Deepgram - AI-powered speech recognition API
- Built with Deepgram Python SDK
- Uses deepgram-captions for SRT generation

