Releases: ringger/transcribe-critic
v1.2.0
What's New
- Default diarization: Speaker diarization is now on by default; use `--no-diarize` to skip
- Transcript summarization: New pipeline step generates a structured `summary.md` (overview, key points, speakers, notable quotes) after markdown generation
- Separate summary LLM backend: Summarization can use a different model and backend than adjudication via `--summary-model`, `--summary-api`, and `--summary-api-key` (e.g., use local Ollama for adjudication and Claude Opus for summaries)
- Speaker-aware summaries: Summarization prefers the diarized transcript when available, so speaker names appear in the summary
- Total: 575 tests
New CLI flags
| Flag | Description |
|---|---|
| `--no-diarize` | Disable speaker diarization (was opt-in with `--diarize`, now on by default) |
| `--no-summarize` | Skip transcript summarization |
| `--summary-model MODEL` | Use a different model for summaries |
| `--summary-api` | Use Anthropic API for summaries even if main LLM is local |
| `--summary-api-key KEY` | Separate API key for summarization |
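
To make the default-on behavior of the flags above concrete, here is a minimal `argparse` sketch. It is not the project's actual parser: the program name `transcribe-critic`, the `build_parser` helper, and the destination names are assumptions for illustration, and only the flags listed in the table are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags listed in the v1.2.0 table."""
    # The program name is an assumption; the release notes do not state it.
    parser = argparse.ArgumentParser(prog="transcribe-critic")

    # Diarization and summarization default to on; the --no-* flags switch them off.
    parser.add_argument("--no-diarize", dest="diarize", action="store_false",
                        help="Disable speaker diarization")
    parser.add_argument("--no-summarize", dest="summarize", action="store_false",
                        help="Skip transcript summarization")

    # Optional overrides that let summarization use a different model or backend.
    parser.add_argument("--summary-model", metavar="MODEL", default=None,
                        help="Use a different model for summaries")
    parser.add_argument("--summary-api", action="store_true",
                        help="Use the Anthropic API for summaries")
    parser.add_argument("--summary-api-key", metavar="KEY", default=None,
                        help="Separate API key for summarization")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--no-diarize", "--summary-api"])
    print(args.diarize, args.summarize, args.summary_api)
```

Using `store_false` with an explicit `dest` keeps the rest of the code reading positive booleans (`args.diarize`) while the command line exposes only the opt-out form.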
Pipeline
download → transcribe → ensemble → diarize → slides → merge → markdown → **summarize** → analysis
Full Changelog: v1.1.1...v1.2.0
v1.1.1
What's New
- Diarization bug fix: Fixed transcript segments not being reloaded after ensemble and in `_hydrate_data()`, which caused diarization to silently skip when using `--steps`
- `--title` CLI flag: Override the metadata title for cleaner output directory naming (useful for direct MP3 URLs)
- Pipeline transition tests: 11 new integration tests verifying data flow between pipeline stages (ensemble→diarize, hydrate, merge→markdown, slides→markdown)
- Total: 557 tests, 80% coverage
Full Changelog: v1.1.0...v1.1.1
v1.1.0
What's New
- Targeted diff adjudication: New ensemble strategy uses wdiff-based diff resolution instead of full-text LLM merging, surgically resolving only disagreements between Whisper models
- 3-way ensemble default: Ships with small + medium + distil-large-v3 for better coverage
- Anti-hallucination flags: Whisper runs include repetition detection and suppression
- DRY refactoring: Centralized constants, named pipeline stages, consistent skip logic
- Test coverage 60% → 80%: 81 new tests (546 total) covering transcriber, transcription, slides, and eval subpackage
- Eval harness: `transcribe-critic-eval` CLI for dataset prep, pipeline runs, and scoring with meeteval
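
The idea behind targeted diff adjudication (resolve only the spans where transcripts disagree, and copy everything else through unchanged) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it uses Python's `difflib.SequenceMatcher` in place of wdiff, compares two transcripts rather than three, and takes a hypothetical `resolve` callback where the real pipeline would consult an LLM.

```python
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical adjudicator type: given the two competing word spans, return the winner.
Resolver = Callable[[List[str], List[str]], List[str]]


def merge_with_targeted_adjudication(a: str, b: str, resolve: Resolver) -> str:
    """Merge two transcripts word by word, calling `resolve` only on disagreements."""
    words_a, words_b = a.split(), b.split()
    # autojunk=False disables difflib's popularity heuristic, which can
    # treat frequent words as junk on long inputs.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    merged: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Both sources agree: copy the words through without adjudication.
            merged.extend(words_a[i1:i2])
        else:
            # replace / insert / delete: hand only this span to the adjudicator.
            merged.extend(resolve(words_a[i1:i2], words_b[j1:j2]))
    return " ".join(merged)


def prefer_second(span_a: List[str], span_b: List[str]) -> List[str]:
    """Stand-in adjudicator that always trusts the second source."""
    return span_b


if __name__ == "__main__":
    print(merge_with_targeted_adjudication(
        "the quick brown fox jumps", "the quick brown box jumps", prefer_second))
```

Because the adjudicator sees only the disputed spans, the amount of text it has to process grows with the number of disagreements rather than with the transcript length.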
v1.0
Initial public release.
- Multi-model Whisper ensembling with LLM-based adjudication
- Critical text merging from 2–3+ transcript sources (Whisper, YouTube captions, external transcripts)
- Blind/anonymous source presentation to prevent provenance bias
- wdiff-based alignment for accurate chunking across sources of different lengths
- Structured transcript preservation (speaker labels, timestamps)
- Slide extraction and optional vision API analysis
- Make-style DAG pipeline with checkpoint resumption
- Cost estimation and local-only mode
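
As a rough illustration of what "Make-style" checkpoint resumption usually means, the sketch below reruns a stage only when its output file is missing or older than one of its inputs. The `is_stale` and `run_stage` helpers are hypothetical and do not come from the project; the actual pipeline may track checkpoints differently.

```python
from pathlib import Path
from typing import Callable, Iterable


def is_stale(output: Path, inputs: Iterable[Path]) -> bool:
    """Return True if `output` is missing or older than any input (Make's rule)."""
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime
    return any(inp.stat().st_mtime > out_mtime for inp in inputs)


def run_stage(name: str, inputs: Iterable[Path], output: Path,
              build: Callable[[], None]) -> bool:
    """Run `build` only when the checkpoint is stale; return whether it ran."""
    if not is_stale(output, list(inputs)):
        print(f"[skip] {name}: {output.name} is up to date")
        return False
    print(f"[run]  {name}")
    build()  # `build` is expected to write `output`
    return True
```

Chaining such stages in dependency order gives resumption for free: after an interruption, completed stages are skipped and work restarts at the first stale output.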