Releases: ringger/transcribe-critic

v1.2.0

26 Feb 06:58

What's New

  • Default diarization: Speaker diarization is now on by default — use --no-diarize to skip
  • Transcript summarization: New pipeline step generates a structured summary.md (overview, key points, speakers, notable quotes) after markdown generation
  • Separate summary LLM backend: Summarization can use a different model and backend than adjudication via --summary-model, --summary-api, and --summary-api-key — e.g., use local Ollama for adjudication and Claude Opus for summaries
  • Speaker-aware summaries: Summarization prefers the diarized transcript when available, so speaker names appear in the summary
  • Total: 575 tests

New CLI flags

Flag                     Description
--no-diarize             Disable speaker diarization (was opt-in with --diarize, now on by default)
--no-summarize           Skip transcript summarization
--summary-model MODEL    Use a different model for summaries
--summary-api            Use Anthropic API for summaries even if main LLM is local
--summary-api-key KEY    Separate API key for summarization
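
To make the defaults concrete, here is a minimal argparse sketch of how flags like these could be declared. It is an illustration only, not the project's actual parser: the `build_parser` helper, the `prog` name, and the destination attribute names are assumptions; only the flag names and their described behavior come from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the v1.2.0 flags (not the project's real code)."""
    # The program name is assumed from the repository name.
    parser = argparse.ArgumentParser(prog="transcribe-critic")
    # store_false makes the feature default to True; the flag switches it off.
    parser.add_argument("--no-diarize", dest="diarize", action="store_false",
                        help="Disable speaker diarization")
    parser.add_argument("--no-summarize", dest="summarize", action="store_false",
                        help="Skip transcript summarization")
    # Summary-specific overrides; None means no override was given.
    parser.add_argument("--summary-model", metavar="MODEL", default=None,
                        help="Use a different model for summaries")
    parser.add_argument("--summary-api", action="store_true",
                        help="Use Anthropic API for summaries even if main LLM is local")
    parser.add_argument("--summary-api-key", metavar="KEY", default=None,
                        help="Separate API key for summarization")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--no-diarize", "--summary-model", "some-model"])
    print(args.diarize, args.summarize, args.summary_model)
```

With no flags at all, both `diarize` and `summarize` come out true in this sketch, which matches the "on by default" behavior described in this release.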

Pipeline

download → transcribe → ensemble → diarize → slides → merge → markdown → summarize (new) → analysis
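
The stage order above can be written down as an ordered list, with a small helper that keeps any requested subset in pipeline order. This is a sketch under assumptions: `STAGES` and `select_stages` are hypothetical names, and how the tool itself parses `--steps` or applies the `--no-*` flags is not described in these notes.

```python
# Stage names as listed in the v1.2.0 pipeline line.
STAGES = [
    "download", "transcribe", "ensemble", "diarize", "slides",
    "merge", "markdown", "summarize", "analysis",
]


def select_stages(requested, skip=()):
    """Return the requested stages in pipeline order, minus any skipped ones.

    Hypothetical helper: it only illustrates ordering and skipping.
    """
    unknown = set(requested) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    return [stage for stage in STAGES if stage in requested and stage not in skip]


if __name__ == "__main__":
    # A full run with diarization and summarization switched off.
    print(select_stages(STAGES, skip={"diarize", "summarize"}))
```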

Full Changelog: v1.1.1...v1.2.0

v1.1.1

26 Feb 03:33

What's New

  • Diarization bug fix: Transcript segments are now reloaded after the ensemble step and in _hydrate_data(). Previously they were not, which caused diarization to be silently skipped when using --steps
  • --title CLI flag: Override the metadata title for cleaner output directory naming (useful for direct MP3 URLs)
  • Pipeline transition tests: 11 new integration tests verifying data flow between pipeline stages (ensemble→diarize, hydrate, merge→markdown, slides→markdown)
  • Total: 557 tests, 80% coverage

Full Changelog: v1.1.0...v1.1.1

v1.1.0

26 Feb 00:13

What's New

  • Targeted diff adjudication: New ensemble strategy uses wdiff-based diff resolution instead of full-text LLM merging, surgically resolving only disagreements between Whisper models
  • 3-way ensemble default: Ships with small + medium + distil-large-v3 for better coverage
  • Anti-hallucination flags: Whisper runs include repetition detection and suppression
  • DRY refactoring: Centralized constants, named pipeline stages, consistent skip logic
  • Test coverage 60% → 80%: 81 new tests (546 total) covering transcriber, transcription, slides, and eval subpackage
  • Eval harness: transcribe-critic-eval CLI for dataset prep, pipeline runs, and scoring with meeteval
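
The idea behind targeted diff adjudication can be sketched in a few lines: align two transcripts word by word and collect only the spans where they disagree, so that an LLM would be asked about those spans instead of the full text. The release notes name wdiff as the diff tool; the sketch below swaps in Python's standard `difflib` to stay self-contained, and `find_disagreements` is a hypothetical name, not a function from the project.

```python
import difflib


def find_disagreements(text_a: str, text_b: str):
    """Return (span_a, span_b) pairs for word spans where two transcripts differ.

    Uses difflib as a stand-in for wdiff; an empty string on one side means
    the words were present in only one transcript.
    """
    words_a, words_b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":  # replace, insert, or delete
            disagreements.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return disagreements


if __name__ == "__main__":
    small = "the quick brown fox jumps"
    medium = "the quick brown box jumps"
    for span_a, span_b in find_disagreements(small, medium):
        print(f"A: {span_a!r}  B: {span_b!r}")
```

Only the differing spans would go to the adjudicator in this scheme; text on which the models already agree passes through unchanged.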

v1.0

21 Feb 07:13

Initial public release.

  • Multi-model Whisper ensembling with LLM-based adjudication
  • Critical text merging from 2–3+ transcript sources (Whisper, YouTube captions, external transcripts)
  • Blind/anonymous source presentation to prevent provenance bias
  • wdiff-based alignment for accurate chunking across sources of different lengths
  • Structured transcript preservation (speaker labels, timestamps)
  • Slide extraction and optional vision API analysis
  • Make-style DAG pipeline with checkpoint resumption
  • Cost estimation and local-only mode
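
Blind source presentation, as listed above, can be illustrated with a short sketch: shuffle the sources, relabel them with neutral names, and keep a private key for mapping the adjudicator's choice back. Everything here is an assumption made for illustration: the `anonymize_sources` helper, the "Source A" label style, and the example source names do not come from the project.

```python
import random
import string


def anonymize_sources(sources: dict[str, str], seed=None):
    """Shuffle sources and relabel them so provenance is hidden from the adjudicator.

    Returns (blinded, key): blinded maps neutral labels to transcript text,
    key maps the same labels back to the original source names.
    Supports up to 26 sources, one per uppercase letter.
    """
    rng = random.Random(seed)
    names = list(sources)
    rng.shuffle(names)
    labels = [f"Source {letter}" for letter in string.ascii_uppercase[:len(names)]]
    blinded = {label: sources[name] for label, name in zip(labels, names)}
    key = dict(zip(labels, names))  # kept out of the prompt
    return blinded, key


if __name__ == "__main__":
    transcripts = {
        "whisper": "text from Whisper",
        "youtube": "text from captions",
        "external": "text from an external transcript",
    }
    blinded, key = anonymize_sources(transcripts)
    for label, text in blinded.items():
        print(label, "->", text)
```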