scribe

State-of-the-art local audio transcription with speaker diarization for macOS.

100% local. No cloud. No API keys. No data leaves your machine.

Features

Transcription — Accurate speech-to-text powered by NVIDIA Parakeet TDT v3 (via FluidAudio CoreML)
Speaker diarization — Identify who said what, powered by pyannote (via FluidAudio CoreML)
Apple Silicon optimized — Runs on CoreML and the Apple Neural Engine at roughly 130x real-time for transcription alone (about 30x with diarization)
Multiple output formats — Plain text, JSON (with word timestamps), SRT, VTT
25 European languages — English, Spanish, French, German, Italian, Portuguese, Russian, and more
Fast — Transcribes a 4-minute recording in under 2 seconds

Install

brew install theam/tap/scribe

Or build from source:

git clone https://github.com/theam/scribe.git
cd scribe
swift build -c release
cp .build/release/scribe /usr/local/bin/

Usage

Basic transcription

scribe transcribe meeting.wav

With speaker diarization

scribe transcribe meeting.wav --diarize

Specify number of speakers (recommended for best diarization)

scribe transcribe meeting.wav --diarize --speakers 4

Tip: Passing the expected number of speakers with --speakers significantly improves diarization accuracy. Without it, automatic speaker count detection works well for most recordings, but it may slightly over- or under-segment when voices are similar. If you know how many people were in the meeting, always pass --speakers.

Output formats

scribe transcribe meeting.wav --format txt    # plain text (default)
scribe transcribe meeting.wav --format json   # structured JSON with word timestamps
scribe transcribe meeting.wav --format srt    # SRT subtitles
scribe transcribe meeting.wav --format vtt    # WebVTT subtitles

Save to file

scribe transcribe meeting.wav --format json --output transcript.json

Force a language

scribe transcribe meeting.wav --language es    # Spanish
scribe transcribe meeting.wav --language fr    # French
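
For scripted use, the flags shown above can be assembled programmatically. The following is a minimal Python sketch, not part of scribe itself: the helper names build_scribe_command and run_scribe are hypothetical, and the sketch assumes that the documented flags can be freely combined and that scribe writes the transcript to standard output when --output is not given.

```python
import subprocess
from typing import List, Optional

# Formats listed in the "Output formats" section of this README.
KNOWN_FORMATS = {"txt", "json", "srt", "vtt"}


def build_scribe_command(
    audio_path: str,
    diarize: bool = False,
    speakers: Optional[int] = None,
    fmt: Optional[str] = None,
    output: Optional[str] = None,
    language: Optional[str] = None,
) -> List[str]:
    """Assemble a `scribe transcribe` argument list (hypothetical helper)."""
    cmd = ["scribe", "transcribe", audio_path]
    if diarize:
        cmd.append("--diarize")
    if speakers is not None:
        cmd += ["--speakers", str(speakers)]
    if fmt is not None:
        if fmt not in KNOWN_FORMATS:
            raise ValueError(f"unsupported format: {fmt}")
        cmd += ["--format", fmt]
    if output is not None:
        cmd += ["--output", output]
    if language is not None:
        cmd += ["--language", language]
    return cmd


def run_scribe(cmd: List[str]) -> str:
    """Run the command; assumes the transcript goes to stdout (not confirmed)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the argument list; nothing is executed here.
    print(build_scribe_command("meeting.wav", diarize=True, speakers=4, fmt="json"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.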

Pre-download models

scribe models download all    # download ASR + diarization models for offline use

Output Examples

Plain text with diarization

[00:03] Speaker 1: Hello, how are you?
[00:06] Speaker 1: I forgot a few points.
[00:27] Speaker 2: Let's see if Claude is right about you.
[00:32] Speaker 3: Oh my gosh, here comes the song. My favorite.
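
If you need to post-process the plain-text output, lines of this shape can be split with a regular expression. This Python sketch assumes every line follows the `[MM:SS] Speaker N: text` pattern shown in the example above; the format for recordings longer than an hour is not documented here, so the pattern may need adjusting. The names TranscriptLine and parse_transcript are illustrative, not part of scribe.

```python
import re
from typing import List, NamedTuple


class TranscriptLine(NamedTuple):
    seconds: int   # offset from the start of the recording
    speaker: str
    text: str


# Assumed layout, inferred from the example output: [MM:SS] Speaker N: text
LINE_RE = re.compile(r"^\[(\d+):(\d{2})\]\s+(.+?):\s+(.*)$")

SAMPLE = """\
[00:03] Speaker 1: Hello, how are you?
[00:06] Speaker 1: I forgot a few points.
[00:27] Speaker 2: Let's see if Claude is right about you.
[00:32] Speaker 3: Oh my gosh, here comes the song. My favorite.
"""


def parse_transcript(raw: str) -> List[TranscriptLine]:
    """Parse diarized plain-text output, skipping lines that do not match."""
    parsed = []
    for row in raw.splitlines():
        match = LINE_RE.match(row.strip())
        if match is None:
            continue
        minutes, secs, speaker, text = match.groups()
        parsed.append(TranscriptLine(int(minutes) * 60 + int(secs), speaker, text))
    return parsed


if __name__ == "__main__":
    for line in parse_transcript(SAMPLE):
        print(line.seconds, line.speaker, line.text)
```

For anything beyond quick scripts, the JSON format is likely the more robust input, since it carries explicit start and end times.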

JSON

{
  "metadata": {
    "duration": 226.1,
    "diarization": true
  },
  "segments": [
    {
      "start": 3.2,
      "end": 4.7,
      "text": "Hello, how are you?",
      "speaker": "Speaker 1",
      "words": [
        { "start": 3.2, "end": 3.6, "text": "Hello," },
        { "start": 3.6, "end": 3.9, "text": "how" },
        { "start": 3.9, "end": 4.2, "text": "are" },
        { "start": 4.2, "end": 4.7, "text": "you?" }
      ]
    }
  ]
}
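
The JSON structure above is straightforward to consume from a script. The Python sketch below totals speaking time per speaker and pulls out the words inside a time window. It relies only on the fields visible in the example; the function names are illustrative, and the fallback for a missing "speaker" key is an assumption about what non-diarized output looks like.

```python
import json
from collections import defaultdict
from typing import Dict, List

# The example document from this README, embedded so the sketch is self-contained.
SAMPLE = """
{
  "metadata": {"duration": 226.1, "diarization": true},
  "segments": [
    {
      "start": 3.2, "end": 4.7,
      "text": "Hello, how are you?",
      "speaker": "Speaker 1",
      "words": [
        {"start": 3.2, "end": 3.6, "text": "Hello,"},
        {"start": 3.6, "end": 3.9, "text": "how"},
        {"start": 3.9, "end": 4.2, "text": "are"},
        {"start": 4.2, "end": 4.7, "text": "you?"}
      ]
    }
  ]
}
"""


def speaking_time(transcript: dict) -> Dict[str, float]:
    """Sum segment durations (in seconds) per speaker label."""
    totals: Dict[str, float] = defaultdict(float)
    for segment in transcript["segments"]:
        # Assumption: "speaker" may be absent when diarization is off.
        speaker = segment.get("speaker", "unknown")
        totals[speaker] += segment["end"] - segment["start"]
    return dict(totals)


def words_between(transcript: dict, start: float, end: float) -> List[str]:
    """Return words that fall entirely inside the [start, end] window."""
    return [
        word["text"]
        for segment in transcript["segments"]
        for word in segment.get("words", [])
        if word["start"] >= start and word["end"] <= end
    ]


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    print(speaking_time(data))
    print(words_between(data, 3.5, 4.3))
```

To use it on real output, replace SAMPLE with the contents of a file written via --format json --output transcript.json.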

Performance

Tested on Apple Silicon (M-series):

Task                          Speed             Example
Transcription only            ~130x real-time   4-min file in 1.7s
Transcription + diarization   ~30x real-time    4-min file in 7.5s
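
The speed figures are real-time factors: audio duration divided by processing time. The Python sketch below shows the arithmetic and uses it to estimate processing time for longer files. It assumes the benchmark file is the 226.1-second recording from the JSON example (the README does not state this), and that throughput scales linearly with duration, which may not hold for very long recordings.

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    return audio_seconds / wall_seconds


def estimated_seconds(audio_seconds: float, factor: float) -> float:
    """Rough processing-time estimate, assuming the factor stays constant."""
    return audio_seconds / factor


if __name__ == "__main__":
    duration = 226.1  # assumed benchmark length, taken from the JSON example
    print(round(realtime_factor(duration, 1.7)))     # transcription only
    print(round(realtime_factor(duration, 7.5), 1))  # transcription + diarization
    # One hour of audio at the two approximate factors from the table.
    print(round(estimated_seconds(3600, 130), 1))
    print(round(estimated_seconds(3600, 30), 1))
```

Under these assumptions, an hour-long recording would take on the order of half a minute to transcribe and about two minutes with diarization.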

Models are downloaded automatically on first use (~600MB for ASR, ~50MB for diarization).

Requirements

macOS 14 (Sonoma) or later
Apple Silicon (M1 or later)

Supported Languages

Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Ukrainian.

Acknowledgments

scribe is built on the shoulders of excellent open-source projects:

NVIDIA Parakeet (CC-BY-4.0) — The speech recognition model that powers transcription
FluidAudio (Apache 2.0) by FluidInference — CoreML speech processing SDK for Apple Silicon
pyannote.audio (MIT) by Hervé Bredin — The diarization model architecture
swift-argument-parser (Apache 2.0) by Apple — CLI argument parsing
