Local Audio Transcription Automation (whisper.cpp + macOS Shortcuts)

This project provides a fast, reliable workflow for transcribing audio and video files locally on macOS using whisper.cpp and ffmpeg. It includes:

a hardened shell script (transcribe.sh) that handles real-world audio issues
an optional macOS Shortcut (Whisper Transcribe.shortcut) for drag-and-drop or Finder-menu automation
support for automatic format conversion, silence handling, and duplicate-line reduction

It is designed for personal workflows, research, and offline transcription without relying on cloud services.

Features

✔ Robust audio handling

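Real-world files come in every format: .m4a, .mov, .wav, .mp3, .flac. The script passes formats that whisper.cpp reads directly (.wav, .flac, .mp3, .ogg) straight through and converts everything else to 16 kHz mono WAV with ffmpeg, so whisper.cpp always receives input it can decode.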

✔ Silence trimming + reduced duplicate lines

Whisper’s overlapping context windows can produce repeated lines when a recording contains long silences. To reduce this, the script uses:

VAD mode (--vad)
Silero VAD model (--vad-model)
entropy threshold tuning (--entropy-thold 2.8)
zero context reuse (--max-context 0)

Together, these settings noticeably reduce repeated lines in the transcript.

✔ Automatic, non-colliding output filenames

Every transcript is saved to the Desktop. If a file with the same name already exists, an incrementing suffix is added to the new file's name.
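
The naming logic can be sketched roughly as below. This is only an illustration: the function name unique_path and the "-2", "-3" suffix style are assumptions, and transcribe.sh may format its suffixes differently.

```shell
#!/usr/bin/env bash
# Sketch only: unique_path and the "-N" suffix style are hypothetical,
# not necessarily what transcribe.sh does.
unique_path() {
  local dir="$1" base="$2" ext="$3"
  local candidate="$dir/$base.$ext"
  local n=2
  # Bump the counter until no file with that name exists.
  while [ -e "$candidate" ]; do
    candidate="$dir/$base-$n.$ext"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}
```

Calling unique_path "$HOME/Desktop" interview txt prints the first free path for that base name. The check-then-write approach is not race-free, which is usually acceptable for a single-user desktop workflow.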

✔ macOS Shortcut integration

Included: shortcuts/Whisper Transcribe.shortcut

Use it from Finder, the Services menu, or as part of a larger Shortcuts automation.

Repository Structure

.
├── transcribe.sh
├── README.md
├── .gitignore
└── shortcuts/
    ├── Whisper Transcribe.shortcut
    └── Whisper Transcribe.png

Requirements

macOS

Install the required tools with Homebrew:

brew install ffmpeg
brew install whisper-cpp

Models

Download whisper.cpp models to:

~/.cache/whisper.cpp/models/

For example:

ggml-large-v3-turbo.bin
ggml-silero-v5.1.2.bin

The script defaults to large-v3-turbo unless a different model name is passed as the second argument.
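
A minimal sketch of how a model name might be turned into a file path, failing early when the file is missing. The names resolve_model and MODEL_DIR are hypothetical, and the ggml-<name>.bin mapping is an assumption inferred from the example file names above.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_model and MODEL_DIR are hypothetical names.
MODEL_DIR="${MODEL_DIR:-$HOME/.cache/whisper.cpp/models}"

resolve_model() {
  # Fall back to the default model when no name (or an empty one) is given.
  local name="${1:-large-v3-turbo}"
  # Assumed naming scheme: ggml-<name>.bin inside the models directory.
  local path="$MODEL_DIR/ggml-$name.bin"
  if [ ! -f "$path" ]; then
    echo "model not found: $path" >&2
    return 1
  fi
  printf '%s\n' "$path"
}
```

For example, resolve_model medium.en prints the path to ggml-medium.en.bin if that file has been downloaded, and returns a non-zero status otherwise.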

Using the macOS Shortcut

Open the .shortcut file at:
```
shortcuts/Whisper Transcribe.shortcut
```
macOS will prompt you to import it.
The Shortcut will call transcribe.sh with the selected file.
The transcript is written to your Desktop automatically.

You can trigger it by:

right-clicking a file in Finder → Quick Actions
typing the Shortcut's name into Spotlight
calling it from another Shortcuts automation

Using the Script Directly

Basic usage

./transcribe.sh /path/to/audio.m4a

Override model

./transcribe.sh input.wav medium.en

Transcripts are saved as:

~/Desktop/<filename>.txt

If that name is already taken, an incrementing suffix is added so the new transcript gets its own file.
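
To transcribe several files in one go, a small wrapper along these lines may help. It is a sketch, not part of the repository: transcribe_all and the TRANSCRIBE variable are hypothetical names, and the wrapper only assumes the two-argument interface shown above (input file, optional model name).

```shell
#!/usr/bin/env bash
# Sketch only: transcribe_all and TRANSCRIBE are hypothetical names.
TRANSCRIBE="${TRANSCRIBE:-./transcribe.sh}"   # assumed location of the script

transcribe_all() {
  # First argument: model name ("" keeps the script's default).
  # Remaining arguments: input files.
  local model="$1" failed=0 f
  shift
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "skipping (not a file): $f" >&2
      failed=$((failed + 1))
    elif [ -n "$model" ]; then
      "$TRANSCRIBE" "$f" "$model" || failed=$((failed + 1))
    else
      "$TRANSCRIBE" "$f" || failed=$((failed + 1))
    fi
  done
  # Succeed only if every file was transcribed.
  [ "$failed" -eq 0 ]
}
```

For example, transcribe_all "" ~/Recordings/*.m4a runs the script once per file with the default model, and transcribe_all medium.en a.wav b.mov overrides the model for both files.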

How It Works

Format normalization

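# Extension of the input file (text after the last dot)
EXT="${IN##*.}"
case "$EXT" in
  wav|flac|mp3|ogg)
    # Handed to whisper.cpp as-is
    run_whisper "$IN"
    ;;
  *)
    # Everything else: convert to 16 kHz mono WAV and stream it to whisper.cpp on stdin
    ffmpeg -i "$IN" -ar 16000 -ac 1 -f wav - | run_whisper -
    ;;
esac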
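
One thing to watch for: the case statement above matches extensions case-sensitively, so a file named talk.WAV would take the ffmpeg branch. That still works, but if you would rather treat upper- and lowercase extensions alike, a helper such as the following could be used. The name needs_conversion is hypothetical and not part of transcribe.sh.

```shell
#!/usr/bin/env bash
# Sketch only: needs_conversion is a hypothetical helper.
# Returns 0 (true) when the file should go through ffmpeg first.
needs_conversion() {
  local file="$1"
  local ext="${file##*.}"
  # Lowercase with tr, since the bash 3.2 shipped with macOS lacks ${var,,}.
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    wav|flac|mp3|ogg) return 1 ;;   # passed to whisper.cpp as-is
    *)                return 0 ;;   # everything else is converted
  esac
}
```

It could then be used as: if needs_conversion "$IN"; then ... ffmpeg branch ...; else run_whisper "$IN"; fi.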

Preventing duplicate lines

--vad \
--vad-model "$VADMODEL" \
--entropy-thold 2.8 \
--max-context 0

These settings reduce Whisper’s tendency to repeat segments when long silences are present.
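
Put together, a full invocation might look like the sketch below. It assumes the binary installed by the Homebrew whisper-cpp formula is named whisper-cli, and it adds -m, -f, -otxt, and -of (model, input, text output, output base name without extension), which are not shown in the excerpt above. The function name run_whisper_sketch and the WHISPER_BIN variable are hypothetical, and the real run_whisper in transcribe.sh may differ.

```shell
#!/usr/bin/env bash
# Sketch only: run_whisper_sketch and WHISPER_BIN are hypothetical names.
WHISPER_BIN="${WHISPER_BIN:-whisper-cli}"   # assumed binary name

run_whisper_sketch() {
  local model="$1" vad_model="$2" input="$3" out_base="$4"
  # Model and input first, then the four duplicate-reduction settings,
  # then plain-text output written to "$out_base.txt".
  "$WHISPER_BIN" \
    -m "$model" \
    -f "$input" \
    --vad \
    --vad-model "$vad_model" \
    --entropy-thold 2.8 \
    --max-context 0 \
    -otxt \
    -of "$out_base"
}
```

Quoting each variable keeps paths with spaces (common on macOS, for example "Voice Memo.m4a") intact as single arguments.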

Why This Exists

This workflow evolved from solving real issues with whisper.cpp:

inconsistent audio formats causing errors
duplicate lines from overlapping-context handling
background noise and silence introducing artifacts
wanting a fast, local, repeatable transcription workflow
needing drag-and-drop convenience via macOS Shortcuts

The result is a reliable, automated transcription system you can run completely offline.
