ausum - Audio Summarization

Automatically transcribe and summarize any audio or video file using local AI (FluidAudio Parakeet) + Claude. Works with YouTube videos, podcasts, recordings, meetings, lectures - any audio content.

This is a tool built for macOS.

Why?

Reading is often faster than watching. For certain types of videos, I find it quicker to read a detailed summary than to watch the video, even at increased playback speed.

Features

Local speech-to-text using FluidAudio's Parakeet model (600M parameters, 25 European languages)
Automatic summarization with Claude (falls back to pi if claude is unavailable or not logged in)
Privacy-first - all transcription runs locally on your Mac
Simple CLI - one command to get transcript + summary
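The Claude-to-pi fallback described above could be sketched roughly as follows. This is an illustration, not ausum's actual code: `pick_summarizer` is a hypothetical name, the executables are assumed to be called `claude` and `pi`, and the sketch only checks whether a CLI is on PATH. It does not try to detect the "not logged in" case.

```python
import shutil
from typing import Callable, Optional


def pick_summarizer(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the first summarizer CLI found on PATH, preferring claude.

    `which` is injectable so the lookup can be exercised without the
    real tools being installed.
    """
    for name in ("claude", "pi"):  # assumed executable names, in priority order
        if which(name):
            return name
    raise RuntimeError("Neither claude nor pi was found on PATH")


if __name__ == "__main__":
    try:
        print("Using summarizer:", pick_summarizer())
    except RuntimeError as err:
        print(err)
```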

Prerequisites

Install required tools:

# Command-line tools via Homebrew (one-time setup)
brew install yt-dlp ffmpeg

# Claude CLI (recommended)
# Follow: https://docs.anthropic.com/claude-cli

# OR pi (used as automatic fallback if claude is unavailable or not logged in)
# Follow: https://github.com/mariozechner/pi

# FluidAudio (build from source)
git clone https://github.com/FluidInference/FluidAudio.git
cd FluidAudio
swift build -c release

Set environment variable:

# Add to ~/.zshrc or ~/.bashrc
export FLUIDAUDIO_PATH=~/path/to/FluidAudio
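To confirm the setup before a first run, a small check along these lines may help. It is a sketch under assumptions: `missing_prerequisites` is a hypothetical helper, and it only verifies that `yt-dlp` and `ffmpeg` are on PATH and that `FLUIDAUDIO_PATH` points at an existing directory. It does not verify that FluidAudio was actually built.

```python
import os
import shutil
from pathlib import Path

REQUIRED_TOOLS = ("yt-dlp", "ffmpeg")


def missing_prerequisites(env=os.environ, which=shutil.which):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    raw = env.get("FLUIDAUDIO_PATH")
    if not raw:
        problems.append("FLUIDAUDIO_PATH is not set")
    elif not Path(raw).expanduser().is_dir():
        problems.append(f"FLUIDAUDIO_PATH is not a directory: {raw}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites() or ["All prerequisites look OK"]:
        print(problem)
```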

Installation

# Clone this repo
git clone https://github.com/roybotbot/ausum.git
cd ausum

# Install with pip
pip install .

# Or with pipx (recommended)
pipx install .

Usage

# YouTube videos
ausum "https://www.youtube.com/watch?v=VIDEO_ID"

# YouTube URL that includes a playlist (only the single video is processed)
ausum "https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID"

# Local audio/video files
ausum /path/to/video.mp4
ausum ~/Downloads/podcast.mp3
ausum ./recording.wav

# Override the saved output directory for a single run
ausum "https://www.youtube.com/watch?v=VIDEO_ID" -d ~/my-transcripts

# Open summary in mdv after creation
ausum "https://www.youtube.com/watch?v=VIDEO_ID" --read

Supported formats: Any audio or video format that ffmpeg can read (mp4, mp3, wav, m4a, webm, mkv, avi, flac, ogg, etc.)
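The usage examples accept either a URL or a local path, and playlist URLs are reduced to a single video. One way that behaviour could be approximated is sketched below. The helper names `is_url` and `single_video_url` are hypothetical, and dropping the `list` and `index` query parameters is an assumption about how the reduction might be done. ausum may instead rely on yt-dlp's `--no-playlist` option.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def is_url(source: str) -> bool:
    """Treat http(s) inputs as remote; everything else as a local path."""
    return urlsplit(source).scheme in ("http", "https")


def single_video_url(url: str) -> str:
    """Drop playlist-related query parameters, keeping the video itself."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    kept = {k: v for k, v in query.items() if k not in ("list", "index")}
    return urlunsplit(parts._replace(query=urlencode(kept, doseq=True)))


if __name__ == "__main__":
    src = "https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID"
    print(single_video_url(src) if is_url(src) else src)
```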

Output files:

<video-title>.txt or <filename>.txt - Full transcript
<video-title>-summary.md or <filename>-summary.md - Structured summary
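The naming scheme above can be expressed in a few lines. This is a minimal sketch with hypothetical helper names. It assumes a local file contributes its name without the extension, and it does not attempt any sanitising of video titles, which the real tool may well do.

```python
from pathlib import Path
from typing import Tuple


def local_stem(path: str) -> str:
    """File name without its final extension, e.g. 'podcast' for podcast.mp3."""
    return Path(path).stem


def output_names(stem: str) -> Tuple[str, str]:
    """Return (transcript file name, summary file name) for a title or file stem."""
    return f"{stem}.txt", f"{stem}-summary.md"


if __name__ == "__main__":
    print(output_names(local_stem("~/Downloads/podcast.mp3")))
```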

First Run

On your first run, ausum will:

Ask where summaries should be saved (defaults to ~/Documents if it exists)
Ask where transcripts should be saved (press Enter to use the same directory as summaries)
Ask whether to save transcript .txt files at all
Save preferences to ~/.config/ausum/config.json
Download the Parakeet model from Hugging Face on first transcription (a one-time download of several hundred MB; see Model Storage)

Subsequent runs use your saved preferences. You can always override the output directory for a single run with -d.
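The precedence between saved preferences and the -d override could look something like the sketch below. Only the `-d` and `--read` flags come from this README. The parser layout, the `directory` attribute name, and `resolve_summary_dir` are assumptions made for illustration, and whether -d also redirects transcripts is not specified here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ausum")
    parser.add_argument("source", help="URL or path to an audio/video file")
    parser.add_argument("-d", dest="directory", default=None,
                        help="output directory for this run only")
    parser.add_argument("--read", action="store_true",
                        help="open the summary after it is created")
    return parser


def resolve_summary_dir(args: argparse.Namespace, config: dict) -> str:
    """A one-off -d value wins; otherwise fall back to the saved preference."""
    return args.directory or config["summary_dir"]


if __name__ == "__main__":
    saved = {"summary_dir": "/path/to/summaries"}
    args = build_parser().parse_args(["./recording.wav", "-d", "/tmp/out"])
    print(resolve_summary_dir(args, saved))
```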

Configuration

Preferences are stored in ~/.config/ausum/config.json. You can edit it directly to change settings without re-running the setup prompt:

{
  "summary_dir": "/path/to/summaries",
  "transcript_dir": "/path/to/transcripts",
  "save_transcript": true
}

summary_dir — where .md summary files are saved
transcript_dir — where .txt transcript files are saved (optional; if omitted, uses summary_dir)
save_transcript — set to false to skip saving the raw transcript
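For scripts that want to read the same preferences, a loader that applies the documented fallback (transcript_dir defaults to summary_dir) might look like this. It is a sketch, not ausum's own loader: `load_config` is a hypothetical name, and treating a missing save_transcript key as true is an assumption.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "ausum" / "config.json"


def load_config(path=CONFIG_PATH) -> dict:
    """Read config.json and fill in the optional keys."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    summary_dir = Path(raw["summary_dir"]).expanduser()
    # transcript_dir is optional; fall back to summary_dir when omitted or empty.
    transcript_dir = Path(raw.get("transcript_dir") or summary_dir).expanduser()
    return {
        "summary_dir": summary_dir,
        "transcript_dir": transcript_dir,
        # Assumption: transcripts are saved unless explicitly disabled.
        "save_transcript": bool(raw.get("save_transcript", True)),
    }


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        print(load_config())
    else:
        print(f"No config found at {CONFIG_PATH}")
```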

Model Storage

The Parakeet model (~460MB) is cached in ~/Library/Application Support/FluidAudio/Models/ and persists across ausum updates. It is NOT deleted when you reinstall ausum with pipx - the cache is managed by FluidAudio, not ausum.

If you need to free up disk space, you can manually delete the cache:

rm -rf ~/Library/Application\ Support/FluidAudio/Models/parakeet*

The model will be re-downloaded on next use.
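Before deleting anything, it can be useful to see how much space the cache actually takes. The sketch below totals file sizes under the cache directory named above. `dir_size_bytes` is a hypothetical helper, and the directory layout beyond that top-level path is not assumed.

```python
from pathlib import Path

MODEL_CACHE = Path.home() / "Library" / "Application Support" / "FluidAudio" / "Models"


def dir_size_bytes(root) -> int:
    """Sum the sizes of all regular files under root; 0 if root is missing."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    size_mb = dir_size_bytes(MODEL_CACHE) / (1024 * 1024)
    print(f"{MODEL_CACHE}: {size_mb:.1f} MB")
```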

Summary Format

Summaries follow the structure defined in transcript-summary.md:

Major sections with short headers
Concise bullet points of key points
Step-by-step instructions (if applicable)
Next steps for learning more

License

MIT - See LICENSE file
