NVIDIA Parakeet TDT 0.6B v3 - Local ASR Service

Local speech recognition service based on nvidia/parakeet-tdt-0.6b-v3 with an OpenAI-compatible transcription API.

Why this repo

  • Local-first ASR (no cloud required)
  • OpenAI Whisper-compatible endpoint (/v1/audio/transcriptions)
  • CPU by default, GPU optional on Linux
  • Works on:
    • Ubuntu (tested)
    • macOS Intel (supported)
    • macOS Apple Silicon (supported, uses CPU/MPS via PyTorch)

Quick start

git clone https://github.com/rundax/parakeet-asr.git
cd parakeet-asr
./setup.sh
./start-parakeet.sh

Health check:

curl http://localhost:9001/health

Stop service:

./stop-parakeet.sh

API

OpenAI-compatible transcription

curl -X POST http://localhost:9001/v1/audio/transcriptions \
  -F "file=@audio.mp3" \
  -F "model=parakeet-tdt-0.6b-v3"
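The same request can be made from Python without extra dependencies. The following is a minimal sketch using only the standard library; the URL and model name come from the curl example above, while the helper names `build_multipart` and `transcribe` are illustrative and not part of this repo. It assumes the service answers with JSON, as OpenAI's transcription endpoint does.

```python
import json
import uuid
import urllib.request
from pathlib import Path

# Endpoint and model name are taken from the curl example above.
URL = "http://localhost:9001/v1/audio/transcriptions"
MODEL = "parakeet-tdt-0.6b-v3"


def build_multipart(fields, filename, file_bytes, boundary):
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines)


def transcribe(path, url=URL, model=MODEL, timeout=600):
    """POST an audio file and return the decoded JSON response.

    Assumption: the service replies with a JSON body.
    """
    boundary = uuid.uuid4().hex
    audio = Path(path)
    body = build_multipart({"model": model}, audio.name, audio.read_bytes(), boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the service running, `transcribe("audio.mp3")` returns the parsed response. If the reply follows OpenAI's format, the transcript should be under the `"text"` key, but that is an assumption to verify against a real response.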

Health

curl http://localhost:9001/health
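Because the first launch may spend some time downloading and loading the model, a script that starts the service may want to wait for the health endpoint before sending audio. The sketch below polls it with the standard library. The function names, retry count, and delay are illustrative choices, and treating HTTP 200 as "healthy" is an assumption about how the endpoint behaves.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:9001/health"  # from the curl example above


def is_healthy(url=HEALTH_URL, timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200 (assumed meaning: ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe=is_healthy, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_ready()` after `./start-parakeet.sh` returns `True` once the endpoint responds, or `False` after roughly a minute with the defaults shown.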

OS notes

Ubuntu / Linux

setup.sh installs system dependencies with whichever of apt, dnf, yum, or pacman is available.

  • Default install: CPU PyTorch wheels
  • Optional CUDA 11.8 wheels:
USE_CUDA=1 ./setup.sh

macOS (Intel + Apple Silicon)

setup.sh uses Homebrew to install:

  • ffmpeg
  • libsndfile
  • python@3.12

Then it installs PyTorch from standard pip wheels.
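After setup on either platform, it can help to confirm which backends the installed PyTorch build reports. The snippet below is a small sketch for that purpose; `detect_device` is a hypothetical helper, not part of this repo, and it only shows what PyTorch sees, not which device the service itself selects.

```python
def detect_device():
    """Return "cuda", "mps", or "cpu" according to what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed for this interpreter.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend exists only in newer PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_device())
```

Run it with the same Python interpreter the service uses; a different interpreter may have a different PyTorch build, or none at all.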

System service (Linux)

sed -e "s|{{USER}}|$USER|g" -e "s|{{APP_DIR}}|$PWD|g" parakeet-asr.service.template > parakeet-asr.service
sudo cp parakeet-asr.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now parakeet-asr

Troubleshooting

Model download / HF issues

The first launch downloads the model weights from Hugging Face, so the machine needs internet access for that run.

ffmpeg errors

Reinstall ffmpeg and libsndfile, or run ./setup.sh again.

Slow on CPU

Use shorter audio files, or on Linux rerun setup with the CUDA wheels (USE_CUDA=1 ./setup.sh) to use a GPU.
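One way to get shorter files is to cut a long recording into fixed-length pieces with ffmpeg's segment muxer before sending each piece to the API. The sketch below builds and runs such a command. The helper names, the 60-second default, and the output naming pattern are illustrative choices, not something this repo provides.

```python
import subprocess
from pathlib import Path


def segment_command(source, out_dir, seconds=60):
    """Build an ffmpeg command that cuts `source` into pieces of about `seconds` each."""
    source = Path(source)
    pattern = Path(out_dir) / f"{source.stem}_%03d{source.suffix}"
    return [
        "ffmpeg",
        "-i", str(source),
        "-f", "segment",                # ffmpeg's segment muxer
        "-segment_time", str(seconds),  # target length of each piece
        "-c", "copy",                   # copy the stream instead of re-encoding
        str(pattern),
    ]


def split_audio(source, out_dir, seconds=60):
    """Run ffmpeg and return the generated files in sorted order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(segment_command(source, out_dir, seconds), check=True)
    return sorted(Path(out_dir).glob(f"{Path(source).stem}_*"))
```

For example, `split_audio("talk.mp3", "chunks")` would write `chunks/talk_000.mp3`, `chunks/talk_001.mp3`, and so on. Fixed-length cuts can split a word at a boundary, so transcripts of adjacent pieces may need light cleanup where they join.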

OpenClaw Skill (ClawHub)

If you use OpenClaw and want one-command agent setup, use the packaged skill from this repo:

  • Skill source: openclaw-skill/parakeet-local-asr/
  • Package command:
python ~/.npm-global/lib/node_modules/openclaw/skills/skill-creator/scripts/package_skill.py openclaw-skill/parakeet-local-asr

This generates a .skill bundle you can upload to ClawHub.

License

The code in this repo is MIT-licensed. The model weights remain under NVIDIA's terms (CC BY 4.0, as published by the model provider).
