Listen - Voice-to-Text App

A multi-platform voice-to-text app with intelligent model routing, allowing you to speak instead of typing.

Features

🎙️ System-wide voice recording (Desktop) / One-tap recording (Mobile)
🤖 Multiple SOTA STT models with automatic selection:
- Moonshine (5-15x faster, optimized for edge devices)
- Distil-Whisper (6x faster, excellent accuracy)
- Faster-Whisper, whisper.cpp, Python Whisper
🧠 Intelligent model routing - Auto-selects best model for your needs
📋 Automatic clipboard copy
🪟 Always-on-top overlay (Desktop)
📱 Native iOS (Swift + WhisperKit) and Android (Kotlin + TFLite) apps
🔒 100% offline - All processing on-device, no cloud services
⚡ Ultra-fast transcription

Setup

Install dependencies:

npm install

Install an STT model (choose one or more):

🔥 UNDER 1B Parameters (Recommended - Edge-optimized)

Option A: Parakeet TDT v3 (FASTEST)
```
./install-parakeet.sh  # 600M params, 6.32% WER, 25 languages, ultra-fast inference
```
Option B: Moonshine (Mobile-optimized)
```
./install-moonshine.sh  # 40-200M params, 5-15x real-time
```
Option C: Distil-Whisper (Best for English)
```
./install-distil-whisper.sh  # 244M params, 6x real-time
```
Option D: Faster-Whisper (Good balance)
```
pip install faster-whisper  # 74M params, 4x real-time
```
Option E: whisper.cpp (C++ implementation)
```
./setup-whisper.sh  # 74M params, 2x real-time
```
Option F: Python Whisper (Fallback)
```
./install-python-whisper.sh  # 74M params, baseline
```
🎯 OVER 1B Parameters (Optional - Maximum accuracy)

Option G: Canary Qwen 2.5B (#1 Accuracy)
```
./install-canary.sh  # 2.5B params, 5.63% WER
```
Note: The app automatically uses the fastest installed model. Install several models to enable automatic fallback; models you have not installed are never used.

Build and run:

npm run build
npm start

Or run in development mode:

npm run dev

Usage

1. Press Ctrl+Shift+Space to activate the overlay.
2. Speak your text.
3. Press Ctrl+Shift+Space again to stop recording.
4. The transcribed text is copied to the clipboard automatically.
5. Paste (Ctrl+V) into any application.

Keyboard Shortcuts

Ctrl+Shift+Space - Start/Stop recording
Esc - Cancel recording and close overlay
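
The shortcut behaviour above can be read as a small two-state machine. The sketch below is illustrative only: the state names, action names, and `nextState` function are hypothetical and not taken from the Listen source. In an Electron app the toggle key would typically be wired up with `globalShortcut.register`, but whether Listen does so is an assumption.

```typescript
// Hypothetical names throughout; this models the documented shortcuts, not the app's real code.
type RecorderState = "idle" | "recording";
type KeyEvent = "toggle" | "escape"; // "toggle" stands for Ctrl+Shift+Space
type Action = "startRecording" | "stopAndTranscribe" | "cancel" | "none";

interface Transition {
  state: RecorderState;
  action: Action;
}

// Map the current state and a key press to the next state and the side effect to run.
function nextState(state: RecorderState, key: KeyEvent): Transition {
  if (key === "toggle") {
    return state === "idle"
      ? { state: "recording", action: "startRecording" }
      : { state: "idle", action: "stopAndTranscribe" };
  }
  // Esc cancels an active recording and does nothing otherwise.
  return state === "recording"
    ? { state: "idle", action: "cancel" }
    : { state: "idle", action: "none" };
}
```

Keeping the transition logic pure like this makes the shortcut handling easy to test without opening a window or touching the microphone.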

Model Selection & Routing

Listen uses an intelligent routing system that automatically selects the best available model based on your requirements.
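
As a rough illustration of what such routing could look like, here is a minimal sketch that ranks installed models by speed and falls back to the next one when transcription fails. The `SttModel` interface, its field names, and both functions are hypothetical; the actual router in `src/models/` may work differently. The speed factors in the usage example are the approximate figures quoted in the Setup section.

```typescript
// Hypothetical model descriptor; not the app's real type.
interface SttModel {
  name: string;
  speedFactor: number; // approximate speed-up quoted in the Setup section
  installed: boolean;
}

// Return installed models ordered fastest first, leaving the input array untouched.
function rankInstalledModels(models: SttModel[]): SttModel[] {
  return models
    .filter((m) => m.installed)
    .sort((a, b) => b.speedFactor - a.speedFactor);
}

// Try each installed model in order; move on to the next one when transcription throws.
async function transcribeWithFallback(
  models: SttModel[],
  transcribe: (model: SttModel) => Promise<string>,
): Promise<string> {
  const errors: string[] = [];
  for (const model of rankInstalledModels(models)) {
    try {
      return await transcribe(model);
    } catch (err) {
      errors.push(`${model.name}: ${String(err)}`);
    }
  }
  throw new Error(`No installed model produced a transcript. ${errors.join("; ")}`);
}

// Usage: only installed entries are considered, fastest first.
const candidates: SttModel[] = [
  { name: "python-whisper", speedFactor: 1, installed: true },
  { name: "distil-whisper", speedFactor: 6, installed: false },
  { name: "faster-whisper", speedFactor: 4, installed: true },
];
console.log(rankInstalledModels(candidates).map((m) => m.name));
```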

Recommended Models:

Desktop (English): Distil-Whisper Small (6x faster, excellent accuracy)
Desktop (Multilingual): Moonshine Base (5-15x faster, good accuracy)
Mobile (iOS/Android): Moonshine Tiny (ultra-fast, only 40MB)

See MODEL_COMPARISON.md for detailed benchmarks and comparisons.
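
The recommendations above amount to a lookup from usage profile to model name. A minimal sketch, assuming hypothetical profile keys and a hypothetical `pickForProfile` helper (neither comes from the Listen codebase):

```typescript
// Profile keys are illustrative; the model names are the ones recommended above.
type Profile = "desktop-english" | "desktop-multilingual" | "mobile";

const RECOMMENDED: Record<Profile, string> = {
  "desktop-english": "Distil-Whisper Small",
  "desktop-multilingual": "Moonshine Base",
  mobile: "Moonshine Tiny",
};

// Return the recommended model for a profile, or undefined when it is not installed,
// so the caller can fall back to whatever else is available.
function pickForProfile(
  profile: Profile,
  installed: ReadonlySet<string>,
): string | undefined {
  const preferred = RECOMMENDED[profile];
  return installed.has(preferred) ? preferred : undefined;
}
```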

Platform Support

✅ Linux (Desktop - Electron)
✅ iOS 16+ (Native Swift app) - See mobile/ios/README.md
✅ Android 7+ (Native Kotlin app) - See mobile/android/README.md
🔜 macOS (Desktop - Coming soon)
✅ Windows (Desktop - Initial support)

Project Structure

listen/
├── src/                    # TypeScript source code
│   ├── models/            # STT model implementations
│   ├── assets/            # UI (HTML/CSS)
│   └── main.ts            # Electron entry point
├── scripts/               # Python utility scripts
│   └── record_audio_windows.py
├── docs/                  # Documentation
└── mobile/                # Native iOS & Android apps

See ARCHITECTURE.md for complete structure.

Documentation

Architecture Overview (ARCHITECTURE.md) - System design and modular architecture
Model Comparison (MODEL_COMPARISON.md) - Detailed STT model benchmarks
Quick Start Guide (QUICKSTART.md) - Get up and running in 5 minutes
iOS README (mobile/ios/README.md) - iOS app documentation
Android README (mobile/android/README.md) - Android app documentation

Requirements

Node.js 18+
At least one locally installed STT model from the Setup section, for example whisper.cpp, Python Whisper, or faster-whisper (no API needed)
Audio recording: arecord (ALSA) or sox on Linux
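
For reference, the sketch below builds a command line for either recorder. It is an assumption that Listen records 16 kHz mono 16-bit WAV (the input format Whisper-family models commonly expect); the `buildRecordCommand` helper and its types are hypothetical. The flags themselves are standard `arecord` and `sox` options.

```typescript
// Hypothetical helper; shows how either Linux recorder could be invoked.
type Recorder = "arecord" | "sox";

interface RecorderCommand {
  command: string;
  args: string[];
}

function buildRecordCommand(recorder: Recorder, outputPath: string): RecorderCommand {
  if (recorder === "arecord") {
    // -f S16_LE: 16-bit little-endian PCM; -r: sample rate; -c: channels; -t wav: WAV container
    return {
      command: "arecord",
      args: ["-f", "S16_LE", "-r", "16000", "-c", "1", "-t", "wav", outputPath],
    };
  }
  // -d: read from the default input device; -r, -c, -b apply to the output file
  return {
    command: "sox",
    args: ["-d", "-r", "16000", "-c", "1", "-b", "16", outputPath],
  };
}

// Usage: print the command that would be run.
const cmd = buildRecordCommand("arecord", "/tmp/listen.wav");
console.log(cmd.command, cmd.args.join(" "));
```

Both commands record until interrupted, which matches a press-to-start, press-to-stop workflow.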

License

MIT

Repository: divyanshsinghvi/OpenWhisper
