VTS - Voice Typing Studio

The open-source macOS dictation replacement you've been waiting for! 🚀

🔊 Turn on your sound! This demo includes audio to showcase the real-time transcription experience.

mini_demo_01.mov

Transform your voice into text instantly with the power of OpenAI, Groq, and Deepgram APIs. Say goodbye to macOS dictation limitations and hello to lightning-fast, accurate transcription with your own custom hotkeys! ⚡️

✨ Why Choose VTS?

  • 🤖 AI-Powered Accuracy: Leverage OpenAI, Groq, and Deepgram models for superior transcription
  • 🔑 Your Keys, Your Control: Bring your own API keys - no subscriptions, no limits
  • 🔄 Drop-in Replacement: Works exactly like macOS dictation, but better!
  • ⌨️ Your Shortcut, Your Rules: Fully customizable global hotkeys (default: ⌘⇧;)
  • 🎯 Smart Device Management: Intelligent microphone priority with seamless fallback
  • 💬 Context-Aware: Custom system prompt boosts accuracy for your specific needs
  • 🔓 100% Open Source: Full transparency, community-driven, modify as you wish

🎬 Longer Demo

Monosnap.screencast.2025-07-23.02-48-42.mp4

Onboarding:

https://youtu.be/NTQmVCvkZQQ

🚀 Getting Started

📦 Installation

Ready to use VTS? Head over to our Releases Page to download the latest version.

Download

  • Universal Binary (Intel + Apple Silicon): Download the DMG from the releases page

Installation Steps

  1. Download the DMG file from the releases page
  2. Open the DMG
  3. Drag VTS to the Applications folder
  4. Launch from Applications

Verification

All releases are code-signed and notarized by Apple for security.

Requirements

  • macOS 14.0+ (Apple Silicon & Intel supported)
  • API key from OpenAI, Groq, or Deepgram (see setup below)

API Key Setup

After installing VTS, you'll need an API key from one of the supported providers: OpenAI, Groq, or Deepgram.

Only one API key is required - choose the provider you prefer!

📖 Usage Guide

Basic Transcription

  1. Choose Provider: Select OpenAI, Groq, or Deepgram from the dropdown
  2. Select Model: Pick whisper-1, whisper-large-v3, or other available models
  3. Enter API Key: Paste your API key in the secure field
  4. Start Recording: Press the global hotkey (default: ⌘⇧;) and speak
  5. View Results: See real-time transcription inserted into the application you're using
  6. (Optional) Copy: Use buttons to copy the transcript

Advanced Features

Microphone Priority Management

  • View Available Devices: See all connected microphones, with the system default marked
  • Set Priority Order: Add devices to the priority list with the + buttons
  • Automatic Fallback: The app automatically uses the highest-priority device that is available
  • Real-time Switching: Switches seamlessly when preferred devices connect or disconnect
  • Remove from Priority: Use the − buttons to remove devices from the priority list
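
The fallback behavior described above can be sketched in a few lines of Swift. This is only an illustration of the idea, not the code VTS ships: the `Microphone` type, the `selectMicrophone` function, and the device IDs are all hypothetical names introduced for this example.

```swift
// Hypothetical model of an input device; VTS's real types may differ.
struct Microphone: Equatable {
    let id: String
    let name: String
}

/// Returns the first device in `priority` that is currently connected,
/// falling back to the system default when none of them is available.
func selectMicrophone(priority: [String],
                      connected: [Microphone],
                      systemDefault: Microphone?) -> Microphone? {
    for id in priority {
        if let match = connected.first(where: { $0.id == id }) {
            return match
        }
    }
    return systemDefault
}

let builtIn = Microphone(id: "builtin", name: "MacBook Pro Microphone")
let usb = Microphone(id: "usb-1", name: "USB Podcast Mic")
let priority = ["usb-1", "builtin"]

// USB mic connected: it wins because it is first in the priority list.
let withUSB = selectMicrophone(priority: priority,
                               connected: [builtIn, usb],
                               systemDefault: builtIn)
// USB mic unplugged: selection falls back to the next available entry.
let withoutUSB = selectMicrophone(priority: priority,
                                  connected: [builtIn],
                                  systemDefault: builtIn)
print(withUSB?.name ?? "none", withoutUSB?.name ?? "none")
```

Re-running the selection whenever the set of connected devices changes would give the real-time switching behavior, assuming the app is notified of connect and disconnect events.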

Custom System Prompts

  • Add context-specific prompts to improve transcription accuracy
  • Examples: "Medical terminology", "Technical jargon", "Names: John, Sarah, Mike"
  • Prompts help the AI better understand domain-specific language
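
As a rough illustration of how such a prompt can reach a provider: OpenAI's audio transcription endpoint accepts an optional `prompt` form field alongside `model` and `file`. The sketch below only builds a multipart body with those fields. It does not send anything, and it is an assumption that VTS forwards the custom prompt this way; the `multipartBody` helper, the boundary string, the file name, and the `audio/wav` content type are hypothetical.

```swift
import Foundation

/// Builds a multipart/form-data body with text fields and one audio file part.
func multipartBody(fields: [(name: String, value: String)],
                   audio: Data,
                   filename: String,
                   boundary: String) -> Data {
    var body = Data()
    // One part per text field (for example "model" and "prompt").
    for field in fields {
        body.append(Data("--\(boundary)\r\n".utf8))
        body.append(Data("Content-Disposition: form-data; name=\"\(field.name)\"\r\n\r\n".utf8))
        body.append(Data("\(field.value)\r\n".utf8))
    }
    // The audio part; the content type here is an assumption for the sketch.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"file\"; filename=\"\(filename)\"\r\n".utf8))
    body.append(Data("Content-Type: audio/wav\r\n\r\n".utf8))
    body.append(audio)
    // Closing boundary marks the end of the body.
    body.append(Data("\r\n--\(boundary)--\r\n".utf8))
    return body
}

let body = multipartBody(
    fields: [(name: "model", value: "whisper-1"),
             (name: "prompt", value: "Names: John, Sarah, Mike")],
    audio: Data([0x52, 0x49, 0x46, 0x46]), // placeholder bytes, not real audio
    filename: "clip.wav",
    boundary: "vts-sketch-boundary")
let bodyText = String(decoding: body, as: UTF8.self)
print(bodyText)
```

Other providers may expose vocabulary hints through different parameters, so a real implementation would likely map the prompt per provider.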

🔒 Privacy & Security

  • No audio storage: Audio is processed in real-time, never stored locally
  • API keys are safe: Keys are stored in Keychain
  • TLS encryption: All API communication uses HTTPS
  • Microphone permission: Explicit user consent required for audio access
  • Basic telemetry: We collect minimal usage analytics in compliance with GDPR regulations

🛠️ Troubleshooting

Common Issues

  • Microphone Permission Denied: Check System Settings > Privacy & Security > Microphone
  • No Microphones Found: Click "Refresh" in the Microphone Priority section
  • Wrong Microphone Active: Set your preferred priority order or check device connections
  • App Not Responding to Hotkey: Ensure accessibility permissions are granted when prompted

👩‍💻 Development

This section is for developers who want to build VTS from source or contribute to the project.

Development Requirements

  • macOS 14.0+ (Apple Silicon & Intel supported)
  • Xcode 15+ for building
  • API key from OpenAI, Groq, or Deepgram for testing

Building from Source

  1. Clone the repository:
     git clone https://github.com/j05u3/VTS.git
     cd VTS
  2. Open in Xcode:
     open VTSApp.xcodeproj
  3. Build and run:
    • In Xcode, select the VTSApp scheme
    • Build and run with ⌘R
    • Grant microphone permission when prompted

Command Line Building

# Build via command line
xcodebuild -project VTSApp.xcodeproj -scheme VTSApp build

Architecture

VTS follows a clean, modular architecture:

  • CaptureEngine: Handles audio capture using AVAudioEngine with Core Audio device management
  • DeviceManager: Manages microphone priority lists and automatic device selection
  • TranscriptionService: Orchestrates streaming transcription with provider abstraction
  • STTProvider Protocol: Clean interface allowing easy addition of new providers
  • Modern SwiftUI: Reactive UI with proper state management and real-time updates
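
To make the provider abstraction concrete, here is a minimal sketch of how a protocol-based design like this can look. The README does not show the actual `STTProvider` requirements, so `SketchSTTProvider`, `EchoProvider`, and `SketchTranscriptionService` are hypothetical stand-ins, and the synchronous `transcribe(audio:prompt:)` signature is an assumption made to keep the example short.

```swift
import Foundation

// Hypothetical stand-in for the STTProvider protocol; the real requirements may differ.
protocol SketchSTTProvider {
    var name: String { get }
    func transcribe(audio: Data, prompt: String?) throws -> String
}

// A fake provider used only to show how a new backend would plug in.
struct EchoProvider: SketchSTTProvider {
    let name = "Echo"
    func transcribe(audio: Data, prompt: String?) throws -> String {
        "\(audio.count) bytes transcribed by \(name)"
    }
}

// Hypothetical orchestrator: it depends on the protocol, not on a concrete provider.
struct SketchTranscriptionService<Provider: SketchSTTProvider> {
    let provider: Provider
    func run(audio: Data, prompt: String? = nil) throws -> String {
        try provider.transcribe(audio: audio, prompt: prompt)
    }
}

let service = SketchTranscriptionService(provider: EchoProvider())
let transcript = try service.run(audio: Data(count: 4))
print(transcript) // prints "4 bytes transcribed by Echo"
```

With this shape, adding a provider means writing one new conforming type. The orchestrating service and the UI would not need to change.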

Testing

Currently, VTS includes manual testing capabilities through the built-in Text Injection Test Suite accessible from the app's interface. This allows you to test text insertion functionality across different applications.

Automated unit tests are planned for future releases.

Development Troubleshooting

Accessibility Permissions (Development)

  • Permission Not Updating: During development/testing, when the app changes (rebuild, code changes), macOS treats it as a "new" app
  • Solution: Remove the old app entry from System Settings > Privacy & Security > Accessibility, then re-grant permission
  • Why This Happens: Each build gets a different signature, so macOS sees it as a different application
  • Quick Fix: Check the app list in Accessibility settings and remove any old/duplicate VTS entries

Testing Onboarding Flow

  • Reset App State: To test the complete onboarding flow, change the PRODUCT_BUNDLE_IDENTIFIER in Xcode project settings
  • Why This Works: Changing the bundle identifier creates a "new" app from macOS's perspective, resetting all permissions and app state
  • Most Reliable Method: This is more reliable than clearing UserDefaults and ensures a clean onboarding test including all system permissions

Contributing

See CONTRIBUTING.md for details on how to contribute to VTS development.


🗺️ Roadmap

  • Auto-open at login: Launch VTS at login via a checkbox in the preferences window (✅ Implemented)
  • Modern Release Automation: Automated releases with release-please and GitHub Actions (✅ Implemented)
  • Sparkle Auto-Updates: Automatic app updates with GitHub Releases appcast hosting (✅ Implemented)
  • Support real-time API: OpenAI Real-time Transcription API (✅ Implemented)

The items below are candidates for a future (or possibly Pro) version. Their order and priority are still undecided, and your feedback and contributions are welcome!

  • More models/providers: Support for more STT providers such as Google and Azure.
  • Safe auto-cut: Automatically stop recording at a maximum duration if the user forgets to end it (or starts it by accident). VAD from the real-time APIs might also work here.
  • LLM step: Use an LLM to post-process the transcription and improve accuracy, possibly targeted to the app receiving the text or to the broader context (for example, making it easy to input emojis).
  • Advanced Audio Processing: Noise reduction and gain control, though some STT providers already do this, so it may not be needed.
  • Comprehensive Test Suite: Automated unit tests covering:
    • Core transcription functionality
    • Provider validation and error handling
    • Device management and priority logic
    • Integration flows and edge cases
  • Accessibility Features

💬 Feedback

Have feedback, suggestions, or issues? We'd love to hear from you!

📧 Send us your feedback - Quick and direct way to reach us

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgements

VTS wouldn't be possible without the incredible work of the open-source community. Special thanks to:

Tools & Scripts

  • ios-icon-generator by @smallmuou - for the awesome icon generation script that made creating our app icons effortless
  • create-dmg by @sindresorhus - for the excellent DMG creation script that streamlines our distribution process
  • Sparkle by the Sparkle Project - for providing the robust auto-update framework that keeps VTS current and secure

Note: This project builds upon the work of many developers and projects. If I've missed crediting someone or something, I sincerely apologize! Please feel free to open an issue or PR to help me give proper recognition where it's due.


Made with ❤️ for the macOS community
