AudioType

A native macOS menu bar app for voice-to-text. Hold fn to record, release to transcribe and type.

Features

Hold fn key to record voice, release to transcribe and insert text
Multiple cloud providers — Groq and OpenAI Whisper APIs for high accuracy
On-device fallback — Apple Speech (no API key or internet needed)
Works in any app — types transcribed text into the focused application
Self-serve — bring your own API key (Groq free tier, or OpenAI)
Lightweight — runs in menu bar, no dock icon

Privacy & Data

Important: AudioType previously ran transcription 100% locally using whisper.cpp. We found the local model's quality insufficient for reliable daily use, so we switched to cloud-based Whisper APIs, which provide significantly better accuracy and speed. An on-device Apple Speech fallback is available if you prefer not to use the cloud.

What this means:

When using a cloud engine, audio recordings are sent to the provider's servers for transcription
An internet connection is required for cloud transcription (not needed for Apple Speech)
Your API keys are stored locally in the macOS Keychain
No audio is saved to disk locally — it is recorded in memory, sent to the cloud provider, and discarded
See Groq's data policy or OpenAI's data policy for how they handle your data

Looking for the privacy-focused local version?

If you prefer 100% offline transcription with no data leaving your machine, you can use AudioType v1.1.1 — the last release that runs transcription entirely on-device using a local OpenAI Whisper model via whisper.cpp. No internet or API key required. Note that local transcription accuracy is lower than the cloud version.

Requirements

macOS 14.0 (Sonoma) or later
Apple Silicon or Intel Mac
Internet connection (for cloud engines; not needed for Apple Speech)
A cloud API key (optional — app works without one using Apple Speech):
- Free Groq API key, or
- OpenAI API key

Setup

1. Get an API Key (optional)

AudioType works out of the box using Apple's on-device speech recognition. For higher accuracy, configure a cloud provider:

Option A: Groq (free tier)

Go to console.groq.com/keys
Create an account or sign in
Generate a new API key
Copy the key — you'll paste it into AudioType on first launch

Groq's free tier is generous enough for typical dictation use. See Groq's rate limits for current details.

Option B: OpenAI

Go to platform.openai.com/api-keys
Create an account or sign in
Generate a new API key
Copy the key — you'll paste it into AudioType Settings

2. Install AudioType

Download Release

Download the latest .dmg from Releases
Open the DMG and drag AudioType to Applications
First launch — Right-click the app and select "Open" (required for unsigned apps)
Click "Open" in the dialog to confirm

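Note: Because this app is not notarized, macOS blocks it on first launch. As an alternative to the right-click method above, you can clear the app's extended attributes, including the quarantine flag, from Terminal: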
xattr -cr /Applications/AudioType.app

Build from Source

# Clone the repository
git clone https://github.com/PatelUtkarsh/audio-type.git
cd audio-type

# Build and create app bundle
make app

# Run the app
open AudioType.app

3. First Launch

On first launch, AudioType will ask you to:

Grant Microphone access — to record your voice
Grant Accessibility access — to type text into other apps
Grant Speech Recognition — for on-device Apple Speech
Enter a Groq API key (optional) — for cloud transcription

You can skip the API key step to use Apple Speech. Additional cloud providers (OpenAI) can be configured later in Settings.

Permissions

Permission	Purpose
Microphone	Record voice for transcription
Accessibility	Detect fn key and type text into apps
Speech Recognition	On-device Apple Speech transcription
Internet	Send audio to cloud provider (Groq or OpenAI)

Usage

Launch AudioType — appears in menu bar with a waveform icon
Hold fn key — starts recording (overlay shows waveform)
Release fn key — sends audio to the active engine and types the result
Click menu bar icon — access Settings or Quit
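
For readers curious what "sends audio to the active engine" might involve for a cloud engine, here is a minimal sketch of building a multipart/form-data upload body. Both Groq and OpenAI document a transcription endpoint that accepts a `file` and a `model` form field. The names `MultipartBody` and `makeTranscriptionBody` are hypothetical and do not come from the AudioType source; the filename `audio.wav` is also an assumption.

```swift
import Foundation

/// Hypothetical container for a finished upload body and its Content-Type header value.
struct MultipartBody {
    let contentType: String
    let data: Data
}

/// Builds a multipart/form-data body carrying a WAV file plus `model` and
/// optional `language` fields. Illustrative only; not AudioType's actual code.
func makeTranscriptionBody(
    wav: Data,
    model: String,
    language: String? = nil,
    boundary: String = "Boundary-\(UUID().uuidString)"
) -> MultipartBody {
    var body = Data()

    // Appends one plain text form field.
    func appendField(name: String, value: String) {
        body.append(Data("--\(boundary)\r\n".utf8))
        body.append(Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8))
        body.append(Data("\(value)\r\n".utf8))
    }

    appendField(name: "model", value: model)
    if let language = language {
        appendField(name: "language", value: language)
    }

    // The audio itself goes in a part named "file".
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"file\"; filename=\"audio.wav\"\r\n".utf8))
    body.append(Data("Content-Type: audio/wav\r\n\r\n".utf8))
    body.append(wav)

    // Closing boundary marks the end of the body.
    body.append(Data("\r\n--\(boundary)--\r\n".utf8))

    return MultipartBody(
        contentType: "multipart/form-data; boundary=\(boundary)",
        data: body
    )
}
```

A caller would set `contentType` as the request's `Content-Type` header, add an `Authorization: Bearer <key>` header, and POST `data` to the provider's transcription endpoint.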

Settings

Engine Selection:
- Auto (default) — uses Groq if configured, then OpenAI, then Apple Speech
- Groq Whisper — always use Groq (requires API key)
- OpenAI Whisper — always use OpenAI (requires API key)
- Apple Speech — always use on-device recognition
Groq API Key — add or update your Groq key
OpenAI API Key — add or update your OpenAI key
Model Selection:
- Groq: Whisper Large V3 Turbo (default, faster) or Whisper Large V3 (most accurate)
- OpenAI: GPT-4o Mini Transcribe (default, balanced), GPT-4o Transcribe (best), or Whisper V2 (cheapest)
Language — auto-detect or choose from 25+ languages
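
The Auto order described above (Groq if configured, then OpenAI, then Apple Speech) can be sketched as follows. The names `Engine`, `EnginePreference`, `isConfigured`, and `resolveEngine` are hypothetical and are not taken from AudioType's `EngineResolver`. The sketch assumes a key counts as configured when it is non-empty, and that pinning a cloud engine without a key yields `nil` (the situation behind an "API key required" error).

```swift
import Foundation

/// Transcription engines named in the Settings section.
enum Engine: Equatable {
    case groq, openAI, appleSpeech
}

/// The user's choice: automatic selection, or pinned to one engine.
enum EnginePreference {
    case auto
    case fixed(Engine)
}

/// Assumption: a key is "configured" when it is present and not blank.
func isConfigured(_ key: String?) -> Bool {
    guard let key = key else { return false }
    return !key.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
}

/// Picks an engine. Returns nil when a pinned cloud engine has no key.
func resolveEngine(
    preference: EnginePreference,
    groqKey: String?,
    openAIKey: String?
) -> Engine? {
    switch preference {
    case .auto:
        // Fallback chain: Groq, then OpenAI, then on-device Apple Speech.
        if isConfigured(groqKey) { return .groq }
        if isConfigured(openAIKey) { return .openAI }
        return .appleSpeech
    case .fixed(.groq):
        return isConfigured(groqKey) ? .groq : nil
    case .fixed(.openAI):
        return isConfigured(openAIKey) ? .openAI : nil
    case .fixed(.appleSpeech):
        return .appleSpeech
    }
}
```

Because Apple Speech needs no key, the Auto branch always returns an engine; only the pinned cloud cases can fail.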

How It Works

fn key held -> Record audio -> Release fn key
                                    |
                                    v
                            Encode audio as WAV
                                    |
                                    v
                            EngineResolver picks engine
                            (Groq / OpenAI / Apple Speech)
                                    |
                                    v
                            Text post-processing
                            (capitalization, corrections)
                                    |
                                    v
                            Simulate keyboard typing
                            into focused app
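
The "Encode audio as WAV" step in the diagram can be illustrated with a minimal sketch that writes the standard 44-byte RIFF/WAVE header followed by PCM samples. This assumes the captured audio has already been converted to 16-bit mono samples and uses 16 kHz as a default sample rate; the README does not state the actual format, and `appendLE` and `encodeWAV` are hypothetical names.

```swift
import Foundation

/// Appends an integer to `data` in little-endian byte order, as RIFF requires.
func appendLE<T: FixedWidthInteger>(_ value: T, to data: inout Data) {
    withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
}

/// Wraps 16-bit mono PCM samples in a canonical WAV container.
/// The 16 kHz default is an assumption made for this sketch.
func encodeWAV(samples: [Int16], sampleRate: UInt32 = 16_000) -> Data {
    let channels: UInt16 = 1
    let bitsPerSample: UInt16 = 16
    let blockAlign = channels * bitsPerSample / 8          // bytes per sample frame
    let byteRate = sampleRate * UInt32(blockAlign)         // bytes per second
    let dataSize = UInt32(samples.count) * UInt32(blockAlign)

    var wav = Data()
    wav.append(contentsOf: Array("RIFF".utf8))
    appendLE(36 + dataSize, to: &wav)                      // file size minus 8 bytes
    wav.append(contentsOf: Array("WAVE".utf8))

    wav.append(contentsOf: Array("fmt ".utf8))
    appendLE(UInt32(16), to: &wav)                         // fmt chunk size for PCM
    appendLE(UInt16(1), to: &wav)                          // audio format 1 = PCM
    appendLE(channels, to: &wav)
    appendLE(sampleRate, to: &wav)
    appendLE(byteRate, to: &wav)
    appendLE(blockAlign, to: &wav)
    appendLE(bitsPerSample, to: &wav)

    wav.append(contentsOf: Array("data".utf8))
    appendLE(dataSize, to: &wav)
    for sample in samples {
        appendLE(sample, to: &wav)
    }
    return wav
}
```

Keeping the encoding in memory, as here, is consistent with the Privacy & Data note that no audio is saved to disk.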

Tech Stack

Swift — native macOS app
Groq API — cloud speech-to-text (Whisper Large V3)
OpenAI API — cloud speech-to-text (GPT-4o Transcribe / Whisper)
Apple Speech — on-device speech-to-text (SFSpeechRecognizer)
AVAudioEngine — low-latency audio capture
CGEvent — global hotkey detection and keyboard simulation
macOS Keychain — secure API key storage
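
As an illustration of the Keychain entry above, here is a minimal sketch of storing and reading an API key as a generic password item with the Security framework. The function names and the `service` and `account` values are hypothetical; AudioType's real identifiers are not given in this README.

```swift
import Foundation
import Security

enum KeychainError: Error {
    case unexpectedStatus(OSStatus)
}

/// Saves `key` as a generic password, replacing any existing item
/// for the same service and account.
func saveAPIKey(_ key: String, service: String, account: String) throws {
    let base: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
    ]
    // Delete any old value first; a "not found" result is fine here.
    SecItemDelete(base as CFDictionary)

    var attributes = base
    attributes[kSecValueData as String] = Data(key.utf8)
    let status = SecItemAdd(attributes as CFDictionary, nil)
    guard status == errSecSuccess else {
        throw KeychainError.unexpectedStatus(status)
    }
}

/// Returns the stored key, or nil when no item exists or it cannot be read.
func loadAPIKey(service: String, account: String) -> String? {
    let query: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
        kSecReturnData as String: true,
        kSecMatchLimit as String: kSecMatchLimitOne,
    ]
    var item: CFTypeRef?
    guard SecItemCopyMatching(query as CFDictionary, &item) == errSecSuccess,
          let data = item as? Data else {
        return nil
    }
    return String(data: data, encoding: .utf8)
}
```

Usage might look like `try saveAPIKey("gsk_example", service: "com.example.audiotype", account: "groq")`, where both identifiers are placeholders.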

Troubleshooting

App doesn't respond to fn key

Check Accessibility permission in System Settings > Privacy & Security > Accessibility
Try removing and re-adding AudioType from the list

No audio captured

Check Microphone permission in System Settings > Privacy & Security > Microphone
Ensure your microphone is working in other apps

Transcription fails

Check your internet connection (for cloud engines)
Verify your API key is valid in Settings
If you see "Rate limited", wait a moment and try again
Check Groq status or OpenAI status for service issues
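
The "wait a moment and try again" advice for rate limits is commonly automated with exponential backoff. The sketch below shows the general technique only; it is not a description of what AudioType does. `RateLimited`, `backoffDelay`, and `withRetry` are hypothetical names, and the 0.5 s base and 8 s cap are arbitrary choices.

```swift
import Foundation

/// Hypothetical error standing in for a provider's "Rate limited" response.
struct RateLimited: Error {}

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
func backoffDelay(attempt: Int, base: TimeInterval = 0.5, cap: TimeInterval = 8.0) -> TimeInterval {
    let exponent = max(0, attempt)
    return min(cap, base * pow(2.0, Double(exponent)))
}

/// Runs `operation`, retrying only on `RateLimited`, up to `maxAttempts` tries.
/// `sleep` is injectable so callers (and tests) can avoid real waiting.
func withRetry<T>(
    maxAttempts: Int,
    sleep: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) },
    _ operation: () throws -> T
) throws -> T {
    var failures = 0
    while true {
        do {
            return try operation()
        } catch is RateLimited {
            failures += 1
            if failures >= maxAttempts { throw RateLimited() }
            sleep(backoffDelay(attempt: failures - 1))
        }
        // Any other error propagates to the caller unchanged.
    }
}
```

With the defaults, successive waits are 0.5 s, 1 s, 2 s, 4 s, and then 8 s for every later retry.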

"API key required" error

Open Settings from the menu bar icon and enter your API key
Get a free Groq key at console.groq.com/keys
Or use Apple Speech (no key required) by setting engine to Auto or Apple Speech

Rate Limits

Groq offers a free tier that is generous enough for typical dictation use. For current limits and pricing, see Groq's rate limits and pricing.

OpenAI uses pay-as-you-go pricing. See OpenAI's pricing for current rates.

License

MIT

Acknowledgments

Groq for fast cloud inference
OpenAI for Whisper and GPT-4o transcription models
This project is entirely vibe coded with AI assistance
