| 1 | +# macos-mic-keepwarm |
| 2 | + |
| 3 | +Fix the 2-5 second push-to-talk activation delay on macOS. |
| 4 | + |
| 5 | +If you use voice transcription apps like SuperWhisper, WhisperFlow, Wispr Flow, or any push-to-talk tool and experience a delay before recording starts, especially with AirPods or Bluetooth audio, this is for you. |
| 6 | + |
| 7 | +## The Problem |
| 8 | + |
| 9 | +macOS aggressively power-manages the microphone hardware on Apple Silicon Macs (M1/M2/M3/M4). When no app is actively using the mic, the hardware goes to sleep. The next time a push-to-talk app tries to record, it has to wake the mic hardware first, causing a 2-5 second delay. |
| 10 | + |
| 11 | +This means: |
| 12 | +- You press your push-to-talk key and start talking |
| 13 | +- The first 2-5 seconds of speech are lost or the app appears frozen |
| 14 | +- If you use it again quickly (within ~30-60 seconds), it's instant |
| 15 | +- Wait a minute, and the delay is back |
| 16 | + |
| 17 | +This is especially bad with AirPods and Bluetooth headsets, where the audio routing adds even more wake-up latency. |
| 18 | + |
| 19 | +## What I Tried (So You Don't Have To) |
| 20 | + |
| 21 | +I spent hours debugging this across SuperWhisper and WhisperFlow on macOS Tahoe 26.2 (M4 MacBook Air). Here's everything that did NOT fix the activation delay: |
| 22 | + |
| 23 | +- Changing the push-to-talk hotkey (tried Function key, Option+Space, Command+Shift+R) |
| 24 | +- Restarting `coreaudiod` and `corespeechd` |
| 25 | +- Increasing audio buffer sizes (`HALInputBufferSizeFrames`) |
| 26 | +- Changing audio sample rates |
| 27 | +- Resetting CoreAudio preferences |
| 28 | +- Disabling Continuity Camera |
| 29 | +- Granting Input Monitoring and Accessibility permissions |
| 30 | +- Changing microphone input sources |
| 31 | +- Restarting the Mac |
| 32 | +- `pmset` power management tweaks |
| 33 | +- `nvram` boot-args (not applicable on Apple Silicon) |
| 34 | + |
| 35 | +None of these address the hardware-level mic sleep behavior. |
| 36 | + |
| 37 | +### Related Issue: Recording Cutoff at 5-7 Seconds |
| 38 | + |
| 39 | +During debugging I also discovered that **Siri's Built-In Voice Trigger** (`CSBuiltInVoiceTrigger`) can interfere with push-to-talk apps, causing recordings to cut off after 5-7 seconds. If you're experiencing that issue too, disable Siri voice activation: |
| 40 | + |
| 41 | +1. System Settings > Siri & Spotlight |
| 42 | +2. Turn OFF "Listen for 'Siri'" / "Listen for 'Hey Siri'" |
| 43 | +3. Turn OFF "Press function key for Siri" |
| 44 | + |
| 45 | +### Related Issue: Virtual Audio Plugins Cause Delay |
| 46 | + |
| 47 | +Third-party audio drivers from Teams, Zoom, and other conferencing apps (installed at `/Library/Audio/Plug-Ins/HAL/`) can add startup latency. If you have `MSTeamsAudioDevice.driver`, `ZoomAudioDevice.driver`, or similar, try disabling them: |
| 48 | + |
| 49 | +```bash |
| 50 | +sudo mv /Library/Audio/Plug-Ins/HAL/MSTeamsAudioDevice.driver /Library/Audio/Plug-Ins/HAL/MSTeamsAudioDevice.driver.disabled |
| 51 | +sudo mv /Library/Audio/Plug-Ins/HAL/ZoomAudioDevice.driver /Library/Audio/Plug-Ins/HAL/ZoomAudioDevice.driver.disabled |
| 52 | +sudo killall coreaudiod |
| 53 | +``` |
| 54 | + |
| 55 | +Teams and Zoom still work for calls without these custom drivers. |
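Before renaming anything, it can help to see which HAL drivers are currently enabled. The following is a minimal sketch, not part of this repository: the function name `list_hal_drivers` is made up for illustration, and it only assumes that enabled drivers are bundles ending in `.driver` inside the directory named above.

```bash
#!/usr/bin/env bash
# Sketch: list HAL audio driver bundles that are still enabled.
# list_hal_drivers is a hypothetical helper, not a macOS or ffmpeg command.
list_hal_drivers() {
  # Default to the system HAL plug-in directory; pass another path to override.
  local dir="${1:-/Library/Audio/Plug-Ins/HAL}"
  local entry
  for entry in "$dir"/*.driver; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$entry" ] || continue
    basename "$entry"
  done
}

# Prints one bundle name per line, or nothing if no drivers are installed.
list_hal_drivers "$@"
```

Bundles already renamed to `.driver.disabled` do not match the `*.driver` pattern, so they drop out of the listing.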
| 56 | + |
| 57 | +## The Fix |
| 58 | + |
| 59 | +The fix is a single lightweight background process that holds the microphone input stream open. The mic hardware stays powered on and ready, so push-to-talk activation is always instant. |
| 60 | + |
| 61 | +No virtual audio devices needed. No BlackHole, Loopback, or SoundFlower. Just ffmpeg reading from the mic and discarding the audio. |
| 62 | + |
| 63 | +### How It Works |
| 64 | + |
| 65 | +```bash |
| 66 | +ffmpeg -f avfoundation -i ":0" -f null /dev/null |
| 67 | +``` |
| 68 | + |
| 69 | +That's it. The `":0"` argument means "no video, audio device at index 0", which is usually (but not always) the system default input. ffmpeg opens that device and sends the audio to `/dev/null` (nowhere). Nothing is recorded, stored, or transmitted. The only effect is that the microphone hardware stays awake. |
| 70 | + |
| 71 | +- CPU usage: ~0% |
| 72 | +- Battery impact: negligible |
| 73 | +- Privacy: audio is read and immediately discarded; nothing is stored or transmitted |
| 74 | +- Works with: built-in mic, AirPods, Bluetooth headsets, USB mics, any input device |
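A wrapper around the one-liner above might look like the sketch below. It is an illustration only and is not the contents of `keep-mic-warm.sh`. The function names `keepwarm_input` and `keep_mic_warm` are hypothetical. The flags `-nostdin` and `-loglevel error` are standard ffmpeg options, added here so a background process stays quiet and never waits on a terminal.

```bash
#!/usr/bin/env bash
# Sketch of a keep-warm wrapper; the real keep-mic-warm.sh may differ.

# Build the avfoundation input spec ":<audio index>" (empty video part).
keepwarm_input() {
  printf ':%s' "${1:-0}"
}

# Hold the chosen input open and discard everything that is read.
keep_mic_warm() {
  local input
  input="$(keepwarm_input "${1:-0}")"
  ffmpeg -nostdin -loglevel error -f avfoundation -i "$input" -f null /dev/null
}

# To see which index belongs to which microphone:
#   ffmpeg -f avfoundation -list_devices true -i ""
# Then, for example:
#   keep_mic_warm 1
```

Passing the index as a parameter is a design choice for this sketch: index 0 is not guaranteed to be the device you actually talk into, so it is worth checking the device list once.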
| 75 | + |
| 76 | +### Note on the Orange Dot |
| 77 | + |
| 78 | +macOS will show the orange microphone indicator dot in the menu bar, attributed to "ffmpeg". This is accurate: ffmpeg has the mic open. But it's not listening to you. The audio goes straight to `/dev/null`. |
| 79 | + |
| 80 | +## Installation |
| 81 | + |
| 82 | +### Prerequisites |
| 83 | + |
| 84 | +Install ffmpeg if you don't have it: |
| 85 | + |
| 86 | +```bash |
| 87 | +brew install ffmpeg |
| 88 | +``` |
| 89 | + |
| 90 | +### Quick Start (Run Once) |
| 91 | + |
| 92 | +```bash |
| 93 | +chmod +x keep-mic-warm.sh |
| 94 | +./keep-mic-warm.sh |
| 95 | +``` |
| 96 | + |
| 97 | +### Persistent Install (Survives Reboots) |
| 98 | + |
| 99 | +```bash |
| 100 | +chmod +x install.sh |
| 101 | +./install.sh |
| 102 | +``` |
| 103 | + |
| 104 | +This creates a LaunchAgent that: |
| 105 | +- Starts automatically on login |
| 106 | +- Restarts automatically if killed |
| 107 | +- Runs silently in the background |
| 108 | + |
| 109 | +macOS will prompt you to grant ffmpeg microphone access on first run. Click "Allow". |
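For readers curious what such a LaunchAgent contains, here is a rough sketch of a generator. It is not the actual `install.sh`. The label `com.example.mic-keepwarm`, the function name `render_plist`, and the default ffmpeg path (Homebrew's usual location on Apple Silicon) are assumptions. `Label`, `ProgramArguments`, `RunAtLoad`, and `KeepAlive` are standard launchd keys that map to the three behaviors listed above.

```bash
#!/usr/bin/env bash
# Sketch: print a LaunchAgent plist for the keep-warm command.
# Label and ffmpeg path are illustrative defaults, not taken from install.sh.
render_plist() {
  local label="${1:-com.example.mic-keepwarm}"
  local ffmpeg_path="${2:-/opt/homebrew/bin/ffmpeg}"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>${label}</string>
  <key>ProgramArguments</key>
  <array>
    <string>${ffmpeg_path}</string>
    <string>-nostdin</string>
    <string>-f</string>
    <string>avfoundation</string>
    <string>-i</string>
    <string>:0</string>
    <string>-f</string>
    <string>null</string>
    <string>/dev/null</string>
  </array>
  <!-- Start at login -->
  <key>RunAtLoad</key>
  <true/>
  <!-- Restart if the process exits or is killed -->
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
EOF
}
```

One way to use it would be `render_plist > ~/Library/LaunchAgents/com.example.mic-keepwarm.plist`, followed by `launchctl load -w` on that file. Check the path with `which ffmpeg` first, since launchd does not use your shell's `PATH`.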
| 110 | + |
| 111 | +### Uninstall |
| 112 | + |
| 113 | +```bash |
| 114 | +chmod +x uninstall.sh |
| 115 | +./uninstall.sh |
| 116 | +``` |
| 117 | + |
| 118 | +## Why Don't Transcription Apps Do This? |
| 119 | + |
| 120 | +They should. SuperWhisper's own changelog acknowledges "handling push to talk shortcut if microphone is slow to start." The correct engineering solution is to keep the audio input stream open between recordings and use a ring buffer with lookback. When the user presses push-to-talk, start reading from the buffer, including audio captured just before the keypress. |
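The ring-buffer-with-lookback idea can be illustrated with a toy example. This is only a sketch of the data structure, not how any of these apps work. A real app would fill the buffer from its audio capture callback rather than from a shell script, and the names `ring_push`, `ring_lookback`, and `RING_CAPACITY` are invented for illustration. "Frames" here are plain strings.

```bash
#!/usr/bin/env bash
# Toy ring buffer with lookback (bash arrays). Frames are plain strings.

RING_CAPACITY=5   # how many recent frames to keep
RING=()           # fixed-size storage, overwritten in a circle
RING_COUNT=0      # total number of frames ever pushed

# Store a frame, overwriting the oldest one once the buffer is full.
ring_push() {
  RING[$(( RING_COUNT % RING_CAPACITY ))]="$1"
  RING_COUNT=$(( RING_COUNT + 1 ))
}

# Print the most recent N frames, oldest first.
ring_lookback() {
  local want="$1" have i
  have=$(( RING_COUNT < RING_CAPACITY ? RING_COUNT : RING_CAPACITY ))
  if [ "$want" -gt "$have" ]; then
    want="$have"
  fi
  for (( i = RING_COUNT - want; i < RING_COUNT; i++ )); do
    printf '%s\n' "${RING[$(( i % RING_CAPACITY ))]}"
  done
}
```

In this picture, the capture side calls `ring_push` continuously, and the push-to-talk handler calls `ring_lookback` at keypress time to recover the frames captured just before the key went down, then keeps appending live frames.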
| 121 | + |
| 122 | +The likely reason they don't: the orange microphone indicator dot. Apps don't want users seeing "SuperWhisper is using your microphone" 24/7, even though the alternative is a broken user experience. |
| 123 | + |
| 124 | +Apple could fix this by providing a fast-wake API or a low-power standby mode for the mic hardware. As of macOS Tahoe 26.2, no such API exists. |
| 125 | + |
| 126 | +## Why Not Use BlackHole or a Virtual Audio Device? |
| 127 | + |
| 128 | +You don't need one. BlackHole, Loopback, and SoundFlower create virtual audio routing devices, which add complexity and can introduce latency and compatibility issues of their own. This fix works directly with your real microphone hardware. It's simpler and has fewer things that can break. |
| 129 | + |
| 130 | +## Affected Apps |
| 131 | + |
| 132 | +This delay affects any push-to-talk or voice transcription app on macOS, including but not limited to: |
| 133 | +- SuperWhisper |
| 134 | +- WhisperFlow |
| 135 | +- Wispr Flow |
| 136 | +- macOS Dictation |
| 137 | +- Any app that activates the microphone on-demand rather than continuously |
| 138 | + |
| 139 | +## System Requirements |
| 140 | + |
| 141 | +- macOS (tested on Tahoe 26.2, likely affects Sequoia and earlier) |
| 142 | +- Apple Silicon Mac (M1/M2/M3/M4) - Intel Macs may also be affected |
| 143 | +- ffmpeg (`brew install ffmpeg`) |
| 144 | +- Microphone permission for ffmpeg |
| 145 | + |
| 146 | +## License |
| 147 | + |
| 148 | +MIT |