Phone Whisper

Push-to-talk dictation for Android.

Phone Whisper lets you speak into most apps without switching keyboards. Tap the floating button, speak, tap again, and your text is inserted into the currently focused text field when the app exposes a standard Android input field.

It supports:

Local on-device transcription with sherpa-onnx
Cloud transcription with OpenAI Whisper
Optional cleanup with OpenAI to fix punctuation and grammar

If you try it and it genuinely saves you time, consider sponsoring the project.

Why I built this

I like SwiftKey and want to keep it as my keyboard, but:
Most keyboard dictation felt too inaccurate
Gemini's voice input auto-submits your transcription (which is pretty bad), so you can't edit it before sending
Post-processing yields much better results, especially when you add a list of keywords and technical terms you often use
Inserting text into the field you're already using lets you keep editing it like any other draft

Install

Easiest: download the APK

Grab the latest APK from GitHub Releases.

Open it on your phone, install it, then launch the app once to finish setup.

Build from source

Requires JDK 17 and the Android SDK.

git clone https://github.com/kafkasl/phone-whisper.git && cd phone-whisper
make build

APK output:

app/build/outputs/apk/debug/app-debug.apk

If you use ADB:

make adb-install

How it works

A small overlay button floats on screen
Tap once to start recording
Tap again to stop
Audio is transcribed locally or in the cloud
The text is inserted into the focused text field
If insertion fails, the text is copied to the clipboard
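
The steps above can be sketched as a two-state toggle with a clipboard fallback. This is a minimal illustration in plain Java, not the app's actual code: the class name `PushToTalkSketch` and the three callbacks are hypothetical stand-ins for recording, transcription, text insertion, and clipboard access.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

class PushToTalkSketch {
    enum State { IDLE, RECORDING }
    enum Outcome { STARTED, INSERTED, COPIED_TO_CLIPBOARD }

    // Hypothetical hooks; a real app would back these with platform calls.
    private final Supplier<String> stopAndTranscribe; // stops recording, returns the transcript
    private final Predicate<String> insertIntoField;  // true if the focused field accepted the text
    private final Consumer<String> copyToClipboard;   // fallback when insertion fails
    private State state = State.IDLE;

    PushToTalkSketch(Supplier<String> stopAndTranscribe,
                     Predicate<String> insertIntoField,
                     Consumer<String> copyToClipboard) {
        this.stopAndTranscribe = stopAndTranscribe;
        this.insertIntoField = insertIntoField;
        this.copyToClipboard = copyToClipboard;
    }

    State state() {
        return state;
    }

    /** Handles one tap on the overlay button. */
    Outcome onTap() {
        if (state == State.IDLE) {
            state = State.RECORDING; // first tap: start recording
            return Outcome.STARTED;
        }
        state = State.IDLE; // second tap: stop and transcribe
        String text = stopAndTranscribe.get();
        if (insertIntoField.test(text)) {
            return Outcome.INSERTED;
        }
        copyToClipboard.accept(text); // insertion failed: keep the text reachable
        return Outcome.COPIED_TO_CLIPBOARD;
    }

    public static void main(String[] args) {
        StringBuilder clipboard = new StringBuilder();
        PushToTalkSketch button = new PushToTalkSketch(
                () -> "hello world",             // pretend transcription result
                text -> false,                   // pretend the focused app rejects insertion
                text -> clipboard.append(text)); // pretend clipboard
        System.out.println(button.onTap()); // STARTED
        System.out.println(button.onTap()); // COPIED_TO_CLIPBOARD
        System.out.println(clipboard);      // hello world
    }
}
```

Keeping insertion and the clipboard behind separate callbacks means the transcript is never lost: whichever path runs, the caller learns where the text ended up.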

Setup

First-time setup

Open Phone Whisper
Grant the audio recording permission
Enable the Accessibility Service
Choose your transcription mode:
- Local: download a model in the app
- Cloud: paste your OpenAI API key

Once setup is done, the floating button is ready.

Why does it need Accessibility?

Phone Whisper uses Android Accessibility Service for one narrow reason: to insert dictated text into the currently focused text field across apps.

It does not replace your keyboard. It does not run background automation. It only acts after you explicitly tap the overlay button.

Privacy

Phone Whisper supports two transcription modes, plus an optional cleanup step:

Local mode: audio stays on-device
Cloud mode: audio is sent directly from your device to OpenAI's transcription API
Optional cleanup: transcript text is sent directly from your device to OpenAI's chat API

I don't run a backend for this app. In cloud mode, requests go straight from your phone to OpenAI using your own API key.
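
To make "straight from your phone to OpenAI" concrete, here is a rough sketch of what a direct cloud-mode request could look like in plain Java. It is an illustration under assumptions, not the app's actual code: it targets OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model, and the class name, method names, boundary string, and the `application/octet-stream` part type are all hypothetical choices made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class CloudRequestSketch {
    static final String ENDPOINT = "https://api.openai.com/v1/audio/transcriptions";

    /** Encodes a model name and one audio file as a multipart/form-data body. */
    static byte[] multipartBody(String boundary, String model, String fileName, byte[] audio) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"model\"\r\n\r\n"
                + model + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(audio, 0, audio.length); // raw audio bytes, not re-encoded
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    /** Sends the audio directly to OpenAI with the user's own key; returns the raw JSON reply. */
    static String transcribe(String apiKey, String fileName, byte[] audio) throws IOException {
        String boundary = "----phonewhisper" + System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) new URL(ENDPOINT).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Authorization", "Bearer " + apiKey);
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(multipartBody(boundary, "whisper-1", fileName, audio));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        // Builds a request body without touching the network.
        byte[] body = multipartBody("demo-boundary", "whisper-1", "clip.wav", new byte[] {1, 2, 3});
        System.out.println(body.length + " bytes ready to POST to " + ENDPOINT);
    }
}
```

The only host in the sketch is `api.openai.com`, and the only credential is the key the user supplies, which is the property the paragraph above describes.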

Full policy: PRIVACY.md

Local models

Models are stored in app storage under:

/data/data/com.kafkasl.phonewhisper/files/models/

Current catalog:

Model	Size	Notes
Parakeet 110M	100 MB	Best default
Whisper Base	199 MB	Solid baseline
Parakeet 0.6B	465 MB	Best quality
Moonshine Tiny	103 MB	Fastest

The app downloads and extracts models directly from the sherpa-onnx release archives.

Development

make build       # build debug APK
make test        # run unit tests
make adb-install # build + install via ADB
make clean       # clean build artifacts

App compatibility

Phone Whisper works best in apps that use standard Android text fields. Some apps use custom text surfaces or terminal-style views, which may not support direct accessibility paste. When insertion is not possible, Phone Whisper falls back to copying the transcript to the clipboard.

Termux

Termux's main terminal area is not a standard Android text field, so direct insertion may not work there.

To use Phone Whisper in Termux:

Focus Termux
Swipe the extra keys row (ESC, CTRL, ALT, arrows, etc.) left or right
Switch to Termux's native text input box
Dictate there

Once text is inserted into the native input box, Termux sends it to the terminal normally.

Current limitations

Accessibility permission is required for cross-app insertion
Some apps may block paste or text injection
Some apps use custom input surfaces instead of standard Android text fields
Local models are large
Cloud mode requires your own OpenAI API key

Support the project

If Phone Whisper saves you time, you can sponsor the project on GitHub:

https://github.com/sponsors/kafkasl

License

Personal project. Do whatever you want with it.
