GitHub - rchardx/aurotype: AI-powered voice input method

Voice-to-text desktop app — speak naturally, get polished text injected where your cursor is.

Aurotype is a desktop app built with Tauri 2 (Rust) + React/TypeScript + a Python sidecar. Press a hotkey, speak, and the transcribed & polished text is automatically inserted at your cursor position.

Features

🎤 Push-to-talk recording via global hotkey (default: Ctrl+Alt+Space)
🗣️ Speech-to-text powered by Alibaba Cloud DashScope Paraformer (configurable)
✨ LLM polishing — cleans up filler words, fixes grammar, preserves mixed-language speech
📋 Smart paste fallback — copies text to clipboard when cursor isn't in an editable field
📜 Recording history — review and copy past transcriptions
🔌 OpenAI-compatible LLM — works with DeepSeek, OpenAI, vLLM, Ollama, LM Studio, etc.
🔒 Privacy first — all recordings, transcription history, and settings are stored locally on your machine. Nothing is uploaded except to your configured STT/LLM API endpoints.

Installation

Aurotype currently supports Windows.

Download the latest .exe or .msi installer from the GitHub Releases page. Both options work; pick the one you prefer.
Run the installer and follow the prompts.
Once launched, the app runs in your system tray.

Quick Start

Follow these steps for your first successful voice transcription:

Launch Aurotype — The app appears in the system tray (bottom-right). Click the tray icon to open Settings.
Check Engine Status — At the top of the Settings page, confirm that "Engine Status" shows Connected (green). If it shows disconnected, wait a few seconds for the engine to start.
Configure STT (Speech-to-Text) — In the "STT Provider" section:
- Provider: Alibaba Cloud DashScope (default).
- API Key: Paste your DashScope API key (get one at https://dashscope.console.aliyun.com/).
- Model: paraformer-realtime-v2 (default).
- Click Test Connection — It should show "Success!".
Configure LLM (Text Polishing) — In the "LLM Provider" section:
- Provider: DeepSeek (default) or OpenAI Compatible.
- If DeepSeek: Paste your DeepSeek API key (get one at https://platform.deepseek.com/).
- If OpenAI Compatible: Set your Base URL, API key, and model name (works with OpenAI, vLLM, Ollama, LM Studio, etc.).
- Click Test Connection — It should show "Success!".
Try it! — Press Ctrl+Alt+Space (default hotkey), speak naturally, and press Ctrl+Alt+Space again to stop. A floating overlay shows the recording status and transcription progress. The polished text is automatically pasted at your cursor position. If no text field is focused, the text is copied to your clipboard.

Hotkey

Default: Ctrl+Alt+Space (toggle mode — press to start, press again to stop).
Alternative mode: "Hold to Record" (hold key to record, release to stop).
Change these settings in the Settings → Hotkey section.
Available shortcuts: Ctrl+Alt+Space, Ctrl+Shift+Space, Ctrl+Shift+A, Ctrl+Shift+R, Ctrl+Shift+V, Ctrl+Space, F9, F10.

Architecture

src/              → React 19 + TypeScript frontend (Vite 7)
src-tauri/src/    → Rust/Tauri 2 backend (hotkeys, tray, sidecar, text injection)
engine/           → Python 3.12 sidecar (FastAPI + uvicorn, STT/LLM providers)

Tauri spawns the Python engine as a sidecar process. They communicate over HTTP on localhost (dynamic port).

Building from Source

Prerequisites

Bun (or Node.js)
Rust (stable)
Python 3.12+ with uv

Setup & Run

make setup          # Install frontend + Python dependencies
make dev            # Run full app (Tauri + Vite + Python sidecar)

To run the Python engine standalone: make engine-dev

To build a release installer: bun run tauri build

Configuration (Environment Variables)

Most users should configure Aurotype through the in-app Settings page (see Quick Start above). Environment variables are for advanced use or automation. Alternatively, use variables with the AUROTYPE_ prefix:

Variable	Default	Description
`AUROTYPE_STT_PROVIDER`	`aliyun_dashscope`	Speech-to-text provider
`AUROTYPE_LLM_PROVIDER`	`deepseek`	LLM provider for text polishing
`AUROTYPE_ALIYUN_DASHSCOPE_API_KEY`	—	Alibaba Cloud DashScope API key (for STT)
`AUROTYPE_DEEPSEEK_API_KEY`	—	DeepSeek API key
`AUROTYPE_OPENAI_API_KEY`	—	OpenAI-compatible API key
`AUROTYPE_LLM_BASE_URL`	—	Custom LLM endpoint URL
`AUROTYPE_LLM_MODEL`	—	Override LLM model name
`AUROTYPE_SYSTEM_PROMPT`	—	Custom system prompt for polishing
`AUROTYPE_LANGUAGE`	`auto`	Target language

Development

Testing

# Python
cd engine && uv run pytest ../tests/ -v

# TypeScript
bunx tsc --noEmit

# Rust
cd src-tauri && cargo test
cd src-tauri && cargo clippy -- -D warnings

CI

GitHub Actions runs on every push to main and on pull requests: Python tests, TypeScript type check, Rust check + clippy.

License

This project is licensed under the GNU General Public License v3.0.

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
.agents/skills		.agents/skills
.github/workflows		.github/workflows
.vscode		.vscode
assets		assets
docs		docs
engine		engine
public		public
src-tauri		src-tauri
src		src
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
.python-version		.python-version
.release-please-manifest.json		.release-please-manifest.json
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
bun.lock		bun.lock
float.html		float.html
index.html		index.html
package.json		package.json
release-please-config.json		release-please-config.json
skills-lock.json		skills-lock.json
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Features

Installation

Quick Start

Hotkey

Architecture

Building from Source

Prerequisites

Setup & Run

Configuration (Environment Variables)

Development

Testing

CI

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Features

Installation

Quick Start

Hotkey

Architecture

Building from Source

Prerequisites

Setup & Run

Configuration (Environment Variables)

Development

Testing

CI

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages