
Commit 0d849dd

Updating default behavior
Fixing errors
1 parent 6a0384c commit 0d849dd

File tree

12 files changed (+153 −75 lines)


ARCHITECTURE.md

Lines changed: 14 additions & 17 deletions
@@ -11,7 +11,7 @@ OneVox uses a model-centric architecture where the backend is automatically sele
 
 | Feature | whisper.cpp | ONNX Runtime |
 |---------|-------------|--------------|
-| **Build** | Default | `--features onnx` |
+| **Build** | Default | Default (included) |
 | **Selection** | Auto (GGML models) | Auto (ONNX/Parakeet models) |
 | **Stability** | Production-ready | Experimental |
 | **Speed** | 50-200ms | Varies by model |
@@ -104,7 +104,7 @@ let transcription = model.transcribe(&audio_samples, 16000)?;
 
 **Build:**
 ```bash
-cargo build --release --features onnx
+cargo build --release  # ONNX support included by default
 ```
 
 **Implementation:** `src/models/onnx_runtime.rs` (571 lines)
@@ -195,11 +195,11 @@ pub trait ModelRuntime: Send + Sync {
 [model]
 # Backend auto-detected from model_path
 # - GGML models (ggml-*) use whisper.cpp
-# - Parakeet/ONNX models use ONNX Runtime (requires --features onnx)
+# - Parakeet/ONNX models use ONNX Runtime (included by default)
 
 model_path = "ggml-base.en"  # English-only (whisper.cpp)
 # model_path = "ggml-base"  # Multilingual, 99+ languages (whisper.cpp)
-# model_path = "parakeet-ctc-0.6b"  # ONNX model (requires --features onnx)
+# model_path = "parakeet-ctc-0.6b"  # ONNX model (included by default)
 
 # Device selection
 device = "auto"  # auto, cpu, gpu
@@ -225,7 +225,7 @@ preload = true
 - `ggml-large-v3` (2.9GB)
 - `ggml-large-v3-turbo` (1.6GB)
 
-*ONNX (requires --features onnx):*
+*ONNX (included by default):*
 - `parakeet-ctc-0.6b` - Multilingual, INT8 quantized
 
 **Switching models:**
@@ -237,11 +237,11 @@ preload = true
 
 ```toml
 [features]
-default = ["whisper-cpp", "overlay-indicator"]
+default = ["whisper-cpp", "onnx", "overlay-indicator"]
 
-# Model backends (mutually exclusive in practice, but can coexist)
-whisper-cpp = ["whisper-rs"]  # Native whisper.cpp (recommended)
-onnx = ["ort", "ort-sys", "ndarray"]  # ONNX Runtime (multilingual)
+# Model backends
+whisper-cpp = ["whisper-rs"]  # Native whisper.cpp (default)
+onnx = ["ort", "ort-sys", "ndarray"]  # ONNX Runtime (default)
 candle = ["candle-core", "candle-nn", "candle-transformers"]  # Pure Rust (experimental)
 
 # GPU acceleration (whisper-cpp only)
@@ -257,20 +257,17 @@ overlay-indicator = ["eframe", "winit"] # Visual recording indicator
 
 **Build examples:**
 ```bash
-# Default (whisper.cpp + overlay)
+# Default (includes both whisper.cpp and ONNX)
 cargo build --release
 
-# With ONNX support
-cargo build --release --features onnx
-
-# Both backends available (larger binary)
-cargo build --release --features "whisper-cpp,onnx"
+# Whisper.cpp only (minimal build)
+cargo build --release --no-default-features --features whisper-cpp
 
 # GPU-accelerated whisper.cpp (macOS)
 cargo build --release --features metal
 
-# ONNX + TUI
-cargo build --release --features "onnx,tui"
+# GPU-accelerated with ONNX
+cargo build --release --features "metal"
 ```
 
 ## Design Principles
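For illustration, the backend auto-detection rule these docs describe (GGML-prefixed model names select whisper.cpp; Parakeet and other ONNX names select ONNX Runtime) can be sketched in a few lines of Rust. The `Backend` enum and `detect_backend` function are hypothetical names for this sketch, not OneVox's actual API:

```rust
// Sketch of the auto-detection rule from the config comments above.
// Names are illustrative, not the real OneVox types.
#[derive(Debug, PartialEq)]
enum Backend {
    WhisperCpp,
    OnnxRuntime,
}

fn detect_backend(model_path: &str) -> Backend {
    // GGML models (ggml-*) use whisper.cpp; Parakeet/ONNX model names
    // fall through to ONNX Runtime, which is now built in by default.
    if model_path.starts_with("ggml-") {
        Backend::WhisperCpp
    } else {
        Backend::OnnxRuntime
    }
}

fn main() {
    assert_eq!(detect_backend("ggml-base.en"), Backend::WhisperCpp);
    assert_eq!(detect_backend("parakeet-ctc-0.6b"), Backend::OnnxRuntime);
}
```

Because selection keys off the model name alone, switching backends is purely a config change (`model_path`), never a rebuild.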

CONTRIBUTING.md

Lines changed: 5 additions & 6 deletions
@@ -47,8 +47,8 @@ cargo test
 6. **Link issues** - Reference any related issues
 
 **For model backend changes:**
-- Test both `cargo build --release` and `cargo build --release --features onnx`
-- Verify existing tests pass with both backends
+- Add appropriate tests for both whisper.cpp and ONNX models
+- Verify existing tests pass with both backends (both included by default)
 - Update ARCHITECTURE.md if behavior changes
 
 ## Areas for Contribution
@@ -118,14 +118,13 @@ cargo test --features onnx
 When contributing changes that affect model backends:
 
 ```bash
-# Test default backend (whisper.cpp)
+# Test default build (includes both whisper.cpp and ONNX)
 cargo build --release
 cargo test
 ./target/release/onevox daemon --foreground
 
-# Test ONNX backend (if applicable)
-cargo build --release --features onnx
-cargo test --features onnx
+# Test ONNX models specifically
+cargo test --release
 # Edit config: model_path = "parakeet-ctc-0.6b"
 ./target/release/onevox daemon --foreground
 
Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ enigo = "0.6" # Keyboard/mouse simulation for text injection
 libc = "0.2"
 
 [features]
-default = ["whisper-cpp", "overlay-indicator"]  # Native whisper.cpp is the primary backend
+default = ["whisper-cpp", "onnx", "overlay-indicator"]  # Include both backends by default
 
 # Model backends
 whisper-cpp = ["whisper-rs"]

DEVELOPMENT.md

Lines changed: 25 additions & 7 deletions
@@ -10,15 +10,33 @@ cargo build --release
 
 ## Build
 
-### Default Backend (whisper.cpp)
+```bash
+# Debug build (includes ONNX by default)
+cargo build
+
+# Release build (includes ONNX by default)
+cargo build --release
 
-**First Build (macOS only):**
+# Minimal build (whisper.cpp only, no ONNX)
+cargo build --release --no-default-features --features whisper-cpp
+```
 
-macOS requires environment variables on first build:
+### ONNX Runtime Support
+
+ONNX support is **included by default** in all builds. This enables the Parakeet model and other ONNX models.
+
+**What's included:**
+- Downloads ONNX Runtime binaries automatically (~150MB) via `ort-sys`
+- Builds ONNX inference backend
+- Enables ONNX model support (Parakeet, etc.)
+- Increases binary size by ~30MB
 
+**Testing ONNX:**
 ```bash
-CC=clang CXX=clang++ SDKROOT=$(xcrun --show-sdk-path) MACOSX_DEPLOYMENT_TARGET=13.0 \
-  cargo build --release
+# ONNX is available by default
+./target/release/onevox config init
+# Edit config.toml: model_path = "parakeet-ctc-0.6b"
+./target/release/onevox daemon --foreground
 ```
 
 **Why?** whisper.cpp compiles from source and needs proper SDK paths.
@@ -139,7 +157,7 @@ src/
 ├── vad/                   # Voice Activity Detection
 ├── models/                # Transcription models
 │   ├── whisper_cpp.rs     # whisper.cpp backend (default)
-│   ├── onnx_runtime.rs    # ONNX Runtime backend (--features onnx)
+│   ├── onnx_runtime.rs    # ONNX Runtime backend (default)
 │   ├── whisper_candle.rs  # Pure Rust backend (experimental)
 │   └── runtime.rs         # ModelRuntime trait
 ├── platform/              # Platform-specific
@@ -160,7 +178,7 @@ scripts/ # Installation and packaging scripts
 
 **Core:**
 - `whisper-rs` - Native whisper.cpp bindings (default backend)
-- `ort` + `ort-sys` - ONNX Runtime bindings (optional, `--features onnx`)
+- `ort` + `ort-sys` - ONNX Runtime bindings (default, included in all builds)
 - `handy-keys` - Global hotkey detection
 - `cpal` - Cross-platform audio
 - `enigo` - Text injection

INSTALLATION.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ OneVox is available in two build configurations:
 - **Latency**: Varies by model
 - **Models**: Parakeet, custom ONNX models
 
-**Installation**: Build from source with `--features onnx` flag (see [Build from Source](#build-from-source) below)
+**Installation**: Build from source (ONNX support included by default, see [Build from Source](#build-from-source) below)
 
 ---
 
@@ -408,7 +408,7 @@ After building, edit your config file to select a model:
 # Backend is auto-detected from model_path
 model_path = "ggml-base.en"  # English-only (whisper.cpp)
 # model_path = "ggml-base"  # Multilingual (whisper.cpp, 99+ languages)
-# model_path = "parakeet-ctc-0.6b"  # ONNX model (requires --features onnx build)
+# model_path = "parakeet-ctc-0.6b"  # ONNX model (included by default)
 
 device = "auto"  # auto, cpu, gpu
 preload = true

QUICKREF.md

Lines changed: 2 additions & 2 deletions
@@ -100,7 +100,7 @@ sample_rate = 16000
 # Model identifier (backend auto-detected from path)
 model_path = "ggml-base.en"  # English-only, ~142MB
 # model_path = "ggml-base"  # Multilingual (99+ languages)
-# model_path = "parakeet-ctc-0.6b"  # ONNX (requires --features onnx)
+# model_path = "parakeet-ctc-0.6b"  # ONNX (included by default)
 
 device = "auto"  # auto, cpu, gpu
 preload = true  # Load model at startup
@@ -109,7 +109,7 @@ preload = true # Load model at startup
 **Available Models:**
 - **English-only**: `ggml-tiny.en` (75MB), `ggml-base.en` (142MB), `ggml-small.en` (466MB), `ggml-medium.en` (1.5GB)
 - **Multilingual**: `ggml-tiny` (75MB), `ggml-base` (142MB), `ggml-small` (466MB), `ggml-medium` (1.5GB), `ggml-large-v2/v3` (2.9GB), `ggml-large-v3-turbo` (1.6GB)
-- **ONNX**: `parakeet-ctc-0.6b` (multilingual, INT8, requires `--features onnx` build)
+- **ONNX**: `parakeet-ctc-0.6b` (multilingual, INT8, included by default)
 
 Multilingual models automatically detect the spoken language. Backend is auto-selected based on model name.
 
README.md

Lines changed: 5 additions & 5 deletions
@@ -76,11 +76,11 @@ cargo build --release
 - Alternative models (Parakeet CTC, etc.)
 - INT8 quantization for faster inference
 - ~250MB memory usage
-- Requires `--features onnx` flag
+- Included by default (no special flags needed)
 
 ```bash
-# Build with ONNX support
-cargo build --release --features onnx
+# Build (includes ONNX support by default)
+cargo build --release
 ```
 
 Backend selection is automatic based on model choice (see Configuration below).
@@ -159,7 +159,7 @@ OneVox uses a model-centric architecture where the backend is automatically sele
 - Alternative models with INT8 quantization
 - CPU-optimized inference
 - ~250MB memory usage
-- Requires `--features onnx` build flag
+- Included by default in all builds
 
 **Model Selection:**
 ```toml
@@ -168,7 +168,7 @@ OneVox uses a model-centric architecture where the backend is automatically sele
 # Backend is auto-detected from model_path
 model_path = "ggml-base.en"  # Uses whisper.cpp, English-only
 # model_path = "ggml-base"  # Uses whisper.cpp, multilingual (auto-detect language)
-# model_path = "parakeet-ctc-0.6b"  # Uses ONNX Runtime (requires --features onnx)
+# model_path = "parakeet-ctc-0.6b"  # Uses ONNX Runtime (included by default)
 device = "auto"  # or "cpu", "gpu"
 preload = true
 ```

config.example.toml

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ adaptive = true
 # - ggml-large-v3 (~2.9GB, 99+ languages)
 # - ggml-large-v3-turbo (~1.6GB, 99+ languages, faster)
 #
-# Available ONNX models (requires --features onnx build):
+# Available ONNX models (included by default):
 # - parakeet-ctc-0.6b (multilingual, 100+ languages, 15-25x RT, INT8 quantized)
 #
 # Backend auto-detection:

src/daemon/lifecycle.rs

Lines changed: 24 additions & 4 deletions
@@ -150,12 +150,32 @@ impl Lifecycle {
                 break;
             }
             Err(e) => {
+                let error_msg = e.to_string();
+
+                // Check if this is a model-related error (missing model file)
+                let is_model_error = error_msg.contains("Model file not found")
+                    || error_msg.contains("Model not found")
+                    || error_msg.contains("Download GGML models")
+                    || error_msg.contains("Model download incomplete");
+
                 if retry_count == 0 {
                     error!("Failed to create dictation engine: {}", e);
-                    error!("⚠️  This is usually a permission issue. Please grant:");
-                    error!("   1. Input Monitoring permission");
-                    error!("   2. Accessibility permission");
-                    error!("   Then restart: launchctl kickstart -k gui/$(id -u)/com.onevox.daemon");
+
+                    // Only show permission hints for non-model errors
+                    if !is_model_error {
+                        error!("⚠️  This is usually a permission issue. Please grant:");
+                        error!("   1. Input Monitoring permission");
+                        error!("   2. Accessibility permission");
+                        error!("   Then restart: launchctl kickstart -k gui/$(id -u)/com.onevox.daemon");
+                    }
+                }
+
+                // Don't retry for model errors - they won't fix themselves
+                if is_model_error {
+                    error!("❌ Cannot start without a valid model");
+                    error!("   Daemon will continue running but dictation won't work");
+                    error!("   Download a model and restart the daemon");
+                    break;
                 }
 
                 retry_count += 1;
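The error classification this hunk introduces can be pulled out as a standalone sketch: a model error is detected by substring-matching the error message, and such errors skip both the permission hints and the retry loop. The free-function form of `is_model_error` below is for illustration; the committed code inlines the same checks in the retry loop:

```rust
// Sketch of the model-error classification from the diff above. The
// matched substrings come from the committed code; extracting them into
// a function is illustrative, not what lifecycle.rs actually does.
fn is_model_error(error_msg: &str) -> bool {
    error_msg.contains("Model file not found")
        || error_msg.contains("Model not found")
        || error_msg.contains("Download GGML models")
        || error_msg.contains("Model download incomplete")
}

fn main() {
    // A missing model: no permission hints, no retry (immediate break).
    assert!(is_model_error("Model file not found: parakeet-ctc-0.6b"));
    // Any other failure still gets the permission guidance and retries.
    assert!(!is_model_error("failed to open event tap"));
}
```

Retrying only makes sense for transient failures (e.g. permissions granted mid-run); a missing model file never resolves on its own, hence the early `break`.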
