Dual-crate architecture: whisper-cpp-plus-sys (raw FFI bindings) + whisper-cpp-plus (safe idiomatic API). Sys crate stays stable while the high-level API evolves.
Maps to whisper.cpp's thread safety model:
- WhisperContext wraps whisper_context*: immutable, Send + Sync, shareable via Arc
- WhisperState wraps whisper_state*: mutable, Send only, one per thread
- FullParams: create per transcription, not Send/Sync
WhisperState holds Arc<ContextPtr> to keep the context alive independently.
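The ownership pattern described above can be sketched with stand-in types. This is a minimal illustration, not the crate's implementation: ContextPtr here holds a plain integer instead of a raw pointer, and the names model_id, create_state, transcribe, and transcribe_in_parallel are assumptions made for the example.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for the owned `whisper_context*`. A real wrapper would hold a raw
/// pointer and free it in `Drop`; a plain integer keeps this sketch runnable.
struct ContextPtr {
    model_id: u32,
}

/// Immutable after loading, so it can be shared across threads via `Arc`.
#[derive(Clone)]
struct WhisperContext {
    ptr: Arc<ContextPtr>,
}

/// Mutable per-thread decoding state. It keeps its own `Arc<ContextPtr>` so
/// the context cannot be freed while any state still refers to it.
struct WhisperState {
    ctx: Arc<ContextPtr>,
    segments: Vec<String>,
}

impl WhisperContext {
    fn new(model_id: u32) -> Self {
        Self { ptr: Arc::new(ContextPtr { model_id }) }
    }

    /// Hypothetical constructor for a per-thread state.
    fn create_state(&self) -> WhisperState {
        WhisperState { ctx: Arc::clone(&self.ptr), segments: Vec::new() }
    }
}

impl WhisperState {
    /// Placeholder for a transcription call; it only records a label.
    fn transcribe(&mut self, label: &str) {
        self.segments.push(format!("model {}: {}", self.ctx.model_id, label));
    }
}

/// One shared context, one state per worker thread.
fn transcribe_in_parallel(ctx: &WhisperContext, labels: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = labels
        .into_iter()
        .map(|label| {
            let ctx = ctx.clone(); // clones the Arc, not the model
            thread::spawn(move || {
                let mut state = ctx.create_state();
                state.transcribe(&label);
                state.segments
            })
        })
        .collect();

    // Join in spawn order so results line up with the input labels.
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Because each state holds its own Arc, a state created from a temporary context stays valid after that context handle is dropped.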
All enhancements live under src/enhanced/ and follow these rules:
- Types: Enhanced prefix (EnhancedWhisperVadProcessor, EnhancedWhisperState)
- Methods: _enhanced suffix (transcribe_with_params_enhanced)
- Module: whisper_cpp_plus::enhanced::{vad, fallback}
- VAD = preprocessing (before transcription): enhanced::vad
- Temperature fallback = transcription quality enhancement: enhanced::fallback
- Features are orthogonal: use one, both, or neither independently
Enhanced modules access FFI via pub(crate) fields (e.g., WhisperState.ptr). Raw pointers never appear in the public API.
- Enhanced methods mirror the base API shape: they are opt-in and never modify defaults
- Base types stay clean of enhancement-specific concerns
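To illustrate what a temperature fallback with quality thresholds can look like, here is a hedged, self-contained sketch. It is not the code in enhanced::fallback: the Attempt and QualityThresholds types, their fields, and the decode_with_fallback function are hypothetical, and the decoder is passed in as a closure so the example runs without a model.

```rust
/// Result of one decoding attempt. The fields are illustrative quality metrics.
#[derive(Debug, Clone, PartialEq)]
struct Attempt {
    text: String,
    avg_logprob: f32,
    compression_ratio: f32,
}

/// Thresholds an attempt must meet to be accepted (hypothetical names).
struct QualityThresholds {
    min_avg_logprob: f32,
    max_compression_ratio: f32,
}

impl QualityThresholds {
    fn accepts(&self, a: &Attempt) -> bool {
        a.avg_logprob >= self.min_avg_logprob
            && a.compression_ratio <= self.max_compression_ratio
    }
}

/// Tries each temperature in order and returns the first attempt that meets
/// the thresholds, together with the temperature used. If none passes, the
/// last attempt is returned so the caller still gets a transcript. Returns
/// `None` only when `temperatures` is empty.
fn decode_with_fallback<F>(
    temperatures: &[f32],
    thresholds: &QualityThresholds,
    mut decode: F,
) -> Option<(f32, Attempt)>
where
    F: FnMut(f32) -> Attempt,
{
    let mut last = None;
    for &t in temperatures {
        let attempt = decode(t);
        if thresholds.accepts(&attempt) {
            return Some((t, attempt));
        }
        last = Some((t, attempt));
    }
    last
}
```

Keeping the fallback loop separate from the decoder mirrors the rule that base types stay free of enhancement-specific concerns: the base transcription path is untouched, and callers opt in by going through the enhanced entry point.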
whisper-cpp-plus/src/
├── lib.rs # Public API, convenience methods on WhisperContext
├── context.rs # WhisperContext (Arc<ContextPtr>)
├── state.rs # WhisperState (ptr + Arc<ContextPtr>)
├── params.rs # FullParams, TranscriptionParams builder
├── error.rs # WhisperError enum (thiserror)
├── buffer.rs # AudioBuffer (circular, for streaming)
├── stream.rs # WhisperStream (chunk-based streaming)
├── stream_pcm.rs # WhisperStreamPcm (port of stream-pcm.cpp)
├── vad.rs # WhisperVadProcessor (Silero VAD via whisper.cpp)
├── quantize.rs # WhisperQuantize (feature = "quantization")
├── async_api.rs # spawn_blocking wrappers (feature = "async")
└── enhanced/
    ├── mod.rs
    ├── vad.rs # EnhancedWhisperVadProcessor (segment aggregation)
    └── fallback.rs # Temperature fallback with quality thresholds
whisper.cpp operations are CPU-bound. Async wrappers use tokio::task::spawn_blocking — no async whisper.cpp calls.
Zero-copy where possible: &[f32] slices are passed directly to C++ via .as_ptr() + .len(). Audio must be 16 kHz mono f32.
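The pointer-plus-length hand-off can be sketched as follows. To keep the example runnable without linking whisper.cpp, the C side is replaced by a hypothetical extern "C" function written in Rust (mock_process_audio); peak_amplitude and pcm_i16_to_f32 are likewise illustrative names, not part of the crate.

```rust
use std::os::raw::c_int;

/// Stand-in for a C function that reads `n_samples` floats from `samples`.
/// Defined in Rust here so the sketch runs without linking whisper.cpp.
extern "C" fn mock_process_audio(samples: *const f32, n_samples: c_int) -> f32 {
    if samples.is_null() || n_samples <= 0 {
        return 0.0;
    }
    // SAFETY: the caller guarantees `samples` points to `n_samples` valid f32s.
    let data = unsafe { std::slice::from_raw_parts(samples, n_samples as usize) };
    data.iter().fold(0.0_f32, |peak, s| peak.max(s.abs()))
}

/// Safe wrapper: borrows the slice and passes its pointer and length across
/// the FFI boundary. No samples are copied.
fn peak_amplitude(audio: &[f32]) -> Result<f32, String> {
    // C APIs commonly take the length as `int`, so reject oversized buffers.
    let n = c_int::try_from(audio.len())
        .map_err(|_| format!("buffer too long for c_int: {} samples", audio.len()))?;
    Ok(mock_process_audio(audio.as_ptr(), n))
}

/// Converts 16-bit PCM to f32 in [-1.0, 1.0). Unlike the call above, this
/// allocates a new buffer, so it is a one-time cost before the zero-copy path.
fn pcm_i16_to_f32(pcm: &[i16]) -> Vec<f32> {
    pcm.iter().map(|&s| f32::from(s) / 32768.0).collect()
}
```

The borrow on the slice lasts for the whole call, so the buffer cannot be freed or mutated while the foreign code reads it. Resampling to 16 kHz and downmixing to mono are not shown and would need to happen before this point.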
The cmake crate compiles the whisper.cpp sources. whisper.cpp is vendored as a git submodule at whisper-cpp-plus-sys/whisper.cpp; it is excluded from the crates.io package and downloaded on demand for consumers. Prebuilt library caching is available via cargo xtask prebuild.