Rust re-implementation of the pipecat-ai/smart-turn endpoint detector. It wraps ONNX Runtime via the ort crate and exposes a small API plus an example CLI for running predictions locally.
- Example program (`examples/basic.rs`) demonstrates loading audio, computing features, and printing prediction results.
Smart Turn expects exactly 8 seconds of 16 kHz mono PCM audio per prediction. Provide audio in WAV format with:
- Sample rate: 16 kHz (16,000 Hz)
- Channels: mono
- Duration: any length; inputs shorter than 8 seconds are zero-padded to 8 seconds, and longer clips are truncated to the most recent 8 seconds
Feeding data that violates these constraints leads to preprocessing errors or unreliable predictions.
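The padding/truncation rule above can be sketched as a small helper. `fit_to_window` is hypothetical, not part of the crate's API, and the choice to pad at the *start* (so the newest audio sits at the end of the window) is an assumption here:

```rust
// Fit a mono 16 kHz f32 sample buffer to the exact 8-second window
// Smart Turn expects. Hypothetical helper, not part of smart-turn-rs.
const SAMPLE_RATE: usize = 16_000;
const WINDOW_SECS: usize = 8;
const WINDOW_SAMPLES: usize = SAMPLE_RATE * WINDOW_SECS; // 128_000 samples

fn fit_to_window(mut samples: Vec<f32>) -> Vec<f32> {
    let len = samples.len();
    if len >= WINDOW_SAMPLES {
        // Keep only the most recent 8 seconds.
        samples.split_off(len - WINDOW_SAMPLES)
    } else {
        // Prepend silence (assumption: left-padding keeps the newest
        // audio at the end of the window).
        let mut padded = vec![0.0f32; WINDOW_SAMPLES - len];
        padded.extend(samples);
        padded
    }
}

fn main() {
    let short = fit_to_window(vec![1.0; 16_000]);  // 1 s of audio, padded
    let long = fit_to_window(vec![1.0; 160_000]);  // 10 s of audio, truncated
    println!("{} {}", short.len(), long.len()); // prints "128000 128000"
}
```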
- Ensure Rust 1.81 or newer (matching the `ort` crate requirement).
- Fetch the model weights, e.g. `smart-turn-v3.0.onnx`, from the upstream repo.
- Clone this repository and run:
```sh
cargo run --example basic -- --audio <path/to/8s_16k.wav> --model <path/to/smart-turn-v3.0.onnx>
```

The example prints the prediction, probability, real-time factor, and a JSON payload.
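Since the crate is consumed via a path or git dependency rather than crates.io, a `Cargo.toml` entry might look like the following; the path, URL, and package name are placeholders, not published coordinates:

```toml
[dependencies]
# Path dependency (adjust to wherever you cloned this repo):
smart-turn-rs = { path = "../smart-turn-rs" }
# ...or a git dependency (URL is a placeholder):
# smart-turn-rs = { git = "https://example.com/your-fork/smart-turn-rs" }
```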
Add the crate to your project (currently via path/git dependency) and call the predictor:
```rust
use smart_turn_rs::SmartTurnPredictor;
use ndarray::Array3;
use std::path::Path;

// `features` is the precomputed input, e.g. from
// smart_turn_rs::features::log_mel_spectrogram.
fn run_smart_turn(model_path: &Path, features: Array3<f32>) -> anyhow::Result<()> {
    let mut predictor = SmartTurnPredictor::new(model_path)?;
    let result = predictor.predict(features)?;
    println!("prediction: {} (p={:.3})", result.prediction, result.probability);
    Ok(())
}
```

Follows the upstream project: BSD 2-Clause License. Refer to the original pipecat-ai/smart-turn for model provenance and additional documentation.