A Rust wrapper around streaming-mode sherpa-onnx zipformer transducers.

It's quick: expect to keep pace with a realtime audio stream on 1-2 modest CPU cores.
For higher-throughput applications (many streams served on the same machine), continuous batching is fully supported and significantly improves per-stream compute utilization.
Add the dep:

```shell
cargo add sherpa-transducers
```

And use it:

```rust
use sherpa_transducers::asr;

async fn my_stream_handler() -> anyhow::Result<()> {
    let t = asr::Model::from_pretrained("nytopop/zipformer-en-2023-06-21-320ms")
        .await?
        .num_threads(2)
        .build()?;

    let mut s = t.phased_stream(1)?;

    loop {
        // use the sample rate of _your_ audio; input will be resampled automatically
        let sample_rate = 24_000;
        let audio_samples = vec![0.; 512];

        // buffer some samples to be decoded
        s.accept_waveform(sample_rate, &audio_samples);

        // actually do the decode
        s.decode();

        // get the transcript since the last reset
        let (epoch, transcript) = s.result()?;

        if transcript.contains("DELETE THIS") {
            s.reset();
        }
    }
}
```

Default features:
- `static`: compile and link `sherpa-onnx` statically
- `download-models`: enable support for loading pretrained transducers from huggingface
Features disabled by default:
- `cuda`: enable CUDA compute provider support (requires CUDA 11.8; 12.x will not bring you joy and happiness)
- `directml`: enable DirectML compute provider support (entirely untested but theoretically works)
- `download-binaries`: download `sherpa-onnx` object files instead of building it
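As a sketch, these feature flags can also be selected directly in `Cargo.toml`. The version requirement below is a placeholder, and the combination shown (defaults plus `cuda`) is just one plausible configuration:

```toml
[dependencies]
# keep the default `static` + `download-models` features and additionally
# enable the CUDA provider (assumes a CUDA 11.8 toolchain is installed)
sherpa-transducers = { version = "*", features = ["cuda"] }
```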