# Cohere Transcribe in Rust

Rust implementation for the
[CohereLabs/cohere-transcribe-03-2026](https://huggingface.co/CohereLabs/cohere-transcribe-03-2026)
model. Includes a self-contained CLI and an OpenAI-compatible API server for AI agents.

Supports English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Greek,
Arabic, Japanese, Chinese, Vietnamese, and Korean.

---

## CLI Reference

```
USAGE:
```
Audio is automatically converted to 16 kHz mono.
Files longer than ~35 seconds are split into overlapping chunks (5 s overlap)
and the results joined automatically.

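The overlap arithmetic works out as follows. This is illustrative math only, not the tool's actual implementation, and the clip length is a made-up example:

```bash
# 35 s windows with a 5 s overlap: each new chunk starts 30 s after the last.
duration=95          # hypothetical clip length in seconds
chunk=35; overlap=5
step=$((chunk - overlap))
start=0
while [ "$start" -lt "$duration" ]; do
  end=$((start + chunk))
  if [ "$end" -gt "$duration" ]; then end=$duration; fi
  echo "chunk: ${start}s-${end}s"
  if [ "$end" -eq "$duration" ]; then break; fi
  start=$((start + step))
done
# prints: chunk: 0s-35s, chunk: 30s-65s, chunk: 60s-95s
```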
### Examples

```bash
# Transcribe a single file
./transcribe -m models/cohere-transcribe-03-2026 interview.mp3

# French, no punctuation
./transcribe -m models/cohere-transcribe-03-2026 --language fr --no-punctuation speech.wav

# Multiple files — prints filename before each transcript
./transcribe -m models/cohere-transcribe-03-2026 call1.wav call2.wav call3.flac

# Show model loading progress
./transcribe -m models/cohere-transcribe-03-2026 -v audio.wav
```

---

## API Server

### Start the server

```bash
./transcribe-server \
  --model-dir models/cohere-transcribe-03-2026 \
  --host 0.0.0.0 \
  --port 8080
```

The server loads the model at startup (~30–90 s depending on hardware), then prints a ready message.

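Once the server reports ready, any HTTP client can post audio. A minimal sketch, assuming the server mirrors OpenAI's `/v1/audio/transcriptions` route and multipart fields (the exact route is an assumption based on the advertised OpenAI compatibility; `stt_transcribe`, `STT_HOST`, and `STT_PORT` are illustrative names):

```bash
# Hypothetical helper wrapping a multipart POST against the local server.
stt_transcribe() {
  curl -sf "http://${STT_HOST:-localhost}:${STT_PORT:-8080}/v1/audio/transcriptions" \
    -F "file=@$1" \
    -F "model=cohere-transcribe-03-2026"
}
# With the server running:
# stt_transcribe recording.wav
```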
---

## Build from Source

### Backends

Two compute backends are available — select one at compile time:

| Backend | Platform | Feature flag | Accelerator |
|---------|----------|--------------|-------------|
| **libtorch** (default) | Linux x86_64, Linux aarch64 | `--features tch-backend` | CPU (BLAS-optimized) |
| **MLX** | macOS Apple Silicon | `--features mlx` | Apple GPU (Metal) |

Both backends produce identical output from the same weights.

### Requirements

**All platforms:**
- **Rust** stable (1.70+) — install from [rustup.rs](https://rustup.rs)
- **8 GB RAM** — the model weights expand to ~5.6 GB at runtime
- **Python + sentencepiece** — one-time only, to extract `vocab.json`

**Linux (libtorch backend):**
- **libtorch** C++ library — downloaded once, ~500 MB

**macOS Apple Silicon (MLX backend):**
- macOS 14+ with an M-series chip
- mlx-c is built automatically from the git submodule by `build.rs`

### Step 1 — Install libtorch (Linux only)

Pick the build for your platform and extract it to `/opt/libtorch`.
This is the C++ library from PyTorch.org; no Python runtime is involved.

**Linux x86_64:**
```bash
curl -Lo libtorch.zip \
  'https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.7.0%2Bcpu.zip'
sudo unzip libtorch.zip -d /opt
```

**Linux ARM64** (AWS Graviton3, Ampere Altra — requires SVE support):
```bash
curl -Lo libtorch.tar.gz \
  'https://github.com/second-state/libtorch-releases/releases/download/v2.7.1/libtorch-cxx11-abi-aarch64-2.7.1.tar.gz'
sudo tar xzf libtorch.tar.gz -C /opt
```

Both commands produce `/opt/libtorch/`. Set `LIBTORCH=/your/path` to use a different location.

> **Docker on macOS:** Extract libtorch to a native Linux path such as `/opt/libtorch`,
> not onto the macOS volume mount (e.g. `/Users/…`). The Linux linker cannot read
> large shared libraries through the virtiofs layer.

### Step 2 — Download model weights

```bash
pip install huggingface_hub
huggingface-cli download CohereLabs/cohere-transcribe-03-2026 \
  --local-dir models/cohere-transcribe-03-2026
```

### Step 3 — Extract the vocabulary (one time only)

The model uses a SentencePiece tokenizer. Run this script once to produce `vocab.json`,
which the Rust binary reads at runtime. Python is not needed after this step.

```bash
pip install sentencepiece
python tools/extract_vocab.py --model_dir models/cohere-transcribe-03-2026
```

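Optionally verify that the extractor produced well-formed JSON before dropping Python. A quick sketch; it checks syntax only, not the token contents:

```bash
# Parse vocab.json with Python's stdlib; a syntax error aborts with a traceback.
vocab=models/cohere-transcribe-03-2026/vocab.json
if [ -f "$vocab" ]; then
  python3 -c 'import json,sys; json.load(open(sys.argv[1])); print("vocab OK")' "$vocab"
else
  echo "missing: $vocab"
fi
```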
### Step 4 — Build

**Linux (libtorch backend, default):**
```bash
LIBTORCH=/opt/libtorch cargo build --release
```

The `LIBTORCH` path is baked into the binary's RPATH by `build.rs`, so the binary
runs without `LD_LIBRARY_PATH`.

**macOS Apple Silicon (MLX backend):**
```bash
git submodule update --init --recursive
cargo build --release --no-default-features --features mlx
```

> **Docker on macOS (Linux builds):** If the project source is on a macOS volume mount,
> set `CARGO_TARGET_DIR` to a native Linux path to prevent SIGBUS during compilation:
> ```bash
> LIBTORCH=/opt/libtorch CARGO_TARGET_DIR=/tmp/cohere_target cargo build --release -j 1
> ```

### Step 5 — Run

```bash
./target/release/transcribe --model-dir models/cohere-transcribe-03-2026 recording.wav
```

No environment variables needed — RPATH is baked in at build time.

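To confirm the path really was embedded, inspect the binary's dynamic load commands. A sketch: `readelf` ships with binutils on Linux, `otool` with the Xcode command line tools on macOS, and the printed path depends on your build environment:

```bash
# Show the embedded library search path of the built binary, if present.
bin=target/release/transcribe
if [ -x "$bin" ]; then
  case "$(uname -s)" in
    Linux)  readelf -d "$bin" | grep -E 'RPATH|RUNPATH' ;;
    Darwin) otool -l "$bin" | grep -A2 LC_RPATH ;;
  esac
else
  echo "not built yet: $bin"
fi
```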
---

## Troubleshooting

**`libtorch not found`** (Linux)
Set `LIBTORCH=/path/to/libtorch` before building, or install to `/opt/libtorch`.

**`Missing required file 'vocab.json'`**
Run `python tools/extract_vocab.py --model_dir <model_dir>`, or copy `vocab.json`
from the release zip into the model directory.

**Process killed immediately (exit 137)**
Out of memory. The model needs ~5.6 GB of RAM. Check with `free -h` (Linux) or

**`ELF section name out of range` at link time** (Linux)
libtorch is on a Docker volume-mounted macOS path. Move it to a native Linux
path such as `/opt/libtorch`.