# Tutorial: Whisper Transcription API in vLLM Production Stack

## Overview

This tutorial introduces the newly added `/v1/audio/transcriptions` endpoint in the `vllm-router`, enabling users to transcribe `.wav` audio files using OpenAI’s `whisper-small` model.

## Prerequisites

* Access to a machine with a GPU (e.g., via [RunPod](https://runpod.io/))
* Python 3.12 environment (recommended with `uv`)
* `vllm` and `production-stack` cloned and installed
* `vllm` installed with audio support:

```bash
pip install vllm[audio]
```

## 1. Serving the Whisper Model

Start a vLLM backend with the `whisper-small` model:

```bash
vllm serve openai/whisper-small \
  --task transcription \
  --host 0.0.0.0 --port 8002
```

## 2. Running the Router

Create and run a router connected to the Whisper backend:

```bash
#!/bin/bash
if [[ $# -ne 2 ]]; then
  echo "Usage: $0 <router_port> <backend_url>"
  exit 1
fi

uv run python3 -m vllm_router.app \
  --host 0.0.0.0 --port "$1" \
  --service-discovery static \
  --static-backends "$2" \
  --static-models "openai/whisper-small" \
  --static-model-types "transcription" \
  --routing-logic roundrobin \
  --log-stats \
  --engine-stats-interval 10 \
  --request-stats-window 10
```

Example usage:

```bash
./run-router.sh 8000 http://localhost:8002
```
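
Before sending audio, it can help to confirm that the router has discovered the Whisper backend. The snippet below is a minimal sketch, not part of the original tutorial: it assumes the router exposes the OpenAI-compatible `/v1/models` route on port 8000, that no API key is enforced, and that the `openai` Python package is installed.

```python
# Minimal sketch (assumption): list the models the router exposes to verify
# that the Whisper backend was discovered. Requires `pip install openai`.
from openai import OpenAI

# Placeholder key; assumes the router is not configured to require an API key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

models = client.models.list()
print([m.id for m in models.data])  # expected to include 'openai/whisper-small'
```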

## 3. Sending a Transcription Request

Use `curl` to send a `.wav` file to the transcription endpoint:

* You can test with any `.wav` audio file of your choice.

```bash
curl -v http://localhost:8000/v1/audio/transcriptions \
  -F 'file=@/path/to/audio.wav;type=audio/wav' \
  -F 'model=openai/whisper-small' \
  -F 'response_format=json' \
  -F 'language=en'
```

### Supported Parameters

| Parameter         | Description                                             |
| ----------------- | ------------------------------------------------------- |
| `file`            | Path to a `.wav` audio file                             |
| `model`           | Whisper model to use (e.g., `openai/whisper-small`)     |
| `prompt`          | *(Optional)* Text prompt to guide the transcription     |
| `response_format` | One of `json`, `text`, `srt`, `verbose_json`, or `vtt`  |
| `temperature`     | *(Optional)* Sampling temperature as a float            |
| `language`        | ISO 639-1 code (e.g., `en`, `fr`, `zh`)                 |
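
If you prefer a Python client over `curl`, the same parameters can be passed through the OpenAI SDK. The example below is a minimal sketch rather than part of the production-stack codebase: it assumes the router's endpoint is OpenAI-compatible, that the `openai` package is installed, and that the audio path is a placeholder.

```python
# Minimal sketch (assumption): call the router's /v1/audio/transcriptions
# endpoint through the OpenAI SDK. Requires `pip install openai`.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("/path/to/audio.wav", "rb") as audio_file:  # placeholder path
    transcription = client.audio.transcriptions.create(
        model="openai/whisper-small",
        file=audio_file,
        response_format="json",
        language="en",
        # prompt="Optional text to guide the transcription",
        # temperature=0.0,
    )

print(transcription.text)
```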

## 4. Sample Output

```json
{
  "text": "Testing testing testing the whisper small model testing testing testing the audio transcription function testing testing testing the whisper small model"
}
```

## 5. Notes

* The router uses extended HTTPX timeouts to support long transcription jobs (see the sketch below).
* This implementation dynamically discovers valid transcription backends and routes requests accordingly.
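
As an illustration of the first note, the sketch below shows one way extended timeouts can be configured with `httpx`. It is not the router's actual implementation; the timeout values and the helper function are assumptions for illustration only.

```python
# Illustrative sketch only (assumption): configuring a long read timeout in
# httpx so that slow transcription jobs are not cut off mid-request.
import httpx

# Allow up to 10 minutes for the backend to return a transcription.
long_timeout = httpx.Timeout(connect=10.0, read=600.0, write=60.0, pool=10.0)

async def transcribe(backend_url: str, wav_path: str) -> str:
    """Hypothetical helper: forward a .wav file to a transcription backend."""
    async with httpx.AsyncClient(timeout=long_timeout) as client:
        with open(wav_path, "rb") as f:
            files = {"file": ("audio.wav", f, "audio/wav")}
            data = {"model": "openai/whisper-small", "response_format": "json"}
            resp = await client.post(
                f"{backend_url}/v1/audio/transcriptions", files=files, data=data
            )
    resp.raise_for_status()
    return resp.json()["text"]
```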

## 6. Resources

* [PR #469 – Add Whisper Transcription API](https://github.com/vllm-project/production-stack/pull/469)
* [OpenAI Whisper GitHub](https://github.com/openai/whisper)
* [Blog: vLLM Whisper Transcription Walkthrough](https://davidgao7.github.io/posts/vllm-v1-whisper-transcription/)
