Commit 71256c0: Update README.md
1 parent 3013af8


docs/whisper_transcription/README.md

Lines changed: 65 additions & 29 deletions
@@ -103,25 +103,6 @@ curl -X POST http://<YOUR_IP>:8000/transcribe \
   -F "max_speakers=2"
 ```
 
-### Start docker
-
-```bash
-docker run --gpus all -p 8000:8000 whisper_api
-```
-
-### Example Docker Call
-
-```bash
-curl -X POST http://<YOUR_IP>:8000/transcribe \
-
-  -F "model=medium" \
-  -F "summary=true" \
-  -F "speaker=true" \
-  -F "denoise=false" \
-  -F "streaming=true" \
-  -F "hf_token=hf_xxx" \
-  -F "max_speakers=2"
-```
 
 ### Start Blueprint Deployment
 In the deployment part of Blueprint, add a recipe such as the following:
@@ -130,7 +111,7 @@ In the deployment part of Blueprint, add a recipe such as the following:
   "recipe_id": "whisper transcription",
   "recipe_mode": "service",
   "deployment_name": "whisper-transcription-a10",
-  "recipe_image_uri": "iad.ocir.io/iduyx1qnmway/corrino-devops-repository:whisper_transcription_v2",
+  "recipe_image_uri": "iad.ocir.io/iduyx1qnmway/corrino-devops-repository:whisper_transcription_v6",
   "recipe_node_shape": "VM.GPU.A10.2",
   "recipe_replica_count": 1,
   "recipe_container_port": "8000",
@@ -142,25 +123,80 @@ In the deployment part of Blueprint, add a recipe such as the following:
 }
 
 ```
+#### Endpoint
+
+```
+POST https://<YOUR_DEPLOYMENT>.nip.io/transcribe
+```
+
+**Example:**
+```
+https://whisper-transcription-a10-6666.130-162-199-33.nip.io/transcribe
+```
+
+---
+
+#### Parameters
+
+| Parameter | Type | Description |
+|---|---|---|
+| `audio_url` | `string` | URL to a Pre-Authenticated Request (PAR) of the audio file stored in OCI Object Storage. |
+| `model` | `string` | Whisper model name to use (`base`, `medium`, `turbo`, etc.). |
+| `summary` | `bool` | Whether to generate a summary at the end. If `true` and no custom model path is provided, `mistralai/Mistral-7B-Instruct-v0.1` will be loaded from Hugging Face. Requires `hf_token`. |
+| `speaker` | `bool` | Whether to enable speaker diarization. Requires `hf_token`. If `false`, all segments will be labeled as "Speaker 1". |
+| `max_speakers` | `int` | (Optional) Helps improve diarization accuracy by specifying the expected number of speakers. |
+| `denoise` | `bool` | (Optional) Apply basic denoising to improve quality in noisy recordings. |
+| `streaming` | `bool` | (Optional) Enable real-time log streaming for transcription chunks and progress updates. |
+| `hf_token` | `string` | Hugging Face token, required for loading models like Mistral or enabling speaker diarization. |
 
-### Example Blueprint
+---
+
+#### Example `curl` Command
 
 ```bash
-curl -L -X POST http://<YOUR_IP>:8000/transcribe \
-
-  -F "model=medium" \
+curl -k -N -L -X POST https://<YOUR_DEPLOYMENT>.nip.io/transcribe \
+  -F "audio_url=<YOUR_PAR_URL>" \
+  -F "model=turbo" \
   -F "summary=true" \
   -F "speaker=true" \
-  -F "denoise=false" \
   -F "streaming=true" \
-  -F "hf_token=hf_xxx" \
+  -F "denoise=false" \
+  -F "hf_token=hf_xxxxxxxxxxxxxxx" \
   -F "max_speakers=2"
 ```
-In case you want to see the streaming, you need to:
+
+---
+
+#### Real-Time Log Streaming
+
+If `streaming=true`, the API will return:
+
+```json
+{
+  "meta": "logfile_name",
+  "logfile": "transcription_log_remote_audio_<timestamp>.log"
+}
+```
+
+To stream logs in real time (in another terminal):
+
 ```bash
-curl -N http://<YOUR_IP>:8000/stream_log/<YOUR_Log_Name>1.log
+curl -N https://<YOUR_DEPLOYMENT>.nip.io/stream_log/transcription_log_remote_audio_<timestamp>.log
 ```
-Once the curl command is running, the log name is printed.
+
+**Example:**
+
+```bash
+curl -N https://whisper-transcription-a10-6666.130-162-199-33.nip.io/stream_log/transcription_log_remote_audio_20250604_020250.log
+```
+
+This shows chunk-wise transcription output live, followed by the summary at the end.
+
+---
+
+#### Non-Streaming Mode
+
+If `streaming=false`, the API will return the entire transcription (and summary if requested) in a single JSON response when processing is complete.
 
 ---
 
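The form fields in the example `curl` command above map one-to-one onto multipart fields. A minimal Python sketch of the non-streaming call, assuming the `requests` library is available; the endpoint, PAR URL, and token are placeholders, and `build_transcribe_fields`/`transcribe` are hypothetical helper names, not part of the API:

```python
def build_transcribe_fields(audio_url, hf_token, model="turbo", summary=True,
                            speaker=True, denoise=False, streaming=False,
                            max_speakers=2):
    """Assemble the multipart form fields documented in the Parameters table."""
    return {
        "audio_url": audio_url,
        "model": model,
        "summary": str(summary).lower(),   # booleans are sent as "true"/"false"
        "speaker": str(speaker).lower(),
        "denoise": str(denoise).lower(),
        "streaming": str(streaming).lower(),
        "hf_token": hf_token,
        "max_speakers": str(max_speakers),
    }

def transcribe(base_url, fields):
    """POST the fields as multipart form data, mirroring curl -F."""
    import requests  # pip install requests
    resp = requests.post(
        f"{base_url}/transcribe",
        files={k: (None, v) for k, v in fields.items()},  # (None, v) = plain form field
        verify=False,   # mirrors curl's -k for the nip.io certificate
        timeout=3600,
    )
    resp.raise_for_status()
    return resp.json()  # full transcription (and summary) in one JSON response

fields = build_transcribe_fields("<YOUR_PAR_URL>", "hf_xxxxxxxxxxxxxxx")
# result = transcribe("https://<YOUR_DEPLOYMENT>.nip.io", fields)
```

With `streaming=False` this returns the single JSON response described under Non-Streaming Mode.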

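In streaming mode, the first response only names the logfile; the `/stream_log/<logfile>` URL has to be built from it. A small sketch, assuming the `{"meta": ..., "logfile": ...}` response shape shown above (`stream_log_url` is a hypothetical helper):

```python
import json

def stream_log_url(base_url, transcribe_response):
    """Build the /stream_log URL from the initial streaming-mode response."""
    payload = json.loads(transcribe_response)
    if payload.get("meta") != "logfile_name":
        raise ValueError("unexpected response shape")
    return f"{base_url}/stream_log/{payload['logfile']}"

# Response shape taken from the Real-Time Log Streaming section.
resp = '{"meta": "logfile_name", "logfile": "transcription_log_remote_audio_20250604_020250.log"}'
print(stream_log_url("https://<YOUR_DEPLOYMENT>.nip.io", resp))
```

The printed URL is what you would pass to `curl -N` in the second terminal.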