File tree Expand file tree Collapse file tree 2 files changed +5
-7
lines changed
managed-inference/reference-content Expand file tree Collapse file tree 2 files changed +5
-7
lines changed Original file line number Diff line number Diff line change 11---
22title : How to query audio models
33description : Learn how to interact with powerful audio models using Scaleway's Generative APIs service.
4- tags : generative-apis ai-data audio-models voxtral audio-model
4+ tags : generative-apis ai-data audio-models voxtral
55dates :
66 validation : 2025-09-22
77 posted : 2025-09-22
88---
99import Requirements from ' @macros/iam/requirements.mdx'
10- import ChatCompVsResponsesApi from ' @macros/ai/chat-comp-vs-responses-api.mdx'
1110
1211Scaleway's Generative APIs service allows users to interact with powerful audio models hosted on the platform.
1312
@@ -39,7 +38,7 @@ The web playground displays.
39384 . Click ** View code** to get code snippets configured according to your settings in the playground.
4039
4140<Message type = " tip" >
42- You can also use the upload button to send supported audio file formats, such as MP3, to the model for transcription purposes.
41+ You can also use the upload button to send supported audio file formats, such as MP3, to audio models for transcription purposes.
4342</Message >
4443
4544## Querying audio models via API
@@ -163,7 +162,6 @@ response = client.chat.completions.create(
163162)
164163
165164print (response.choices[0 ].message.content)
166-
167165```
168166
169167Various parameters such as ` temperature ` and ` max_tokens ` control the output. See the [ dedicated API documentation] ( https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion ) for a full list of all available parameters.
Original file line number Diff line number Diff line change @@ -170,7 +170,7 @@ allenai/molmo-72b-0924:fp8
170170
171171### Voxtral-small-24b-2507
172172Voxtral-small-24b-2507 is a model developed by Mistral to perform text processing and audio analysis on many languages.
173- This model was optimized to enable transcription in many languages while keeping conversational capabilities (translations, classification.. .)
173+ This model was optimized to enable transcription in many languages while keeping conversational capabilities (translations, classification, etc .)
174174
175175| Attribute | Value |
176176| -----------| -------|
@@ -186,8 +186,8 @@ mistral/voxtral-small-24b-2507:fp8
186186```
187187
188188- Mono and stereo audio formats are supported. For stereo formats, both left and right channels are merged before being processed.
189- - Audio files are processed by 30 seconds chunks:
190- - If audio sent is less than 30 seconds, the rest of a chunk will be considered silent.
189+ - Audio files are processed in 30 seconds chunks:
190+ - If audio sent is less than 30 seconds, the rest of the chunk will be considered silent.
191191 - 80ms is equal to 1 input token
192192
193193## Text models
You can’t perform that action at this time.
0 commit comments