Commit 9ac06ee

feat(genapis): add audio transcriptions api
1 parent 5ad8732 commit 9ac06ee


pages/generative-apis/how-to/query-audio-models.mdx

Lines changed: 18 additions & 16 deletions
@@ -12,7 +12,7 @@ Scaleway's Generative APIs service allows users to interact with powerful audio

 There are several ways to interact with audio models:
 - The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/generative-apis/how-to/query-language-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time.
-- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-TODO)
+- Via the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription)

 <Requirements />

@@ -21,7 +21,7 @@ There are several ways to interact with audio models:
 - A valid [API key](/iam/how-to/create-api-keys/) for API authentication
 - Python 3.7+ installed on your system

-## Accessing the Playground
+## Accessing the playground

 Scaleway provides a web playground for instruct-based models hosted on Generative APIs.

@@ -48,14 +48,14 @@ In the example that follows, we will use the OpenAI Python client.

 ### Chat Completions API or Audio Transcriptions API?

-Both the [Chat Completions API](TODO) and the [Audio Transcriptions API](TODO) are OpenAI-compatible REST APIs that accept audio input.
+Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) and the [Audio Transcriptions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription) are OpenAI-compatible REST APIs that accept audio input.

-The **Chat Completions API** is more suitable when transcribing audio input is part of a broader task, rather than pure transcription. Examples could include building a voice chat assistant which listens and responds in natural language, or sending multiple inputs (audio and text) to be interpreted and commented on. This API can be used with compatible multimodal models, such as `voxtral-small-24b`.
+The **Chat Completions API** is more suitable when transcribing audio input is part of a broader task, rather than a pure transcription task. For example, building a voice chat assistant which listens and responds in natural language, or sending multiple inputs (audio and text) to be interpreted. This API can be used for audio tasks with compatible multimodal models, such as `voxtral-small-24b`.

 The **Audio Transcriptions API** is designed for pure speech-to-text (audio transcription) tasks, such as transcribing a voice note or meeting recording file. It can be used with compatible audio models, such as `whisper-large-v3`.

 <Message type="note">
-Scaleway's support for the Audio Transcriptions API is currently at beta stage. TODO CHECK: incremental support of feature set?
+Scaleway's support for the Audio Transcriptions API is currently at beta stage. Support of the full feature set will be incremental.
 </Message>

 For full details on the differences between these APIs, see the [official OpenAI documentation](https://platform.openai.com/docs/guides/audio#choosing-the-right-api).
@@ -88,28 +88,30 @@ You can now generate a text transcription of a given audio file using a suitable

 <Tabs id="transcribing-audio">

-<TabsTab label="Audio Transcriptions API">
+<TabsTab label="Audio Transcriptions API (Beta)">
+
+<Message type="note">
+The Audio Transcriptions API expects audio files to be found locally. It does not support passing the URL of a remote audio file.
+</Message>

-from openai import OpenAI
-import os
-
-client = OpenAI(
-    base_url="https://aa2cee79-0e20-4515-8ec0-0a8084dfbd9e.ifr.fr-par.scaleway.com/v1",
-    api_key=os.getenv("SCW_SECRET_KEY") # Your unique API secret key from Scaleway
-)
+In the example below, a local audio file [scaleway-ai-revolution.mp3](https://genapi-documentation-assets.s3.fr-par.scw.cloud/scaleway-ai-revolution.mp3) is sent to the model. The resulting text transcription is printed to the screen.

+```python
 MODEL = "openai/whisper-large-v3:fp16"
-AUDIO = 'interview-jbk-62s.mp3'
+AUDIO = 'scaleway-ai-revolution.mp3'

 audio_file = open(AUDIO, "rb")

 response = client.audio.transcriptions.create(
     model=MODEL,
     file=audio_file,
-    language='fr'
+    language='en'
 )

 print(response.text)
+```
+
+See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-audio-create-an-audio-transcription) for a full list of all available parameters.

 </TabsTab>

@@ -161,7 +163,7 @@ You can now generate a text transcription of a given audio file using a suitable
 print(response.choices[0].message.content)
 ```

-Various parameters such as `temperature` and `max_tokens` control the output. See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) for a full list of all available parameters.
+See the [dedicated API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-an) for a full list of all available parameters.

 #### Transcribing a local audio file

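The Audio Transcriptions example in this diff opens the audio file with `open(AUDIO, "rb")` and never closes it. As a minimal sketch, not part of the commit, the same call could be wrapped in a helper that closes the file with a context manager. The helper name `transcribe_file` and its defaults are hypothetical. The `client` argument is assumed to be an OpenAI-compatible client exposing `audio.transcriptions.create`, configured as elsewhere in the documentation page.

```python
from pathlib import Path


def transcribe_file(client, audio_path, model="openai/whisper-large-v3:fp16", language="en"):
    """Send a local audio file for transcription and return the resulting text.

    `transcribe_file` is a hypothetical helper, not part of the commit.
    `client` is assumed to expose `audio.transcriptions.create(...)`,
    as the OpenAI Python client does.
    """
    path = Path(audio_path)
    # The API expects a local file, so fail early if the path does not exist.
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # The context manager closes the file handle once the request completes.
    with path.open("rb") as audio_file:
        response = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            language=language,
        )
    return response.text
```

With a configured client, `print(transcribe_file(client, "scaleway-ai-revolution.mp3"))` would correspond to the example added in the diff.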