Commit 7bdddeb

fix(ai): finish updating for responses

1 parent 78ef48c

6 files changed (+226, -504 lines)

macros/ai/chat-comp-vs-responses-api.mdx

Lines changed: 1 addition & 1 deletion

````diff
@@ -6,6 +6,6 @@ Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/gener
 
 The **Chat Completions** API was released in 2023, and is an industry standard for building AI applications, being specifically designed for handling multi-turn conversations. It is stateless, but allows users to manage conversation history by appending each new message to the ongoing conversation. It supports `function` tool-calling, where the developer defines a set of functions, which the model can decide whether to call when generating a response. If it decides to call one of these functions, it returns the function name and arguments, and the developer's own code must actually execute the function and feed the result back into the conversation for use by the model.
 
-The **Responses** API was released in 2025, and is designed to combine the simplicity of Chat Completions with the ability to do more agentic tasks and reasoning. It supports statefulness, being able to maintain context without needing to resend the entire conversation history. It offers tool-calling with built-in tools (e.g. web or file search) that the model is able to execute itself while generating a response, though currently only `function` tools are supported by Scaleway. Overall, Scaleway's support for the Responses API is currently at beta stage. All supported Generative API models can be used with the Responses API, and note that for the `gpt-oss-120b` model, only the Responses API will allow you to access all of its features.
+The **Responses** API was released in 2025, and is designed to combine the simplicity of Chat Completions with the ability to do more agentic tasks and reasoning. It supports statefulness, being able to maintain context without needing to resend the entire conversation history. It offers tool-calling with built-in tools (e.g. web or file search) that the model is able to execute itself while generating a response, though currently only `function` tools are supported by Scaleway. Overall, **Scaleway's support for the Responses API is currently at beta stage**. All supported Generative API models can be used with the Responses API, and note that for the `gpt-oss-120b` model, only the Responses API will allow you to access all of its features.
 
 For full details on the differences between these APIs, see the [official OpenAI documentation](https://platform.openai.com/docs/guides/migrate-to-responses).
````
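To make the `function` tool-calling flow described in this macro concrete, here is a minimal sketch against the Scaleway endpoint used elsewhere in these docs. The `get_weather` tool and its schema are hypothetical illustrations, not part of the committed content:

```python
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",  # Scaleway's Generative APIs service URL
    api_key="<SCW_API_KEY>",                # Your unique API key from Scaleway
)

# Hypothetical function schema: the model may decide to call it, but never runs it itself.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # The model only returns the function name and arguments; the developer's
    # own code must execute the function and append the result as a "tool"
    # message before asking the model to continue the conversation.
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```

Under the Responses API, by contrast, built-in tools (e.g. web or file search) would be executed by the model itself, though per the text above only `function` tools are currently supported on Scaleway.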

pages/generative-apis/how-to/query-language-models.mdx

Lines changed: 7 additions & 5 deletions

````diff
@@ -82,15 +82,15 @@ You can now create a chat completion using either the Chat Completions or Respon
         model="llama-3.1-8b-instruct",
         messages=[{"role": "user", "content": "Describe a futuristic city with advanced technology and green energy solutions."}],
         temperature=0.2, # Adjusts creativity
-        max_tokens=100, # Limits the length of the output
+        max_completion_tokens=100, # Limits the length of the output
         top_p=0.7 # Controls diversity through nucleus sampling. You usually only need to use temperature.
     )
 
     # Print the generated response
     print(response.choices[0].message.content)
     ```
 
-    This code sends a message to the model and returns an answer based on your input. The `temperature`, `max_tokens`, and `top_p` parameters control the response's creativity, length, and diversity, respectively.
+    This code sends a message to the model and returns an answer based on your input. The `temperature`, `max_completion_tokens`, and `top_p` parameters control the response's creativity, length, and diversity, respectively.
 
   </TabsTab>
 
@@ -129,6 +129,8 @@ A conversation style may include a default system prompt. You may set this promp
 ]
 ```
 
+Adding such a system prompt can also help resolve issues if you receive responses such as `I'm not sure what tools are available to me. Can you please provide a library of tools that I can use to generate a response?`.
+
 ### Model parameters and their effects
 
 The following parameters will influence the output of the model:
@@ -139,7 +141,7 @@ The following parameters will influence the output of the model:
 
 - **`messages`**: A list of message objects that represent the conversation history. Each message should have a `role` (e.g., "system", "user", "assistant") and `content`.
 - **`temperature`**: Controls the output's randomness. Lower values (e.g., 0.2) make the output more deterministic, while higher values (e.g., 0.8) make it more creative.
-- **`max_tokens`**: The maximum number of tokens (words or parts of words) in the generated output.
+- **`max_completion_tokens`**: The maximum number of tokens (words or parts of words) in the generated output.
 - **`top_p`**: Recommended for advanced use cases only. You usually only need to use temperature. `top_p` controls the diversity of the output, using nucleus sampling, where the model considers the tokens with top probabilities until the cumulative probability reaches `top_p`.
 - **`stop`**: A string or list of strings where the model will stop generating further tokens. This is useful for controlling the end of the output.
 
@@ -210,7 +212,7 @@ The service also supports asynchronous mode for any chat completion.
     )
 
     async def main():
-        stream = await client.chat.completions.create(
+        stream = client.chat.completions.create(
             model="llama-3.1-8b-instruct",
             messages=[{
                 "role": "user",
@@ -237,7 +239,7 @@ The service also supports asynchronous mode for any chat completion.
 
     async def main():
         stream = await client.responses.create(
-            model="llama-3.1-8b-instruct",
+            model="gpt-oss-120b",
             input=[{
                 "role": "user",
                 "content": "Sing me a song"
````

pages/generative-apis/how-to/query-vision-models.mdx

Lines changed: 49 additions & 131 deletions

````diff
@@ -109,15 +109,10 @@ You can now create a chat completion:
     print(response.choices[0].message.content)
     ```
   </TabsTab>
-  <TabsTab label="Responses API">
+  <TabsTab label="Responses API (Beta)">
     ```python
     from openai import OpenAI
 
-    # Initialize the client with your base URL and API key
-    client = OpenAI(
-        base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
-        api_key="<SCW_SECRET_KEY>" # Your unique API secret key from Scaleway
-    )
     # Create a chat completion using the 'mistral-small-3.2-24b-instruct-2506' model
     response = client.responses.create(
         model="mistral-small-3.2-24b-instruct-2506",
@@ -169,7 +164,7 @@ To encode Base64 images in Python, you first need to install `Pillow` library:
 pip install pillow
 ```
 
-Then, the following Python code sample shows you how to encode an image in Base64 format and pass it to your request payload:
+Then, the following Python code sample shows you how to encode an image in Base64 format and pass it to a request payload for the Chat Completions API:
 
 ```python
 import base64
@@ -207,9 +202,9 @@ payload = {
 
 ```
 
-### Model parameters and their effects
+### Model parameters and their effects
 
-The following parameters will influence the output of the model:
+When using the Chat Completions API, the following parameters will influence the output of the model:
 
 - **`messages`**: A list of message objects that represent the conversation history. Each message should have a `role` (e.g., "system", "user", "assistant") and `content`. The content is an array that can contain text and/or image objects.
 - **`temperature`**: Controls the output's randomness. Lower values (e.g., 0.2) make the output more deterministic, while higher values (e.g., 0.8) make it more creative.
@@ -225,142 +220,65 @@ The following parameters will influence the output of the model:
 
 By default, the outputs are returned to the client only after the generation process is complete. However, a common alternative is to stream the results back to the client as they are generated. This is particularly useful in chat applications, where it allows the client to view the results incrementally as each token is produced.
 
-Examples are provided below:
-
-<Tabs id="vision-streaming">
-  <TabsTab label="Chat Completions API">
-    ```python
-    from openai import OpenAI
-
-    client = OpenAI(
-        base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
-        api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
-    )
-    response = client.chat.completions.create(
-        model="pixtral-12b-2409",
-        messages=[{
-            "role": "user",
-            "content": [
-                {"type": "text", "text": "What is this image?"},
-                {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
-            ]
-        }],
-        stream=True,
-    )
-
-    for chunk in response:
-        if chunk.choices and chunk.choices[0].delta.content:
-            print(chunk.choices[0].delta.content, end="")
-    ```
-  </TabsTab>
-  <TabsTab label="Responses API">
-    ```python
-    from openai import OpenAI
-
-    client = OpenAI(
-        base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
-        api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
-    )
-
-    # Stream a response from the vision model
-    with client.responses.stream(
-        model="pixtral-12b-2409",
-        input=[
-            {
-                "role": "user",
-                "content": [
-                    {"type": "text", "text": "What is this image?"},
-                    {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
-                ]
-            }
-        ]
-    ) as stream:
-        for event in stream:
-            # Print incremental text as it arrives
-            if event.type == "response.output_text.delta":
-                print(event.delta, end="")
-
-    # Optionally, get the final aggregated response
-    final_response = stream.get_final_response()
-    print("\nFinal output:\n", final_response.output_text)
-    ```
-  </TabsTab>
-</Tabs>
+An example for the Chat Completions API is provided below:
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
+    api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
+)
+response = client.chat.completions.create(
+    model="pixtral-12b-2409",
+    messages=[{
+        "role": "user",
+        "content": [
+            {"type": "text", "text": "What is this image?"},
+            {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
+        ]
+    }],
+    stream=True,
+)
+
+for chunk in response:
+    if chunk.choices and chunk.choices[0].delta.content:
+        print(chunk.choices[0].delta.content, end="")
+```
 
 
 ## Async
 
-The service also supports asynchronous mode for any chat completion.
-
-<Tabs id="vision-async">
-  <TabsTab label="Chat Completions API">
-    ```python
-
-    import asyncio
-    from openai import AsyncOpenAI
-
-    client = AsyncOpenAI(
-        base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
-        api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
-    )
-
-    async def main():
-        stream = await client.chat.completions.create(
-            model="pixtral-12b-2409",
-            messages=[{
-                "role": "user",
-                "content": [
-                    {"type": "text", "text": "What is this image?"},
-                    {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
-                ]
-            }],
-            stream=True,
-        )
-        async for chunk in stream:
-            if chunk.choices and chunk.choices[0].delta.content:
-                print(chunk.choices[0].delta.content, end="")
-
-    asyncio.run(main())
-    ```
-  </TabsTab>
-  <TabsTab label="Responses API">
-    ```python
-    import asyncio
-    from openai import AsyncOpenAI
-
-    client = AsyncOpenAI(
-        base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
-        api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
-    )
-
-    async def main():
-        # Stream a response from the vision model
-        async with client.responses.stream(
-            model="pixtral-12b-2409",
-            input=[
-                {
-                    "role": "user",
-                    "content": [
-                        {"type": "text", "text": "What is this image?"},
-                        {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
-                    ]
-                }
-            ]
-        ) as stream:
-            async for event in stream:
-                # Print incremental text as it arrives
-                if event.type == "response.output_text.delta":
-                    print(event.delta, end="")
-
-            # Optionally, get the final aggregated response
-            final_response = await stream.get_final_response()
-            print("\nFinal output:\n", final_response.output_text)
-
-    asyncio.run(main())
-    ```
-  </TabsTab>
-</Tabs>
+The service also supports asynchronous mode for any chat completion. An example for the Chat Completions API is provided below:
+
+```python
+
+import asyncio
+from openai import AsyncOpenAI
+
+client = AsyncOpenAI(
+    base_url="https://api.scaleway.ai/v1", # Scaleway's Generative APIs service URL
+    api_key="<SCW_API_KEY>" # Your unique API key from Scaleway
+)
+
+async def main():
+    stream = await client.chat.completions.create(
+        model="pixtral-12b-2409",
+        messages=[{
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "What is this image?"},
+                {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
+            ]
+        }],
+        stream=True,
+    )
+    async for chunk in stream:
+        if chunk.choices and chunk.choices[0].delta.content:
+            print(chunk.choices[0].delta.content, end="")
+
+asyncio.run(main())
+```
 
 ## Frequently Asked Questions
````
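The Base64 hunk above cuts off at `import base64`; a minimal sketch of the encoding step it introduces, assuming Pillow is installed and using a placeholder file name, could look like this:

```python
import base64
from io import BytesIO

from PIL import Image

# Placeholder path; substitute any local image file
image = Image.open("my_image.jpg")

# Re-encode the image to JPEG bytes and wrap them in a Base64 data URL
buffer = BytesIO()
image.save(buffer, format="JPEG")
encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")

content = [
    {"type": "text", "text": "What is this image?"},
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
]
```

The resulting `content` array drops into the `messages` payload exactly like the URL-based examples in this file's Chat Completions snippets.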