
Commit 1eed450 (parent: 3325935)

fix(responses): remove from vision

2 files changed: +30 −63 lines

macros/ai/chat-comp-vs-responses-api.mdx

Lines changed: 8 additions & 2 deletions
```diff
@@ -4,8 +4,14 @@ macro: chat-comp-vs-responses-api
 
 Both the [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) and the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-beta-create-a-response) are OpenAI-compatible REST APIs that can be used for generating and manipulating conversations. The Chat Completions API is focused on generating conversational responses, while the Responses API is a more general REST API for chat, structured outputs, tool use, and multimodal inputs.
 
-The **Chat Completions** API was released in 2023, and is an industry standard for building AI applications, being specifically designed for handling multi-turn conversations. It is stateless, but allows users to manage conversation history by appending each new message to the ongoing conversation. Messages in the conversation can include text, images and audio extracts. The API also supports `function` tool-calling, where the developer defines a set of functions, which the model can decide whether to call when generating a response. If it decides to call one of these functions, it returns the function name and arguments, and the developer's own code must actually execute the function and feed the result back into the conversation for use by the model.
+The **Chat Completions** API was released in 2023 and is an industry standard for building AI applications, specifically designed for handling multi-turn conversations. It is stateless, but allows users to manage conversation history by appending each new message to the ongoing conversation. Messages in the conversation can include text, images, and audio extracts. The API supports `function` tool-calling, allowing developers to define functions that the model can choose to call. If it does so, it returns the function name and arguments, which the developer's code must execute and feed back into the conversation.
 
-The **Responses** API was released in 2025, and is designed to combine the simplicity of Chat Completions with the ability to do more agentic tasks and reasoning. It supports statefulness, being able to maintain context without needing to resend the entire conversation history. It offers tool-calling by built-in tools (e.g. web or file search) that the model is able to execute itself while generating a response, though currently only `function` tools are supported by Scaleway. Overall, **Scaleway's support for the Responses API is currently at beta stage and the support of the full features set will be incremental**. Most of the supported Generative API models can be used with Responses API, and note that for the **`gtp-oss-120b` model, the use of the Responses API is recommended** as it will allow you to access all of its features, especially tools calling.
+The **Responses** API was released in 2025, and is designed to combine the simplicity of Chat Completions with the ability to perform more agentic tasks and reasoning. It supports statefulness, maintaining context without the need to resend the entire conversation history. It offers built-in tools (e.g. web or file search) that the model can execute itself while generating a response.
+
+<Message type="note">
+  Scaleway's support for the Responses API is currently at beta stage. Support for the full feature set will be incremental: statefulness and tools other than `function` calling are not currently supported.
+</Message>
+
+Most supported Generative API models can be used with both the Chat Completions and Responses APIs. For the **`gpt-oss-120b`** model, use of the Responses API is recommended, as it allows you to access all of its features, especially tool-calling.
 
 For full details on the differences between these APIs, see the [official OpenAI documentation](https://platform.openai.com/docs/guides/migrate-to-responses).
```
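The `function` tool-calling round trip described in the macro above (model returns a function name and arguments; the developer's code executes the function and feeds the result back) can be sketched with plain Python. This is a minimal, hedged illustration: `get_weather` is a hypothetical function, and the model's tool call is mocked as a dict so the sketch runs offline; in a real application it would come from the API response's `tool_calls`.

```python
import json

# A hypothetical local function that the model may ask the developer to run.
def get_weather(city: str) -> str:
    # A real application would query a weather service here.
    return f"Sunny in {city}"

# JSON-schema description of the function, passed to the API's `tools` parameter.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# When the model decides to call a function, the response carries the function
# name and JSON-encoded arguments (mocked here so the sketch runs offline).
tool_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}

# The developer's own code executes the function...
dispatch = {"get_weather": get_weather}
result = dispatch[tool_call["name"]](**json.loads(tool_call["arguments"]))

# ...and feeds the result back into the conversation as a `tool` role message.
tool_message = {"role": "tool", "content": result}
print(tool_message["content"])  # → Sunny in Paris
```

The dispatch-dict pattern keeps the mapping between the schema's `name` and the actual callable explicit, which matters once more than one tool is defined.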

pages/generative-apis/how-to/query-vision-models.mdx

Lines changed: 22 additions & 61 deletions
```diff
@@ -7,8 +7,6 @@ dates:
   posted: 2024-10-30
 ---
 import Requirements from '@macros/iam/requirements.mdx'
-import ChatCompVsResponsesApi from '@macros/ai/chat-comp-vs-responses-api.mdx'
-
 
 Scaleway's Generative APIs service allows users to interact with powerful vision models hosted on the platform.
 
```

```diff
@@ -18,7 +16,7 @@ Scaleway's Generative APIs service allows users to interact with powerful vision
 
 There are several ways to interact with vision models:
 - The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/generative-apis/how-to/query-vision-models/#accessing-the-playground), aiming to test models, adapt parameters, and observe how these changes affect the output in real-time.
-- [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion) or the [Responses API](https://www.scaleway.com/en/developers/api/generative-apis/#path-responses-beta-create-a-response)
+- The [Chat Completions API](https://www.scaleway.com/en/developers/api/generative-apis/#path-chat-completions-create-a-chat-completion).
 
 <Requirements />
 
```

```diff
@@ -54,10 +52,6 @@ In the example that follows, we will use the OpenAI Python client.
   Unlike traditional language models, vision models will take a content array for the user role, structuring text and images as inputs.
 </Message>
 
-### Chat Completions API or Responses API?
-
-<ChatCompVsResponsesApi />
-
 ### Installing the OpenAI SDK
 
 Install the OpenAI SDK using pip:
```
````diff
@@ -84,61 +78,28 @@ client = OpenAI(
 
 You can now create a chat completion:
 
-<Tabs id="vision-chat-completion">
-  <TabsTab label="Chat Completions API">
-    ```python
-    # Create a chat completion using the 'pixtral-12b-2409' model
-    response = client.chat.completions.create(
-        model="pixtral-12b-2409",
-        messages=[
-            {
-                "role": "user",
-                "content": [
-                    {"type": "text", "text": "What is this image?"},
-                    {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
-                ] # Vision models will take a content array with text and image_url objects.
-
-            }
-        ],
-        temperature=0.7, # Adjusts creativity
-        max_tokens=2048, # Limits the length of the output
-        top_p=0.9 # Controls diversity through nucleus sampling. You usually only need to use temperature.
-    )
-
-    # Print the generated response
-    print(response.choices[0].message.content)
-    ```
-  </TabsTab>
-  <TabsTab label="Responses API (Beta)">
-    ```python
-    from openai import OpenAI
-
-    # Create a chat completion using the 'mistral-small-3.2-24b-instruct-2506' model
-    response = client.responses.create(
-        model="mistral-small-3.2-24b-instruct-2506",
-        input=[
-            {
-                "role": "user",
-                "content": [
-                    {"type": "input_text", "text": "What is this image?"},
-                    {"type": "input_image",
-                     "image_url": "https://picsum.photos/id/32/512/512",
-                     "detail": "auto"}
-                ] # Vision models will take a content array with text and image_url objects.
-
-            }
-        ],
-        temperature=0.7, # Adjusts creativity
-        max_output_tokens=2048, # Limits the length of the output
-        top_p=0.9 # Controls diversity through nucleus sampling. You usually only need to use temperature.
-    )
+```python
+# Create a chat completion using the 'pixtral-12b-2409' model
+response = client.chat.completions.create(
+    model="pixtral-12b-2409",
+    messages=[
+        {
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "What is this image?"},
+                {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
+            ] # Vision models will take a content array with text and image_url objects.
+
+        }
+    ],
+    temperature=0.7, # Adjusts creativity
+    max_tokens=2048, # Limits the length of the output
+    top_p=0.9 # Controls diversity through nucleus sampling. You usually only need to use temperature.
+)
 
-    # Print the generated response. Here, the last output message will contain the final content.
-    # Previous outputs will contain reasoning content.
-    print(response.output[-1].content[0].text)
-    ```
-  </TabsTab>
-</Tabs>
+# Print the generated response
+print(response.choices[0].message.content)
+```
 
 This code sends messages, prompts and images, to the vision model and returns an answer based on your input. The `temperature`, `max_tokens`, and `top_p` parameters control the response's creativity, length, and diversity, respectively.
 
````
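The retained example passes the image by URL. OpenAI-compatible Chat Completions implementations also commonly accept images inline as base64 data URLs, which is useful for local files. The sketch below only builds the `messages` payload (the image bytes and `image/png` media type are placeholder assumptions, and Scaleway's support for inline images should be confirmed against its API reference before relying on it):

```python
import base64

# Stand-in for real image data, e.g. open("photo.png", "rb").read()
image_bytes = b"fake-image-bytes"
encoded = base64.b64encode(image_bytes).decode("utf-8")

# Same content-array shape as the URL-based example, but with an inline image.
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is this image?"},
        {"type": "image_url",
         "image_url": {"url": f"data:image/png;base64,{encoded}"}},
    ],
}]

# Pass `messages` to client.chat.completions.create() as in the example above.
print(messages[0]["content"][1]["image_url"]["url"][:22])  # → data:image/png;base64,
```

Inline data URLs avoid hosting the image publicly, at the cost of a larger request body.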