feat(ifr): support pixtral (#3726)

tgenaitay · bene2k1 · nerda-codes · web-flow · commit d57888b0e06f · 2024-09-25T08:54:28.000+02:00
* feat(ifr): support pixtral

* feat(ifr): supported gpus and quant

* docs(ai): add navigation

* feat(ifr): corrected

* feat(ifr): fixed per review

Co-authored-by: nerda-codes <87707325+nerda-codes@users.noreply.github.com>

* feat(ifr): context

* feat(ifr): changed model name

* feat(ifr): typo

* feat(ifr): revised compatible instances for consistency

* feat(ai): format file

* feat(ai): down to 12 images

---------

Co-authored-by: Benedikt Rollik <brollik@online.net>
Co-authored-by: nerda-codes <87707325+nerda-codes@users.noreply.github.com>
diff --git a/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx
@@ -18,7 +18,7 @@ categories:
 |-----------------|------------------------------------|
 | Provider        | [Meta](https://llama.meta.com/llama3/)  |
 | Model Name      | `llama-3-8b-instruct`                 |
-| Compatible Instances | L4, H100 |
+| Compatible Instances | L4, H100 (FP8, BF16) |
 | Context size | 8192 tokens    |
 
 ## Model names
@@ -30,8 +30,12 @@ meta/llama-3-8b-instruct:fp8
 
 ## Compatible Instances
 
-- [L4](https://www.scaleway.com/en/l4-gpu-instance/)
-- [H100](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+## Compatible Instances
+
+| Instance type  | Max context length |
+| ------------- |-------------|
+| L4      | 8192 (FP8, BF16) |
+| H100      | 8192 (FP8, BF16) |
 
 ## Model introduction
 
diff --git a/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx
@@ -19,7 +19,7 @@ categories:
 | Provider        | [Meta](https://llama.meta.com/llama3/)  |
 | License        | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/)  |
 | Model Name      | `llama-3.1-70b-instruct`                 |
-| Compatible Instances | H100, H100-2 |
+| Compatible Instances | H100 (FP8), H100-2 (FP8, BF16) |
 | Context Length | up to 128k tokens    |
 
 ## Model names
diff --git a/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx b/ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx
@@ -19,7 +19,7 @@ categories:
 | Provider        | [Meta](https://llama.meta.com/llama3/)  |
 | License        | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/)  |
 | Model Name      | `llama-3.1-8b-instruct`                 |
-| Compatible Instances | L4, H100, H100-2 |
+| Compatible Instances | L4, H100, H100-2 (FP8, BF16) |
 | Context Length | up to 128k tokens |
 
 ## Model names
diff --git a/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx b/ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx
@@ -27,9 +27,11 @@ categories:
 mistral-7b-instruct-v0.3:bf16
 ```
 
-## Compatible Instance
+## Compatible Instances
 
-- [L4 (BF16)](https://www.scaleway.com/en/l4-gpu-instance/)
+| Instance type  | Max context length |
+| ------------- |-------------|
+| L4      | 32k (BF16) |
 
 ## Model introduction
 
diff --git a/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx b/ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx
@@ -27,9 +27,11 @@ categories:
 mistral-nemo-instruct-2407:fp8
 ```
 
-## Compatible Instance
+## Compatible Instances
 
-- [H100 (FP8)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+| Instance type  | Max context length |
+| ------------- |-------------|
+| H100      | 128k (FP8) |
 
 ## Model introduction
 
diff --git a/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx b/ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx
@@ -30,8 +30,10 @@ mistral/mixtral-8x7b-instruct-v0.1:fp16
 
 ## Compatible Instances
 
-- [H100-1 (FP8)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
-- [H100-2 (FP16)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+| Instance type  | Max context length |
+| ------------- |-------------|
+| H100      | 32k (FP8) |
+| H100-2      | 32k (FP16) |
 
 ## Model introduction
 
diff --git a/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx b/ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx
@@ -0,0 +1,165 @@
+---
+meta:
+  title: Understanding the Pixtral-12b-2409 model
+  description: Deploy your own secure Pixtral-12b-2409 model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1:  Understanding the Pixtral-12b-2409 model
+  paragraph: This page provides information on the Pixtral-12b-2409 model
+tags: 
+dates:
+  validation: 2024-09-23
+categories:
+  - ai-data
+---
+
+## Model overview
+
+| Attribute       | Details                            |
+|-----------------|------------------------------------|
+| Provider        | [Mistral](https://mistral.ai/technology/#models)                         |
+| Model Name      | `pixtral-12b-2409`       |
+| Compatible Instances | H100, H100-2 (BF16)                 |
+| Context size | 128k tokens    |
+
+## Model name
+
+```bash
+mistral/pixtral-12b-2409:bf16
+```
+
+## Compatible Instances
+
+| Instance type  | Max context length |
+| ------------- |-------------|
+| H100      | 128k (BF16) |
+| H100-2      | 128k (BF16) |
+
+## Model introduction
+
+Pixtral is a vision language model with a novel architecture: a 12B-parameter multimodal decoder plus a 400M-parameter vision encoder.
+It can analyze images and offer insights from visual content alongside text.
+This multimodal functionality creates new opportunities for applications that need both visual and textual comprehension.
+
+Pixtral is open-weight and distributed under the Apache 2.0 license.
+
+## Why is it useful?
+
+- Pixtral allows you to process real-world and high-resolution images, unlocking capabilities such as transcribing handwritten files or payment receipts, extracting information from graphs, captioning images, etc.
+- It offers a large context window of up to 128k tokens, which is particularly useful for RAG applications.
+- Pixtral supports variable image sizes and types: PNG (.png), JPEG (.jpeg and .jpg), WEBP (.webp), as well as non-animated GIF with only one frame (.gif).
+
+<Message type="note">
+  Pixtral 12B can understand and analyze images, but it cannot generate them. Use it through the `/v1/chat/completions` endpoint.
+</Message>
+
+## How to use it
+
+### Sending Inference requests
+
+<Message type="tip">
+  Unlike previous Mistral models, Pixtral can take an `image_url` in the content array.
+</Message>
+
+To perform inference tasks with your Pixtral model deployed at Scaleway, use the following command:
+
+```bash
+curl -s \
+-H "Authorization: Bearer <IAM API key>" \
+-H "Content-Type: application/json" \
+--request POST \
+--url "https://<Deployment UUID>.ifr.fr-par.scw.cloud/v1/chat/completions" \
+--data '{
+       "model": "mistral/pixtral-12b-2409:bf16",
+       "messages": [
+        {
+          "role": "user",
+          "content": [
+              {"type" : "text", "text": "Describe this image in detail please."},
+              {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
+              {"type" : "text", "text": "and this one as well."},
+              {"type": "image_url", "image_url": {"url": "https://www.wolframcloud.com/obj/resourcesystem/images/a0e/a0ee3983-46c6-4c92-b85d-059044639928/6af8cfb971db031b.png"}}
+          ]
+        }
+       ],
+       "top_p": 1, 
+       "temperature": 0.7, 
+       "stream": false
+}'
+```
+
+Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
+
+<Message type="tip">
+  The model name allows Scaleway to put your prompts in the expected format.
+</Message>
+
+<Message type="note">
+  Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
+</Message>
+
+### Passing images to Pixtral
+
+1. Image URLs
+If the image is available online, you can just include the image URL in your request as demonstrated above. This approach is simple and does not require any encoding.
+
+2. Base64 encoded image
+Base64 encoding is a standard way to transform binary data, like images, into a text format, making it easier to transmit over the internet.
+
+The following Python code sample shows you how to encode an image in base64 format and pass it to your request payload.
+
+
+```python
+import base64
+from io import BytesIO
+from PIL import Image
+
+def encode_image(img):
+    buffered = BytesIO()
+    img.convert("RGB").save(buffered, format="JPEG")  # JPEG has no alpha channel
+    encoded_string = base64.b64encode(buffered.getvalue()).decode("utf-8")
+    return encoded_string
+
+img = Image.open("path_to_your_image.jpg")
+base64_img = encode_image(img)
+
+payload = {
+    "messages": [
+        {
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "What is this image?"},
+                {
+                    "type": "image_url",
+                    "image_url": {"url": f"data:image/jpeg;base64,{base64_img}"},
+                },
+            ],
+        }
+    ],
+    # ... other parameters, such as "model" and "temperature"
+}
+
+```
+
+### Receiving Managed Inference responses
+
+After you send the HTTP request to the public or private endpoint exposed by the server, you receive the inference response from the Managed Inference server.
+Process the output data according to your application's needs. The response contains the output generated by the vision language model based on the input provided in the request.
+
+<Message type="note">
+  Despite efforts to ensure accuracy, generated text may contain inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations). Always verify generated content independently.
+</Message>
+
+## Frequently Asked Questions
+
+#### What types of images are supported by Pixtral?
+- Bitmap (or raster) image formats, which store images as grids of individual pixels, are supported, in particular PNG, JPEG, WEBP, and non-animated GIF.
+- Vector and layered image formats (SVG, PSD) are not supported.
+
+#### Are other files supported?
+Only bitmaps can be analyzed by Pixtral. PDFs and videos are not supported.
+
+#### Is there a limit to the size of each image?
+The only limit is the context window: each 16x16 pixel patch of an image consumes 1 token.
+
+#### What is the maximum amount of images per conversation?
+A conversation can include up to 12 images per request. A 13th image returns a 413 error.
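As an illustration of the limits described in the FAQ above (this sketch is not part of the committed file), the following minimal Python example estimates the token cost of an image and builds a `content` array that respects the 12-image limit. The helper names `estimate_image_tokens` and `build_content` are hypothetical. Rounding partial patches up is an assumption, and the estimate ignores any special tokens the tokenizer may add, so treat the result as a rough guide only.

```python
import base64
import math

MAX_IMAGES_PER_REQUEST = 12  # limit stated in the FAQ
PATCH_SIZE = 16              # FAQ: 1 token for each 16x16 pixel patch


def estimate_image_tokens(width, height):
    """Rough token cost of one image (assumption: partial patches round up)."""
    return math.ceil(width / PATCH_SIZE) * math.ceil(height / PATCH_SIZE)


def build_content(prompt, images, mime="image/jpeg"):
    """Build the `content` array of one user message from raw image bytes."""
    if len(images) > MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"{len(images)} images given, at most {MAX_IMAGES_PER_REQUEST} allowed"
        )
    content = [{"type": "text", "text": prompt}]
    for raw in images:
        # Each image is embedded as a base64 data URL, as in the sample above.
        b64 = base64.b64encode(raw).decode("utf-8")
        content.append(
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}
        )
    return content


payload = {
    "model": "mistral/pixtral-12b-2409:bf16",
    "messages": [
        {"role": "user", "content": build_content("What is this image?", [b"abc"])}
    ],
}

print(estimate_image_tokens(512, 512))  # 32 * 32 patches = 1024
```

Checking the image count on the client side avoids a round trip that would end in the 413 error mentioned in the FAQ. The placeholder bytes `b"abc"` stand in for real image data read from a file.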
diff --git a/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx b/ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx
@@ -27,7 +27,9 @@ sentence-transformers/sentence-t5-xxl:fp32
 
 ## Compatible Instances
 
-- [L4 (FP32)](https://www.scaleway.com/en/l4-gpu-instance/)
+| Instance type  | Max context length |
+| ------------- |-------------|
+| L4      | 512 (FP32) | 
 
 ## Model introduction
 
diff --git a/menu/navigation.json b/menu/navigation.json
@@ -626,6 +626,10 @@
                   {
                     "label": "Sentence-t5-xxl model",
                     "slug": "sentence-t5-xxl"
+                  },
+                  {
+                    "label": "Pixtral-12b-2409 model",
+                    "slug": "pixtral-12b-2409"
                   }
                 ],
                 "label": "Additional Content",
