Commit d57888b
Authored by tgenaitay, bene2k1, and nerda-codes

feat(ifr): support pixtral (#3726)

* feat(ifr): support pixtral
* feat(ifr): supported gpus and quant
* docs(ai): add navigation
* feat(ifr): corrected
* feat(ifr): fixed per review

  Co-authored-by: nerda-codes <[email protected]>
* feat(ifr): context
* feat(ifr): changed model name
* feat(ifr): typo
* feat(ifr): revised compatible instances for consistency
* feat(ai): format file
* feat(ai): down to 12 images

---------

Co-authored-by: Benedikt Rollik <[email protected]>
Co-authored-by: nerda-codes <[email protected]>

1 parent 72bb5d5 · commit d57888b

File tree: 9 files changed, +193 −12 lines changed


ai-data/managed-inference/reference-content/llama-3-8b-instruct.mdx (7 additions, 3 deletions)

@@ -18,7 +18,7 @@ categories:
  |-----------------|------------------------------------|
  | Provider | [Meta](https://llama.meta.com/llama3/) |
  | Model Name | `llama-3-8b-instruct` |
- | Compatible Instances | L4, H100 |
+ | Compatible Instances | L4, H100 (FP8, BF16) |
  | Context size | 8192 tokens |

  ## Model names

@@ -30,8 +30,12 @@ meta/llama-3-8b-instruct:fp8

  ## Compatible Instances

- - [L4](https://www.scaleway.com/en/l4-gpu-instance/)
- - [H100](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+ | Instance type | Max context length |
+ | ------------- | ------------------ |
+ | L4 | 8192 (FP8, BF16) |
+ | H100 | 8192 (FP8, BF16) |

  ## Model introduction

ai-data/managed-inference/reference-content/llama-3.1-70b-instruct.mdx (1 addition, 1 deletion)

@@ -19,7 +19,7 @@ categories:
  | Provider | [Meta](https://llama.meta.com/llama3/) |
  | License | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/) |
  | Model Name | `llama-3.1-70b-instruct` |
- | Compatible Instances | H100, H100-2 |
+ | Compatible Instances | H100 (FP8), H100-2 (FP8, BF16) |
  | Context Length | up to 128k tokens |

  ## Model names
## Model names

ai-data/managed-inference/reference-content/llama-3.1-8b-instruct.mdx (1 addition, 1 deletion)

@@ -19,7 +19,7 @@ categories:
  | Provider | [Meta](https://llama.meta.com/llama3/) |
  | License | [Llama 3.1 community](https://llama.meta.com/llama3_1/license/) |
  | Model Name | `llama-3.1-8b-instruct` |
- | Compatible Instances | L4, H100, H100-2 |
+ | Compatible Instances | L4, H100, H100-2 (FP8, BF16) |
  | Context Length | up to 128k tokens |

  ## Model names

ai-data/managed-inference/reference-content/mistral-7b-instruct-v0.3.mdx (4 additions, 2 deletions)

@@ -27,9 +27,11 @@ categories:
  mistral-7b-instruct-v0.3:bf16
  ```

- ## Compatible Instance
+ ## Compatible Instances

- - [L4 (BF16)](https://www.scaleway.com/en/l4-gpu-instance/)
+ | Instance type | Max context length |
+ | ------------- | ------------------ |
+ | L4 | 32k (BF16) |

  ## Model introduction

ai-data/managed-inference/reference-content/mistral-nemo-instruct-2407.mdx (4 additions, 2 deletions)

@@ -27,9 +27,11 @@ categories:
  mistral-nemo-instruct-2407:fp8
  ```

- ## Compatible Instance
+ ## Compatible Instances

- - [H100 (FP8)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+ | Instance type | Max context length |
+ | ------------- | ------------------ |
+ | H100 | 128k (FP8) |

  ## Model introduction
3537

ai-data/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1.mdx (4 additions, 2 deletions)

@@ -30,8 +30,10 @@ mistral/mixtral-8x7b-instruct-v0.1:fp16

  ## Compatible Instances

- - [H100-1 (FP8)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
- - [H100-2 (FP16)](https://www.scaleway.com/en/h100-pcie-try-it-now/)
+ | Instance type | Max context length |
+ | ------------- | ------------------ |
+ | H100 | 32k (FP8) |
+ | H100-2 | 32k (FP16) |

  ## Model introduction

ai-data/managed-inference/reference-content/pixtral-12b-2409.mdx (new file, 165 additions)

---
meta:
  title: Understanding the Pixtral-12b-2409 model
  description: Deploy your own secure Pixtral-12b-2409 model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the Pixtral-12b-2409 model
  paragraph: This page provides information on the Pixtral-12b-2409 model
tags:
dates:
  validation: 2024-09-23
categories:
  - ai-data
---

## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [Mistral](https://mistral.ai/technology/#models) |
| Model Name | `pixtral-12b-2409` |
| Compatible Instances | H100, H100-2 (BF16) |
| Context size | 128k tokens |

## Model name

```bash
mistral/pixtral-12b-2409:bf16
```

## Compatible Instances

| Instance type | Max context length |
| ------------- | ------------------ |
| H100 | 128k (BF16) |
| H100-2 | 128k (BF16) |

## Model introduction

Pixtral is a vision language model introducing a novel architecture: a 12B-parameter multimodal decoder plus a 400M-parameter vision encoder.
It can analyze images and offer insights from visual content alongside text.
This multimodal functionality creates new opportunities for applications that need both visual and textual comprehension.

Pixtral is open-weight and distributed under the Apache 2.0 license.

## Why is it useful?

- Pixtral allows you to process real-world and high-resolution images, unlocking capabilities such as transcribing handwritten files or payment receipts, extracting information from graphs, or captioning images.
- It offers a large context window of up to 128k tokens, which is particularly useful for RAG applications.
- Pixtral supports variable image sizes and types: PNG (.png), JPEG (.jpeg and .jpg), WEBP (.webp), as well as non-animated GIF with only one frame (.gif).

<Message type="note">
  Pixtral 12B can understand and analyze images, not generate them. You will use it through the `/v1/chat/completions` endpoint.
</Message>

## How to use it

### Sending Inference requests

<Message type="tip">
  Unlike previous Mistral models, Pixtral can take an `image_url` in the content array.
</Message>

To perform inference tasks with your Pixtral model deployed at Scaleway, use the following command:

```bash
curl -s \
-H "Authorization: Bearer <IAM API key>" \
-H "Content-Type: application/json" \
--request POST \
--url "https://<Deployment UUID>.ifr.fr-par.scw.cloud/v1/chat/completions" \
--data '{
  "model": "mistral/pixtral-12b-2409:bf16",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image in detail please."},
        {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/32/512/512"}},
        {"type": "text", "text": "and this one as well."},
        {"type": "image_url", "image_url": {"url": "https://www.wolframcloud.com/obj/resourcesystem/images/a0e/a0ee3983-46c6-4c92-b85d-059044639928/6af8cfb971db031b.png"}}
      ]
    }
  ],
  "top_p": 1,
  "temperature": 0.7,
  "stream": false
}'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.

<Message type="tip">
  The model name allows Scaleway to put your prompts in the expected format.
</Message>

<Message type="note">
  Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
</Message>

### Passing images to Pixtral

1. Image URLs
   If the image is available online, you can just include the image URL in your request as demonstrated above. This approach is simple and does not require any encoding.

2. Base64 encoded image
   Base64 encoding is a standard way to transform binary data, like images, into a text format, making it easier to transmit over the internet.

The following Python code sample shows you how to encode an image in base64 format and pass it to your request payload.

```python
import base64
from io import BytesIO
from PIL import Image

def encode_image(img):
    buffered = BytesIO()
    img.save(buffered, format="JPEG")
    encoded_string = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return encoded_string

img = Image.open("path_to_your_image.jpg")
base64_img = encode_image(img)

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_img}"},
                },
            ],
        }
    ],
    ...  # other parameters
}
```
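As a complement to the snippet above, the payload assembly can be made fully self-contained and testable without PIL or a network call. The `build_vision_payload` helper below is our illustrative name, not part of the Scaleway documentation; it only shows the message shape the endpoint expects:

```python
import base64

def build_vision_payload(model, prompt, image_bytes, mime="image/jpeg"):
    """Assemble a chat-completions payload embedding raw image bytes as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:{mime};base64,{b64}"}},
                ],
            }
        ],
    }

# Illustrative call with placeholder bytes; in practice, read them from a file.
payload = build_vision_payload(
    "mistral/pixtral-12b-2409:bf16",
    "What is this image?",
    b"\x89PNG\r\n\x1a\n",
    mime="image/png",
)
```

The resulting dictionary can be sent as the JSON body of the `POST /v1/chat/completions` request shown earlier.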
### Receiving Managed Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the visual language model based on the input provided in the request.
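The endpoint follows the OpenAI-compatible chat-completions shape, so extracting the model's answer is a matter of indexing into the JSON body. The response values below are illustrative, not real model output:

```python
import json

# Illustrative response body in the OpenAI-compatible chat-completions shape.
raw = '''{
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "The image shows a lighthouse."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 1100, "completion_tokens": 9, "total_tokens": 1109}
}'''

data = json.loads(raw)
answer = data["choices"][0]["message"]["content"]
print(answer)  # The image shows a lighthouse.
```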
<Message type="note">
  Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>

## Frequently Asked Questions

#### What types of images are supported by Pixtral?
- Bitmap (or raster) image formats, which store images as grids of individual pixels, are supported: PNG, JPEG, WEBP, and non-animated GIFs in particular.
- Vector image formats (SVG, PSD) are not supported.

#### Are other files supported?
Only bitmaps can be analyzed by Pixtral; PDFs and videos are not supported.

#### Is there a limit to the size of each image?
The only limitation is the context window (1 token for each 16x16-pixel block).
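Given the 1-token-per-16x16-pixel rule above, you can estimate how much of the context window an image will consume. The helper name below is ours, for illustration only:

```python
import math

def image_token_estimate(width, height, patch=16):
    """Rough token cost of an image at ~1 token per 16x16-pixel patch."""
    return math.ceil(width / patch) * math.ceil(height / patch)

print(image_token_estimate(512, 512))    # 1024 tokens
print(image_token_estimate(1024, 1024))  # 4096 tokens
```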
#### What is the maximum number of images per conversation?
One conversation can handle up to 12 images per request. The 13th will return a 413 error.
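If you need to process more than 12 images, one option is to split them across several requests. This batching helper is a sketch of ours, not part of the documented API:

```python
MAX_IMAGES_PER_REQUEST = 12  # limit stated above; a 13th image returns HTTP 413

def split_image_batches(urls, limit=MAX_IMAGES_PER_REQUEST):
    """Split a list of image URLs into request-sized batches."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]

batches = split_image_batches([f"img{i}.png" for i in range(30)])
print([len(b) for b in batches])  # [12, 12, 6]
```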

ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx (3 additions, 1 deletion)

@@ -27,7 +27,9 @@ sentence-transformers/sentence-t5-xxl:fp32

  ## Compatible Instances

- - [L4 (FP32)](https://www.scaleway.com/en/l4-gpu-instance/)
+ | Instance type | Max context length |
+ | ------------- | ------------------ |
+ | L4 | 512 (FP32) |

  ## Model introduction

menu/navigation.json (4 additions, 0 deletions)

@@ -626,6 +626,10 @@
  {
    "label": "Sentence-t5-xxl model",
    "slug": "sentence-t5-xxl"
+ },
+ {
+   "label": "Pixtral-12b-2409 model",
+   "slug": "pixtral-12b-2409"
  }
  ],
  "label": "Additional Content",
