Commit c2304f3

feat(minfr): add chat models

1 parent 57a0ac2 · commit c2304f3

1 file changed: pages/managed-inference/reference-content/supported-models.mdx (+167 additions, −28 deletions)
@@ -10,35 +10,51 @@ dates:
  validation: 2025-04-08
  posted: 2025-04-08
categories:
  - ai-data
---

Scaleway Managed Inference allows you to deploy various AI models, either from:

- [Scaleway catalog](#scaleway-catalog): A curated set of ready-to-deploy models available through the [Scaleway console](https://console.scaleway.com/inference/deployments/) or the [Managed Inference models API](https://www.scaleway.com/en/developers/api/inference/#path-models-list-models)
- [Custom models](#custom-models): Models that you import, typically from sources like Hugging Face.

## Scaleway catalog

### Multimodal models (chat + vision)

### Chat models

| Provider | Model identifier | Documentation | License |
|----------|------------------|---------------|---------|
| Allen AI | `molmo-72b-0924` | [View Details](/managed-inference/reference-content/molmo-72b-0924/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Deepseek | `deepseek-r1-distill-llama-70b` | [View Details](/managed-inference/reference-content/deepseek-r1-distill-llama-70b/) | [MIT license](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
| Deepseek | `deepseek-r1-distill-llama-8b` | [View Details](/managed-inference/reference-content/deepseek-r1-distill-llama-8b/) | [MIT license](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
| Meta | `llama-3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3-70b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
| Meta | `llama-3-8b-instruct` | [View Details](/managed-inference/reference-content/llama-3-8b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
| Meta | `llama-3.1-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-70b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
| Meta | `llama-3.1-8b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-8b-instruct/) | [Llama 3.1 license](https://www.llama.com/llama3_1/license/) |
| Meta | `llama-3.3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.3-70b-instruct/) | [Llama 3.3 license](https://www.llama.com/llama3_3/license/) |
| Nvidia | `llama-3.1-nemotron-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-nemotron-70b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
| Mistral | `mixtral-8x7b-instruct-v0.1` | [View Details](/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Mistral | `mistral-7b-instruct-v0.3` | [View Details](/managed-inference/reference-content/mistral-7b-instruct-v0.3/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Mistral | `mistral-nemo-instruct-2407` | [View Details](/managed-inference/reference-content/mistral-nemo-instruct-2407/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Mistral | `mistral-small-24b-instruct-2501` | [View Details](/managed-inference/reference-content/mistral-small-24b-instruct-2501/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Mistral | `pixtral-12b-2409` | [View Details](/managed-inference/reference-content/pixtral-12b-2409/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Qwen | `qwen2.5-coder-32b-instruct` | [View Details](/managed-inference/reference-content/qwen2.5-coder-32b-instruct/) | [Apache 2.0 license](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
### Vision models

_More details to be added._

### Embedding models

| Provider | Model identifier | Documentation | License |
|----------|------------------|---------------|---------|
| BAAI | `bge-multilingual-gemma2` | [View Details](/managed-inference/reference-content/bge-multilingual-gemma2/) | [Gemma Terms of Use](https://ai.google.dev/gemma/terms) |
| Sentence Transformers | `sentence-t5-xxl` | [View Details](/managed-inference/reference-content/sentence-t5-xxl/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |

## Custom models

<Message type="note">
Custom model support is currently in **beta**. If you encounter issues or limitations, please report them via our [Slack community channel](https://scaleway-community.slack.com/archives/C01SGLGRLEA) or [customer support](https://console.scaleway.com/support/tickets/create?for=product&productName=inference).
@@ -56,30 +72,30 @@ To deploy a custom model via Hugging Face, ensure the following:

#### Access requirements

- You must have access to the model using your Hugging Face credentials.
- For gated models, request access through your Hugging Face account.
- Credentials are not stored, but we recommend using [read or fine-grained access tokens](https://huggingface.co/docs/hub/security-tokens).
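One way to confirm access before starting a deployment is to query the Hugging Face Hub with your token. The sketch below uses the official `huggingface_hub` client; the repo ID and token in the usage comment are placeholders, and the `api` parameter is our addition to keep the helper testable:

```python
def can_access(repo_id: str, token: str, api=None) -> bool:
    """Return True if `token` grants read access to `repo_id` on the Hugging Face Hub.

    Gated or private repos you cannot read (or a bad token) make model_info raise.
    `api` is injectable for testing; by default the official huggingface_hub client is used.
    """
    if api is None:
        from huggingface_hub import HfApi  # official Hugging Face Hub client
        api = HfApi(token=token)
    try:
        api.model_info(repo_id)
        return True
    except Exception:  # GatedRepoError, RepositoryNotFoundError, auth errors, ...
        return False

# Example (placeholders): can_access("meta-llama/Meta-Llama-3-8B-Instruct", "hf_xxx")
```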
#### Required files

Your model repository must include:

- A `config.json` file containing:
  - An `architectures` array (see [supported architectures](#supported-models-architecture) for the exact list of supported values).
  - `max_position_embeddings`
- Model weights in the [`.safetensors`](https://huggingface.co/docs/safetensors/index) format
- A chat template included in either:
  - `tokenizer_config.json` as a `chat_template` field, or
  - `chat_template.json` as a `chat_template` field
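Before starting an import, the file requirements above can be checked on a local copy of the repository with a short script. This is a minimal sketch: the checks mirror the list above, and the function name is ours:

```python
import json
from pathlib import Path

def check_model_repo(repo: Path) -> list[str]:
    """Return a list of problems that would block a Managed Inference import."""
    problems = []

    # config.json must exist and declare architectures + max_position_embeddings
    config_path = repo / "config.json"
    if not config_path.is_file():
        problems.append("missing config.json")
    else:
        config = json.loads(config_path.read_text())
        if not config.get("architectures"):
            problems.append("config.json has no architectures array")
        if "max_position_embeddings" not in config:
            problems.append("config.json has no max_position_embeddings")

    # Weights must be in the .safetensors format
    if not list(repo.glob("*.safetensors")):
        problems.append("no .safetensors weight files found")

    # A chat_template field must appear in one of these two files
    has_template = False
    for name in ("tokenizer_config.json", "chat_template.json"):
        path = repo / name
        if path.is_file() and "chat_template" in json.loads(path.read_text()):
            has_template = True
    if not has_template:
        problems.append("no chat_template field found")

    return problems
```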
#### Supported model types

Your model must be one of the following types:

- `chat`
- `vision`
- `multimodal` (chat + vision)
- `embedding`

<Message type="important">
**Security Notice**<br />
@@ -88,16 +104,16 @@ Your model must be one of the following types:

## API support

Depending on the model type, specific endpoints and features will be supported.

### Chat models

The Chat API will be exposed for this model under the `/v1/chat/completions` endpoint.
**Structured outputs** or **Function calling** are not yet supported for custom models.
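A minimal sketch of a request to that endpoint is shown below. The deployment base URL and API key are placeholders, and the message format follows the OpenAI-style chat schema commonly used with `/v1/chat/completions`; only the endpoint path itself comes from this page:

```python
import json
from urllib import request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> request.Request:
    """Build a POST request for a deployment's /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Usage (placeholders for the deployment endpoint and secret key):
# req = build_chat_request("https://<your-deployment-endpoint>", "<SCW_SECRET_KEY>",
#                          "llama-3.1-8b-instruct", "Hello!")
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```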

### Vision models

The Chat API will be exposed for this model under the `/v1/chat/completions` endpoint.
**Structured outputs** or **Function calling** are not yet supported for custom models.

### Multimodal models
@@ -123,6 +139,129 @@ When deploying custom models, **you remain responsible** for complying with any

Custom models must conform to one of the architectures listed below. Click to expand the full list.

<Concept>
## Supported custom model architectures
Custom model deployment currently supports the following model architectures:
- `AquilaModel`
- `AquilaForCausalLM`
- `ArcticForCausalLM`
- `BaiChuanForCausalLM`
- `BaichuanForCausalLM`
- `BloomForCausalLM`
- `CohereForCausalLM`
- `Cohere2ForCausalLM`
- `DbrxForCausalLM`
- `DeciLMForCausalLM`
- `DeepseekForCausalLM`
- `DeepseekV2ForCausalLM`
- `DeepseekV3ForCausalLM`
- `ExaoneForCausalLM`
- `FalconForCausalLM`
- `Fairseq2LlamaForCausalLM`
- `GemmaForCausalLM`
- `Gemma2ForCausalLM`
- `GlmForCausalLM`
- `GPT2LMHeadModel`
- `GPTBigCodeForCausalLM`
- `GPTJForCausalLM`
- `GPTNeoXForCausalLM`
- `GraniteForCausalLM`
- `GraniteMoeForCausalLM`
- `GritLM`
- `InternLMForCausalLM`
- `InternLM2ForCausalLM`
- `InternLM2VEForCausalLM`
- `InternLM3ForCausalLM`
- `JAISLMHeadModel`
- `JambaForCausalLM`
- `LlamaForCausalLM`
- `LLaMAForCausalLM`
- `MambaForCausalLM`
- `FalconMambaForCausalLM`
- `MiniCPMForCausalLM`
- `MiniCPM3ForCausalLM`
- `MistralForCausalLM`
- `MixtralForCausalLM`
- `QuantMixtralForCausalLM`
- `MptForCausalLM`
- `MPTForCausalLM`
- `NemotronForCausalLM`
- `OlmoForCausalLM`
- `Olmo2ForCausalLM`
- `OlmoeForCausalLM`
- `OPTForCausalLM`
- `OrionForCausalLM`
- `PersimmonForCausalLM`
- `PhiForCausalLM`
- `Phi3ForCausalLM`
- `Phi3SmallForCausalLM`
- `PhiMoEForCausalLM`
- `Qwen2ForCausalLM`
- `Qwen2MoeForCausalLM`
- `RWForCausalLM`
- `StableLMEpochForCausalLM`
- `StableLmForCausalLM`
- `Starcoder2ForCausalLM`
- `SolarForCausalLM`
- `TeleChat2ForCausalLM`
- `XverseForCausalLM`
- `BartModel`
- `BartForConditionalGeneration`
- `Florence2ForConditionalGeneration`
- `BertModel`
- `RobertaModel`
- `RobertaForMaskedLM`
- `XLMRobertaModel`
- `DeciLMForCausalLM`
- `Gemma2Model`
- `GlmForCausalLM`
- `GritLM`
- `InternLM2ForRewardModel`
- `JambaForSequenceClassification`
- `LlamaModel`
- `MistralModel`
- `Phi3ForCausalLM`
- `Qwen2Model`
- `Qwen2ForCausalLM`
- `Qwen2ForRewardModel`
- `Qwen2ForProcessRewardModel`
- `TeleChat2ForCausalLM`
- `LlavaNextForConditionalGeneration`
- `Phi3VForCausalLM`
- `Qwen2VLForConditionalGeneration`
- `Qwen2ForSequenceClassification`
- `BertForSequenceClassification`
- `RobertaForSequenceClassification`
- `XLMRobertaForSequenceClassification`
- `AriaForConditionalGeneration`
- `Blip2ForConditionalGeneration`
- `ChameleonForConditionalGeneration`
- `ChatGLMModel`
- `ChatGLMForConditionalGeneration`
- `DeepseekVLV2ForCausalLM`
- `FuyuForCausalLM`
- `H2OVLChatModel`
- `InternVLChatModel`
- `Idefics3ForConditionalGeneration`
- `LlavaForConditionalGeneration`
- `LlavaNextForConditionalGeneration`
- `LlavaNextVideoForConditionalGeneration`
- `LlavaOnevisionForConditionalGeneration`
- `MantisForConditionalGeneration`
- `MiniCPMO`
- `MiniCPMV`
- `MolmoForCausalLM`
- `NVLM_D`
- `PaliGemmaForConditionalGeneration`
- `Phi3VForCausalLM`
- `PixtralForConditionalGeneration`
- `QWenLMHeadModel`
- `Qwen2VLForConditionalGeneration`
- `Qwen2_5_VLForConditionalGeneration`
- `Qwen2AudioForConditionalGeneration`
- `UltravoxModel`
- `MllamaForConditionalGeneration`
- `WhisperForConditionalGeneration`
- `EAGLEModel`
- `MedusaModel`
- `MLPSpeculatorPreTrainedModel`
</Concept>
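A quick way to use this list is to check a model's `config.json` against it before attempting an import. This is a sketch; `SUPPORTED` is abbreviated here to a few entries, and in practice it would hold the full set from the section above:

```python
import json

# Abbreviated subset of the supported architectures listed above.
SUPPORTED = {
    "LlamaForCausalLM",
    "MistralForCausalLM",
    "MixtralForCausalLM",
    "Qwen2ForCausalLM",
    "Gemma2ForCausalLM",
    # ... remaining entries from the full list above
}

def is_supported(config_text: str) -> bool:
    """True if any architecture declared in config.json is in the supported set."""
    architectures = json.loads(config_text).get("architectures", [])
    return any(a in SUPPORTED for a in architectures)
```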
