Scaleway Managed Inference allows you to deploy AI models from two sources:
* [Scaleway catalog](#scaleway-catalog): A curated set of ready-to-deploy models available through the [Scaleway console](https://console.scaleway.com/inference/deployments/) or the [Managed Inference models API](https://www.scaleway.com/en/developers/api/inference/#path-models-list-models)
* [Custom models](#custom-models): Models that you import, typically from sources like Hugging Face.
## Scaleway catalog
### Multimodal models (chat + vision)
### Chat models
| Provider | Model identifier | Documentation | License |
|----------|------------------|---------------|---------|
| Allen AI | `molmo-72b-0924` | [View Details](/managed-inference/reference-content/molmo-72b-0924/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
| Meta | `llama-3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3-70b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
| Meta | `llama-3-8b-instruct` | [View Details](/managed-inference/reference-content/llama-3-8b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
| Meta | `llama-3.1-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-70b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
| Meta | `llama-3.1-8b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-8b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
| Meta | `llama-3.3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.3-70b-instruct/) | [Llama 3.3 license](https://www.llama.com/llama3_3/license/) |
| Nvidia | `llama-3.1-nemotron-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-nemotron-70b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
## Custom models

Custom model support is currently in **beta**. If you encounter issues or limitations, please report them via our [Slack community channel](https://scaleway-community.slack.com/archives/C01SGLGRLEA) or [customer support](https://console.scaleway.com/support/tickets/create?for=product&productName=inference).
To deploy a custom model via Hugging Face, ensure the following:
#### Access requirements
* You must have access to the model using your Hugging Face credentials.
* For gated models, request access through your Hugging Face account.
* Credentials are not stored, but we recommend using [read or fine-grained access tokens](https://huggingface.co/docs/hub/security-tokens).
#### Required files
Your model repository must include:
* A `config.json` file containing:
  * An `architectures` array (see [supported architectures](#supported-models-architecture) for the exact list of supported values)
  * `max_position_embeddings`
* Model weights in the [`.safetensors`](https://huggingface.co/docs/safetensors/index) format
* A chat template included in either:
  * `tokenizer_config.json` as a `chat_template` field, or
  * `chat_template.json` as a `chat_template` field
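The checks above can be sketched as a small local validation script. This is an illustrative helper, not official Scaleway tooling; the function name and error messages are made up for the example:

```python
# Sketch: validate a local model directory against the required-files list
# above before uploading it. Illustrative only, not part of any official CLI.
import json
from pathlib import Path


def validate_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the layout looks OK."""
    root = Path(model_dir)
    problems = []

    # config.json must exist and declare architectures + max_position_embeddings.
    config_path = root / "config.json"
    if not config_path.is_file():
        problems.append("missing config.json")
    else:
        config = json.loads(config_path.read_text())
        if not config.get("architectures"):
            problems.append("config.json has no architectures array")
        if "max_position_embeddings" not in config:
            problems.append("config.json has no max_position_embeddings")

    # Weights must be in .safetensors format.
    if not list(root.glob("*.safetensors")):
        problems.append("no .safetensors weight files found")

    # The chat template may live in tokenizer_config.json or chat_template.json.
    has_template = False
    for name in ("tokenizer_config.json", "chat_template.json"):
        path = root / name
        if path.is_file() and json.loads(path.read_text()).get("chat_template"):
            has_template = True
    if not has_template:
        problems.append("no chat_template found")

    return problems
```

Running this against your model directory before creating a deployment can save a failed import round-trip.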
#### Supported model types
Your model must be one of the following types:
* `chat`
* `vision`
* `multimodal` (chat + vision)
* `embedding`
<Message type="important">
**Security Notice**<br />
</Message>
## API support
Depending on the model type, specific endpoints and features will be supported.
### Chat models
The Chat API will be exposed for this model under the `/v1/chat/completions` endpoint.
**Structured outputs** and **function calling** are not yet supported for custom models.
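A request to a deployment's chat endpoint can be sketched as follows. The deployment URL, model name, and environment variable names are placeholders for your own values, and the request is only sent when a real endpoint is configured:

```python
# Sketch: call a deployment's Chat API. <your-deployment-endpoint>,
# the model name, and the env var names are placeholders.
import json
import os
import urllib.request

DEPLOYMENT_URL = os.environ.get("DEPLOYMENT_URL", "https://<your-deployment-endpoint>")

payload = {
    "model": "my-custom-model",  # hypothetical deployed model name
    "messages": [
        {"role": "user", "content": "Summarize what Managed Inference does."}
    ],
    "max_tokens": 128,
}

request = urllib.request.Request(
    f"{DEPLOYMENT_URL}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ.get('SCW_SECRET_KEY', '<api-key>')}",
        "Content-Type": "application/json",
    },
)

# Only send the request when a real deployment URL is configured.
if "DEPLOYMENT_URL" in os.environ:
    with urllib.request.urlopen(request) as response:
        print(json.load(response)["choices"][0]["message"]["content"])
```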
### Vision models
The Chat API will be exposed for this model under the `/v1/chat/completions` endpoint.
**Structured outputs** and **function calling** are not yet supported for custom models.
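For vision-capable models, the difference from plain chat is the shape of the message `content`: instead of a string, it is a list mixing text and image parts, following the common OpenAI-style convention. The model name and image URL below are placeholders:

```python
# Sketch: an OpenAI-style vision message. The image URL and model
# name are placeholders for your own values.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this image?"},
        # Image parts reference a URL the deployment can fetch.
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}

payload = {"model": "my-vision-model", "messages": [message], "max_tokens": 128}
```

The payload is then POSTed to `/v1/chat/completions` exactly as for a text-only chat request.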
### Multimodal models
When deploying custom models, **you remain responsible** for complying with any applicable model license terms.
### Supported models architecture

Custom models must conform to one of the architectures listed below. Click to expand the full list.