diff --git a/menu/navigation.json b/menu/navigation.json
index e4e2247785..e5d1696197 100644
--- a/menu/navigation.json
+++ b/menu/navigation.json
@@ -860,6 +860,10 @@
"label": "OpenAI API compatibility",
"slug": "openai-compatibility"
},
+ {
+ "label": "Supported models in Managed Inference",
+ "slug": "supported-models"
+ },
{
"label": "Support for function calling in Scaleway Managed Inference",
"slug": "function-calling-support"
diff --git a/pages/managed-inference/how-to/create-deployment.mdx b/pages/managed-inference/how-to/create-deployment.mdx
index 12a15a8b57..1b43cd5ee8 100644
--- a/pages/managed-inference/how-to/create-deployment.mdx
+++ b/pages/managed-inference/how-to/create-deployment.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page explains how to deploy a model on Scaleway Managed Inference
tags: managed-inference ai-data creating dedicated
dates:
- validation: 2025-04-01
+ validation: 2025-04-09
posted: 2024-03-06
---
@@ -19,7 +19,10 @@ dates:
1. Click the **AI & Data** section of the [Scaleway console](https://console.scaleway.com/), and select **Managed Inference** from the side menu to access the Managed Inference dashboard.
2. Click **Deploy a model** to launch the model deployment wizard.
3. Provide the necessary information:
- - Select the desired model and quantization to use for your deployment [from the available options](/managed-inference/reference-content/)
+ - Select the desired model and quantization to use for your deployment [from the available options](/managed-inference/reference-content/).
+
+ Scaleway Managed Inference allows you to deploy various AI models, either from the Scaleway catalog or by importing a custom model. For detailed information about supported models, visit our [Supported models in Managed Inference](/managed-inference/reference-content/supported-models/) documentation.
+
Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
diff --git a/pages/managed-inference/quickstart.mdx b/pages/managed-inference/quickstart.mdx
index 697316fb36..48eb89b0b3 100644
--- a/pages/managed-inference/quickstart.mdx
+++ b/pages/managed-inference/quickstart.mdx
@@ -38,7 +38,10 @@ Here are some of the key features of Scaleway Managed Inference:
1. Navigate to the **AI & Data** section of the [Scaleway console](https://console.scaleway.com/), and select **Managed Inference** from the side menu to access the Managed Inference dashboard.
2. Click **Create deployment** to launch the deployment creation wizard.
3. Provide the necessary information:
- - Select the desired model and the quantization to use for your deployment [from the available options](/managed-inference/reference-content/)
+ - Select the desired model and the quantization to use for your deployment [from the available options](/managed-inference/reference-content/).
+
+ Scaleway Managed Inference allows you to deploy various AI models, either from the Scaleway catalog or by importing a custom model. For detailed information about supported models, visit our [Supported models in Managed Inference](/managed-inference/reference-content/supported-models/) documentation.
+
Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
diff --git a/pages/managed-inference/reference-content/supported-models.mdx b/pages/managed-inference/reference-content/supported-models.mdx
new file mode 100644
index 0000000000..845be58327
--- /dev/null
+++ b/pages/managed-inference/reference-content/supported-models.mdx
@@ -0,0 +1,269 @@
+---
+meta:
+ title: Supported models in Managed Inference
+ description: Explore all AI models supported by Managed Inference
+content:
+ h1: Supported models in Managed Inference
+  paragraph: Discover which AI models you can deploy using Managed Inference, either from the Scaleway catalog or as custom models.
+tags: support models custom catalog
+dates:
+ validation: 2025-04-08
+ posted: 2025-04-08
+categories:
+ - ai-data
+---
+
+Scaleway Managed Inference allows you to deploy various AI models, either from:
+
+ * [Scaleway catalog](#scaleway-catalog): A curated set of ready-to-deploy models available through the [Scaleway console](https://console.scaleway.com/inference/deployments/) or the [Managed Inference models API](https://www.scaleway.com/en/developers/api/inference/#path-models-list-models).
+ * [Custom models](#custom-models): Models that you import, typically from sources like Hugging Face.
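+
+For example, you can list the catalog programmatically. The sketch below is illustrative only: it assumes the regional endpoint path shown in the linked API reference and reads your IAM secret key from a `SCW_SECRET_KEY` environment variable; adjust both to your setup.
+
+```python
+# Minimal sketch: list catalog models through the Managed Inference models API.
+# The endpoint path and response shape are assumptions based on the public API
+# reference; verify them against the API version you use.
+import os
+
+import requests
+
+REGION = "fr-par"
+url = f"https://api.scaleway.com/inference/v1/regions/{REGION}/models"
+
+response = requests.get(url, headers={"X-Auth-Token": os.environ["SCW_SECRET_KEY"]})
+response.raise_for_status()
+
+for model in response.json().get("models", []):
+    print(model["name"])
+```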
+
+## Scaleway catalog
+
+### Multimodal models (chat + vision)
+
+_More details to be added._
+
+### Chat models
+
+| Provider | Model identifier | Documentation | License |
+|------------|-----------------------------------|--------------------------------------------------------------------------|-------------------------------------------------------|
+| Allen AI | `molmo-72b-0924` | [View Details](/managed-inference/reference-content/molmo-72b-0924/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Deepseek | `deepseek-r1-distill-llama-70b` | [View Details](/managed-inference/reference-content/deepseek-r1-distill-llama-70b/) | [MIT license](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
+| Deepseek | `deepseek-r1-distill-llama-8b` | [View Details](/managed-inference/reference-content/deepseek-r1-distill-llama-8b/) | [MIT license](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
+| Meta | `llama-3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3-70b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
+| Meta | `llama-3-8b-instruct` | [View Details](/managed-inference/reference-content/llama-3-8b-instruct/) | [Llama 3 license](https://www.llama.com/llama3/license/) |
+| Meta | `llama-3.1-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-70b-instruct/) | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
+| Meta       | `llama-3.1-8b-instruct`           | [View Details](/managed-inference/reference-content/llama-3.1-8b-instruct/)          | [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
+| Meta | `llama-3.3-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.3-70b-instruct/) | [Llama 3.3 license](https://www.llama.com/llama3_3/license/) |
+| Nvidia | `llama-3.1-nemotron-70b-instruct` | [View Details](/managed-inference/reference-content/llama-3.1-nemotron-70b-instruct/)| [Llama 3.1 community license](https://www.llama.com/llama3_1/license/) |
+| Mistral | `mixtral-8x7b-instruct-v0.1` | [View Details](/managed-inference/reference-content/mixtral-8x7b-instruct-v0.1/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Mistral | `mistral-7b-instruct-v0.3` | [View Details](/managed-inference/reference-content/mistral-7b-instruct-v0.3/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Mistral | `mistral-nemo-instruct-2407` | [View Details](/managed-inference/reference-content/mistral-nemo-instruct-2407/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Mistral | `mistral-small-24b-instruct-2501` | [View Details](/managed-inference/reference-content/mistral-small-24b-instruct-2501/)| [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Mistral | `pixtral-12b-2409` | [View Details](/managed-inference/reference-content/pixtral-12b-2409/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+| Qwen | `qwen2.5-coder-32b-instruct` | [View Details](/managed-inference/reference-content/qwen2.5-coder-32b-instruct/) | [Apache 2.0 license](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
+
+### Vision models
+
+_More details to be added._
+
+### Embedding models
+
+| Provider | Model identifier | Documentation | License |
+|----------|------------------|----------------|---------|
+| BAAI | `bge-multilingual-gemma2` | [View Details](/managed-inference/reference-content/bge-multilingual-gemma2/) | [Gemma Terms of Use](https://ai.google.dev/gemma/terms) |
+| Sentence Transformers | `sentence-t5-xxl` | [View Details](/managed-inference/reference-content/sentence-t5-xxl/) | [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0) |
+
+
+## Custom models
+
+Custom model support is currently in **beta**. If you encounter issues or limitations, please report them via our [Slack community channel](https://scaleway-community.slack.com/archives/C01SGLGRLEA) or [customer support](https://console.scaleway.com/support/tickets/create?for=product&productName=inference).
+
+### Prerequisites
+
+We recommend starting with a variation of a supported model from the Scaleway catalog. For example, you can deploy a [quantized (4-bit) version of Llama 3.3](https://huggingface.co/unsloth/Llama-3.3-70B-Instruct-bnb-4bit). If deploying a fine-tuned version of Llama 3.3, make sure your file structure matches the example linked above.
+
+To deploy a custom model via Hugging Face, ensure the following:
+
+#### Access requirements
+
+ * You must have access to the model using your Hugging Face credentials.
+ * For gated models, request access through your Hugging Face account.
+ * Credentials are not stored, but we recommend using [read or fine-grained access tokens](https://huggingface.co/docs/hub/security-tokens).
+
+#### Required files
+
+Your model repository must include the following (a quick verification sketch follows this list):
+
+ * A `config.json` file containing:
+   * An `architectures` array (see [Supported model architectures](#supported-model-architectures) for the exact list of supported values).
+   * `max_position_embeddings`
+ * Model weights in the [`.safetensors`](https://huggingface.co/docs/safetensors/index) format
+ * A chat template included in either:
+   * `tokenizer_config.json` as a `chat_template` field, or
+   * `chat_template.json` as a `chat_template` field
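+
+The sketch below (an illustrative helper, not an official tool) uses the `huggingface_hub` library to check that a repository exposes these files. The repository name is the quantized Llama 3.3 example mentioned above; gated or private models additionally need an access token.
+
+```python
+# Sketch: pre-flight check that a Hugging Face repo has the expected files.
+import json
+import os
+
+from huggingface_hub import hf_hub_download, list_repo_files
+
+REPO_ID = "unsloth/Llama-3.3-70B-Instruct-bnb-4bit"  # example repository
+TOKEN = os.environ.get("HF_TOKEN")  # required for gated or private models
+
+files = list_repo_files(REPO_ID, token=TOKEN)
+assert any(f.endswith(".safetensors") for f in files), "missing .safetensors weights"
+
+with open(hf_hub_download(REPO_ID, "config.json", token=TOKEN)) as f:
+    config = json.load(f)
+assert "architectures" in config, "config.json must declare an architectures array"
+assert "max_position_embeddings" in config, "config.json must set max_position_embeddings"
+
+# The chat template may live in tokenizer_config.json or chat_template.json.
+has_template = False
+for name in ("tokenizer_config.json", "chat_template.json"):
+    if name in files:
+        with open(hf_hub_download(REPO_ID, name, token=TOKEN)) as f:
+            has_template |= "chat_template" in json.load(f)
+print("chat template found:", has_template)
+```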
+
+#### Supported model types
+
+Your model must be one of the following types:
+
+ * `chat`
+ * `vision`
+ * `multimodal` (chat + vision)
+ * `embedding`
+
+**Security notice:** Models using formats that allow arbitrary code execution, such as Python [`pickle`](https://docs.python.org/3/library/pickle.html), are **not supported**.
+
+## API support
+
+The endpoints and features exposed by a deployment depend on its model type.
+
+### Chat models
+
+The Chat API is exposed for these models at the `/v1/chat/completions` endpoint.
+**Structured outputs** and **function calling** are not yet supported for custom models.
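+
+Once a chat deployment is running, you can query it with any OpenAI-compatible client. In this sketch, the base URL and model name are placeholders for your own deployment's endpoint and model.
+
+```python
+# Sketch: query a deployed chat model through its OpenAI-compatible endpoint.
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="https://<your-deployment-endpoint>/v1",  # placeholder endpoint
+    api_key="<your-iam-api-key>",                      # placeholder IAM API key
+)
+
+completion = client.chat.completions.create(
+    model="<your-model-name>",
+    messages=[{"role": "user", "content": "Hello!"}],
+)
+print(completion.choices[0].message.content)
+```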
+
+### Vision models
+
+The Chat API is exposed for these models at the `/v1/chat/completions` endpoint.
+**Structured outputs** and **function calling** are not yet supported for custom models.
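+
+Vision requests use the same endpoint, with image content parts in the message payload (standard OpenAI-compatible format). The URL and model name below are placeholders.
+
+```python
+# Sketch: send an image to a deployed vision model via the Chat API.
+from openai import OpenAI
+
+client = OpenAI(base_url="https://<your-deployment-endpoint>/v1", api_key="<your-iam-api-key>")
+
+completion = client.chat.completions.create(
+    model="<your-model-name>",
+    messages=[{
+        "role": "user",
+        "content": [
+            {"type": "text", "text": "Describe this image."},
+            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
+        ],
+    }],
+)
+print(completion.choices[0].message.content)
+```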
+
+### Multimodal models
+
+These models are treated as both Chat and Vision models, with the same endpoint and limitations described above.
+
+### Embedding models
+
+The Embeddings API is exposed for these models at the `/v1/embeddings` endpoint.
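+
+As with chat models, any OpenAI-compatible client can call this endpoint; the values below are placeholders for your own deployment.
+
+```python
+# Sketch: request embeddings from a deployed embedding model.
+from openai import OpenAI
+
+client = OpenAI(base_url="https://<your-deployment-endpoint>/v1", api_key="<your-iam-api-key>")
+
+result = client.embeddings.create(model="<your-model-name>", input="Hello, world!")
+print(len(result.data[0].embedding))  # dimensionality of the returned vector
+```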
+
+
+## Custom model lifecycle
+
+Custom model deployments are currently expected to remain valid in the long term, and we will ensure that updates or changes to Managed Inference do not impact existing deployments.
+In the case of breaking changes that would leave some custom models unsupported, we will notify you **at least 3 months beforehand**.
+
+## Licensing
+
+When deploying custom models, **you remain responsible** for complying with any license requirements from the model provider, just as you would when running the model on a GPU you provision yourself.
+
+## Supported model architectures
+
+Custom models must conform to one of the architectures listed below:
+
+ * `AquilaModel`
+ * `AquilaForCausalLM`
+ * `ArcticForCausalLM`
+ * `BaiChuanForCausalLM`
+ * `BaichuanForCausalLM`
+ * `BloomForCausalLM`
+ * `CohereForCausalLM`
+ * `Cohere2ForCausalLM`
+ * `DbrxForCausalLM`
+ * `DeciLMForCausalLM`
+ * `DeepseekForCausalLM`
+ * `DeepseekV2ForCausalLM`
+ * `DeepseekV3ForCausalLM`
+ * `ExaoneForCausalLM`
+ * `FalconForCausalLM`
+ * `Fairseq2LlamaForCausalLM`
+ * `GemmaForCausalLM`
+ * `Gemma2ForCausalLM`
+ * `GlmForCausalLM`
+ * `GPT2LMHeadModel`
+ * `GPTBigCodeForCausalLM`
+ * `GPTJForCausalLM`
+ * `GPTNeoXForCausalLM`
+ * `GraniteForCausalLM`
+ * `GraniteMoeForCausalLM`
+ * `GritLM`
+ * `InternLMForCausalLM`
+ * `InternLM2ForCausalLM`
+ * `InternLM2VEForCausalLM`
+ * `InternLM3ForCausalLM`
+ * `JAISLMHeadModel`
+ * `JambaForCausalLM`
+ * `LlamaForCausalLM`
+ * `LLaMAForCausalLM`
+ * `MambaForCausalLM`
+ * `FalconMambaForCausalLM`
+ * `MiniCPMForCausalLM`
+ * `MiniCPM3ForCausalLM`
+ * `MistralForCausalLM`
+ * `MixtralForCausalLM`
+ * `QuantMixtralForCausalLM`
+ * `MptForCausalLM`
+ * `MPTForCausalLM`
+ * `NemotronForCausalLM`
+ * `OlmoForCausalLM`
+ * `Olmo2ForCausalLM`
+ * `OlmoeForCausalLM`
+ * `OPTForCausalLM`
+ * `OrionForCausalLM`
+ * `PersimmonForCausalLM`
+ * `PhiForCausalLM`
+ * `Phi3ForCausalLM`
+ * `Phi3SmallForCausalLM`
+ * `PhiMoEForCausalLM`
+ * `Qwen2ForCausalLM`
+ * `Qwen2MoeForCausalLM`
+ * `RWForCausalLM`
+ * `StableLMEpochForCausalLM`
+ * `StableLmForCausalLM`
+ * `Starcoder2ForCausalLM`
+ * `SolarForCausalLM`
+ * `TeleChat2ForCausalLM`
+ * `XverseForCausalLM`
+ * `BartModel`
+ * `BartForConditionalGeneration`
+ * `Florence2ForConditionalGeneration`
+ * `BertModel`
+ * `RobertaModel`
+ * `RobertaForMaskedLM`
+ * `XLMRobertaModel`
+ * `DeciLMForCausalLM`
+ * `Gemma2Model`
+ * `GlmForCausalLM`
+ * `GritLM`
+ * `InternLM2ForRewardModel`
+ * `JambaForSequenceClassification`
+ * `LlamaModel`
+ * `MistralModel`
+ * `Phi3ForCausalLM`
+ * `Qwen2Model`
+ * `Qwen2ForCausalLM`
+ * `Qwen2ForRewardModel`
+ * `Qwen2ForProcessRewardModel`
+ * `TeleChat2ForCausalLM`
+ * `LlavaNextForConditionalGeneration`
+ * `Phi3VForCausalLM`
+ * `Qwen2VLForConditionalGeneration`
+ * `Qwen2ForSequenceClassification`
+ * `BertForSequenceClassification`
+ * `RobertaForSequenceClassification`
+ * `XLMRobertaForSequenceClassification`
+ * `AriaForConditionalGeneration`
+ * `Blip2ForConditionalGeneration`
+ * `ChameleonForConditionalGeneration`
+ * `ChatGLMModel`
+ * `ChatGLMForConditionalGeneration`
+ * `DeepseekVLV2ForCausalLM`
+ * `FuyuForCausalLM`
+ * `H2OVLChatModel`
+ * `InternVLChatModel`
+ * `Idefics3ForConditionalGeneration`
+ * `LlavaForConditionalGeneration`
+ * `LlavaNextForConditionalGeneration`
+ * `LlavaNextVideoForConditionalGeneration`
+ * `LlavaOnevisionForConditionalGeneration`
+ * `MantisForConditionalGeneration`
+ * `MiniCPMO`
+ * `MiniCPMV`
+ * `MolmoForCausalLM`
+ * `NVLM_D`
+ * `PaliGemmaForConditionalGeneration`
+ * `Phi3VForCausalLM`
+ * `PixtralForConditionalGeneration`
+ * `QWenLMHeadModel`
+ * `Qwen2VLForConditionalGeneration`
+ * `Qwen2_5_VLForConditionalGeneration`
+ * `Qwen2AudioForConditionalGeneration`
+ * `UltravoxModel`
+ * `MllamaForConditionalGeneration`
+ * `WhisperForConditionalGeneration`
+ * `EAGLEModel`
+ * `MedusaModel`
+ * `MLPSpeculatorPreTrainedModel`
+
\ No newline at end of file