Commit 2b5d57e

fix(genapi): update model context length (#5715)
* fix(genapi): update model context length
* fix(inference): qwen 235b context length
* fix(genapi): bge-multilingual-gemma2 context window
* fix(inference): bge-multilingual-gemma2 context window
1 parent 9efd65e commit 2b5d57e

2 files changed: +4 -4 lines changed


pages/generative-apis/reference-content/supported-models.mdx

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 | Meta | `llama-3.3-70b-instruct` | 100k | 4096 | [Llama 3.3 Community](https://www.llama.com/llama3_3/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | Meta | `llama-3.1-8b-instruct` | 128k | 16384 | [Llama 3.1 Community](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
-| Qwen | `qwen3-235b-a22b-instruct-2507` | 260k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507) |
+| Qwen | `qwen3-235b-a22b-instruct-2507` | 250k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507) |
 | Qwen | `qwen3-coder-30b-a3b-instruct` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) |
 | DeepSeek | `deepseek-r1-distill-llama-70b` | 32k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |

@@ -63,7 +63,7 @@ Our [Embeddings API](/generative-apis/how-to/query-embedding-models) provides bu

 | Provider | Model string | Model size | Embedding dimension | Context window | License | Model card |
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
-| BAAI | `bge-multilingual-gemma2` | 9B | 3584 | 4096 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/BAAI/bge-multilingual-gemma2) |
+| BAAI | `bge-multilingual-gemma2` | 9B | 3584 | 8192 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/BAAI/bge-multilingual-gemma2) |

 ## Request a model

pages/managed-inference/reference-content/model-catalog.mdx

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 |------------|----------|--------------|------------|-----------|---------|
 | [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`whisper-large-v3`](#whisper-large-v3) | OpenAI | - | Audio transcription | L4, L40S, H100, H100-SXM-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
-| [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 40k | Text | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 250k | Text | H100-SXM-2 (40k), H100-SXM-4 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-3-27b-it`](#gemma-3-27b-it) | Google | 40k | Text, Vision | H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | [`llama-3.1-70b-instruct`](#llama-31-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blob/main/LICENSE) |
@@ -40,7 +40,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | [`molmo-72b-0924`](#molmo-72b-0924) | Allen AI | 50k | Text, Vision | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) and [Twonyi Qianwen license](https://huggingface.co/Qwen/Qwen2-72B/blob/main/LICENSE)|
 | [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Qwen | 128k | Code | L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | Qwen | 32k | Code | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
-| [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | BAAI | 4k | Embeddings | L4, L40S, H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
+| [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | BAAI | 8k | Embeddings | L4, L40S, H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`sentence-t5-xxl`](#sentence-t5-xxl) | Sentence transformers | 512 | Embeddings | L4 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |

 \*Maximum context length is only mentioned when instances VRAM size limits context length. Otherwise, maximum context length is the one defined by the model.
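
Reviewer note: the footnote's convention (a parenthesized value after an instance name overrides the model's own context length) can be sketched in a few lines of Python. This is only an illustration of how to read the table cells, not code from the repository. The helper names `parse_context` and `context_by_instance` are hypothetical, and treating `k` as 1,000 tokens is an assumption (the docs do not say whether it means 1,000 or 1,024).

```python
import re


def parse_context(value: str) -> int:
    """Convert a context cell such as '250k' or '512' to a token count.

    Assumption: 'k' means 1,000 tokens; the source table does not specify.
    """
    value = value.strip().lower()
    if value.endswith("k"):
        return int(value[:-1]) * 1000
    return int(value)


def context_by_instance(model_context: str, instances: str) -> dict[str, int]:
    """Map each instance type to its effective maximum context length.

    An entry like 'H100-SXM-2 (40k)' carries an instance-specific limit;
    an entry without parentheses falls back to the model's own context.
    """
    default = parse_context(model_context)
    result = {}
    for entry in instances.split(","):
        match = re.fullmatch(r"\s*([\w-]+)\s*(?:\((\d+[kK]?)\))?\s*", entry)
        if match is None:
            raise ValueError(f"unrecognized instance entry: {entry!r}")
        name, limit = match.groups()
        result[name] = parse_context(limit) if limit else default
    return result


# Values taken from the updated qwen3-235b-a22b-instruct-2507 row.
limits = context_by_instance("250k", "H100-SXM-2 (40k), H100-SXM-4")
print(limits)
```

Read this way, the new row says the model supports 250k tokens on H100-SXM-4 but is capped at 40k on H100-SXM-2, which matches the 40k value the row listed before this commit.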
