
Commit 6640b36

fix(inference): qwen maximum context size
1 parent 93f1261 commit 6640b36

File tree

1 file changed (+4 −4 lines)


pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx

Lines changed: 4 additions & 4 deletions
```diff
@@ -20,7 +20,7 @@ categories:
 | Provider | [Qwen](https://qwenlm.github.io/) |
 | License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
 | Compatible Instances | H100, H100-2 (INT8) |
-| Context Length | up to 128k tokens |
+| Context Length | up to 32k tokens |
 
 ## Model names
 
@@ -32,8 +32,8 @@ qwen/qwen2.5-coder-32b-instruct:int8
 
 | Instance type | Max context length |
 | ------------- |-------------|
-| H100 | 128k (INT8)
-| H100-2 | 128k (INT8)
+| H100 | 32k (INT8)
+| H100-2 | 32k (INT8)
 
 ## Model introduction
 
@@ -75,4 +75,4 @@ Process the output data according to your application's needs. The response will
 
 <Message type="note">
 Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-</Message>
+</Message>
```
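Since the commit lowers the documented context window from 128k to 32k tokens, client code that sizes completion requests against the old limit may now overrun it. The sketch below is illustrative only (not part of the commit or the Scaleway API): the function name, parameters, and the assumption that 32k means 32,768 tokens are all hypothetical, but it shows the general idea of clamping a completion budget so prompt plus output fit the window.

```python
# Illustrative sketch: keep prompt + completion within the documented
# 32k-token context window of qwen2.5-coder-32b-instruct:int8 on
# H100 / H100-2 instances. Assumes "32k" means 32,768 tokens.
QWEN_CODER_MAX_CONTEXT = 32_768  # hypothetical exact value for "32k"

def clamp_max_tokens(prompt_tokens: int,
                     requested_max_tokens: int,
                     context_limit: int = QWEN_CODER_MAX_CONTEXT) -> int:
    """Return the largest completion budget that still fits the window."""
    if prompt_tokens >= context_limit:
        raise ValueError(
            f"prompt ({prompt_tokens} tokens) exceeds the "
            f"{context_limit}-token context window"
        )
    return min(requested_max_tokens, context_limit - prompt_tokens)
```

A caller that previously assumed a 128k window would pass its token counts through such a helper before setting `max_tokens` on a request.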
