
Commit 6640b36

fix(inference): qwen maximum context size
1 parent 93f1261 commit 6640b36

File tree

1 file changed (+4 −4 lines)


pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx

Lines changed: 4 additions & 4 deletions
```diff
@@ -20,7 +20,7 @@ categories:
 | Provider | [Qwen](https://qwenlm.github.io/) |
 | License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
 | Compatible Instances | H100, H100-2 (INT8) |
-| Context Length | up to 128k tokens |
+| Context Length | up to 32k tokens |
 
 ## Model names
 
@@ -32,8 +32,8 @@ qwen/qwen2.5-coder-32b-instruct:int8
 
 | Instance type | Max context length |
 | ------------- |-------------|
-| H100 | 128k (INT8)
-| H100-2 | 128k (INT8)
+| H100 | 32k (INT8)
+| H100-2 | 32k (INT8)
 
 ## Model introduction
 
@@ -75,4 +75,4 @@ Process the output data according to your application's needs. The response will
 
 <Message type="note">
 Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-</Message>
+</Message>
```
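Since the commit lowers the documented context window from 128k to 32k tokens, client code that sizes completion requests against the old limit may now overrun it. The sketch below is illustrative only (not part of the commit or the Scaleway API): the function name, parameters, and the assumption that 32k means 32,768 tokens are all hypothetical, but it shows the general idea of clamping a completion budget so prompt plus output fit the window.

```python
# Illustrative sketch: keep prompt + completion within the documented
# 32k-token context window of qwen2.5-coder-32b-instruct:int8 on
# H100 / H100-2 instances. Assumes "32k" means 32,768 tokens.
QWEN_CODER_MAX_CONTEXT = 32_768  # hypothetical exact value for "32k"

def clamp_max_tokens(prompt_tokens: int,
                     requested_max_tokens: int,
                     context_limit: int = QWEN_CODER_MAX_CONTEXT) -> int:
    """Return the largest completion budget that still fits the window."""
    if prompt_tokens >= context_limit:
        raise ValueError(
            f"prompt ({prompt_tokens} tokens) exceeds the "
            f"{context_limit}-token context window"
        )
    return min(requested_max_tokens, context_limit - prompt_tokens)
```

A caller that previously assumed a 128k window would pass its token counts through such a helper before setting `max_tokens` on a request.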
