Update llama-3.3-70b-instruct.mdx: add H100 (FP8) support, raise max context length to 131k tokens

fpagny · web-flow · commit 5007743681c9 · 2025-03-04T13:23:29.000+01:00
diff --git a/pages/managed-inference/reference-content/llama-3.3-70b-instruct.mdx b/pages/managed-inference/reference-content/llama-3.3-70b-instruct.mdx
@@ -19,8 +19,8 @@ categories:
 |-----------------|------------------------------------|
 | Provider        | [Meta](https://www.llama.com/)  |
 | License        | [Llama 3.3 community](https://www.llama.com/llama3_3/license/)  |
-| Compatible Instances | H100-2 (BF16) |
-| Context length | Up to 70k tokens    |
+| Compatible Instances | H100 (FP8), H100-2 (FP8, BF16) |
+| Context length | Up to 131k tokens    |
 
 ## Model names
 
@@ -32,7 +32,8 @@ meta/llama-3.3-70b-instruct:bf16
 
 | Instance type  | Max context length |
 | ------------- |-------------|
-| H100-2      | 62k (BF16) |
+| H100      | 15k (FP8) |
+| H100-2      | 131k (FP8), 62k (BF16) |
 
 ## Model introduction
 
@@ -76,4 +77,4 @@ Process the output data according to your application's needs. The response will
 
 <Message type="note">
   Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-</Message>
+</Message>