Commit 038be81

Authored by: ofranc, Changelog bot, bene2k1, RoRoJ
feat(changelog): generative-apis-changed-llama-33-70b-maximum-context-up 2025-06-25 (#5184)
* feat(changelog): add new entry

* Update changelog/june2025/2025-06-25-generative-apis-changed-llama-33-70b-maximum-context-up.mdx

Co-authored-by: Rowena Jones <[email protected]>

---------

Co-authored-by: Changelog bot <[email protected]>
Co-authored-by: Benedikt Rollik <[email protected]>
Co-authored-by: Rowena Jones <[email protected]>
1 parent 0e2fb6b commit 038be81

File tree

1 file changed: +12 −0 lines

@@ -0,0 +1,12 @@
---
title: Llama 3.3 70B maximum context update
status: changed
date: 2025-06-25
category: ai-data
product: generative-apis
---

The maximum context for Llama 3.3 70B is [now reduced to 100k tokens](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) (previously 130k tokens).

This update will improve average throughput and time to first token.

[Managed Inference](https://www.scaleway.com/en/docs/managed-inference/reference-content/model-catalog/) can still be used to support context lengths of 130k tokens.
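Since the served context window is now smaller, callers that previously relied on the 130k-token limit may want a client-side guard before sending requests. Below is a minimal sketch of such a check; the `approx_token_count` whitespace heuristic and the function names are hypothetical illustrations (not part of the Scaleway API), and an accurate count would require the model's actual tokenizer.

```python
# Hypothetical client-side guard for the reduced context window.
# MAX_CONTEXT_TOKENS reflects the new 100k limit from this changelog entry.
# The whitespace-based counter is a rough stand-in; use the model's own
# tokenizer in real deployments for an accurate count.

MAX_CONTEXT_TOKENS = 100_000


def approx_token_count(text: str) -> int:
    """Very rough estimate: ~1 token per whitespace-separated word."""
    return len(text.split())


def fits_in_context(prompt: str, reserved_completion_tokens: int = 0,
                    limit: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if the prompt plus a reserved completion budget fits."""
    return approx_token_count(prompt) + reserved_completion_tokens <= limit
```

A caller could use `fits_in_context(prompt, reserved_completion_tokens=4096)` to decide whether to truncate history, or fall back to a Managed Inference deployment when the full 130k window is needed.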
