pages/generative-apis/troubleshooting/fixing-common-issues.mdx
1 addition & 14 deletions
@@ -17,6 +17,7 @@ Below are common issues that you may encounter when using Generative APIs, their
 
 ### Solution
 - Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
+- If you are using a third-party tool such as an IDE, edit its configuration to set an appropriate maximum context window for the model (see the sketch below). More information for [VS Code (Continue)](/generative-apis/reference-content/adding-ai-to-vscode-using-continue/#configure-continue-through-a-configuration-file), [IntelliJ (Continue)](/generative-apis/reference-content/adding-ai-to-intellij-using-continue/#configure-continue-through-configuration-file) and [Zed](/generative-apis/reference-content/adding-ai-to-zed-ide/).
 - Use a model supporting longer context window values.
 - Use [Managed Inference](/managed-inference/), where the context window can be increased for [several configurations with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, the `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
   - `15k` tokens context window on `H100` Instances
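
A minimal sketch of what such a limit can look like in Continue's JSON configuration file, assuming an OpenAI-compatible provider block; the model name, endpoint, key placeholder, and `contextLength` value are illustrative, not values from this page:

```json
{
  "models": [
    {
      "title": "Llama 3.1 8B (Scaleway)",
      "provider": "openai",
      "model": "llama-3.1-8b-instruct",
      "apiBase": "https://api.scaleway.ai/v1",
      "apiKey": "<YOUR_API_SECRET_KEY>",
      "contextLength": 32000
    }
  ]
}
```

Keeping `contextLength` at or below the model's maximum context window lets the tool trim its prompts instead of sending requests the API will reject.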
@@ -51,20 +52,6 @@ Below are common issues that you may encounter when using Generative APIs, their
 
 ## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
 
-### Cause
-
-- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.
-
-### Solution
-
-- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
-- As an example, when using [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in your configuration (see the sketch below this hunk).
-- Use a model supporting a higher `max_completion_tokens` value.
-- Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (your completion token amount will still be limited by the maximum context window supported by the model).
-
-## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
-
 ### Cause
 
 - You provided a value for `max_completion_tokens` which is too high and not supported by the model you are using.
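
A minimal sketch of the Langchain configuration referenced in the removed bullet above, assuming an OpenAI-compatible endpoint; the model name, base URL, environment variable, and `512` cap are illustrative assumptions, not the page's original values:

```python
import os

from langchain.chat_models import init_chat_model

# Sketch: cap max_tokens below the model's supported max_completion_tokens
# so the API does not return 416: Range Not Satisfiable.
llm = init_chat_model(
    "llama-3.1-8b-instruct",                # illustrative model name
    model_provider="openai",                # OpenAI-compatible provider
    base_url="https://api.scaleway.ai/v1",  # illustrative endpoint
    api_key=os.environ["SCW_SECRET_KEY"],   # illustrative env variable
    max_tokens=512,                         # keep below the model's limit
)

print(llm.invoke("Hello!").content)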