Commit 70bcc35

feat(genapi): update troubleshooting
Remove duplicated content, and add links to IDE configuration for when the maximum context window is reached.
1 parent 10f115f

1 file changed

pages/generative-apis/troubleshooting/fixing-common-issues.mdx (1 addition, 14 deletions)

```diff
@@ -17,6 +17,7 @@ Below are common issues that you may encounter when using Generative APIs, their
 
 ### Solution
 - Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
+- If you are using a third party tool such as IDEs, you should edit their configuration to set an appropriate maximum context window for the model. More information for [VS Code (Continue)](/generative-apis/reference-content/adding-ai-to-vscode-using-continue/#configure-continue-through-a-configuration-file), [IntelliJ (Continue)](/generative-apis/reference-content/adding-ai-to-intellij-using-continue/#configure-continue-through-configuration-file) and [Zed](/generative-apis/reference-content/adding-ai-to-zed-ide/).
 - Use a model supporting longer context window values.
 - Use [Managed Inference](/managed-inference/), where the context window can be increased for [several configurations with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
   - `15k` tokens context window on `H100` Instances
```
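
For the IDE configuration bullet added in this hunk, one way to set the limit with Continue is a model entry in its `config.json`. This is a minimal sketch, not content from the commit: the `contextLength` and `completionOptions.maxTokens` fields follow Continue's JSON configuration format, while the title, placeholder key, and token values are illustrative.

```json
{
  "models": [
    {
      "title": "Llama 3.3 70B (Scaleway)",
      "provider": "openai",
      "model": "llama-3.3-70b-instruct",
      "apiBase": "https://api.scaleway.ai/v1",
      "apiKey": "<YOUR_SCALEWAY_SECRET_KEY>",
      "contextLength": 15000,
      "completionOptions": {
        "maxTokens": 4096
      }
    }
  ]
}
```

Keeping `contextLength` at or below the model's documented context window prevents the IDE from assembling prompts that the API will reject.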

````diff
@@ -51,20 +52,6 @@ Below are common issues that you may encounter when using Generative APIs, their
 
 ## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
 
-### Cause
-- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.
-
-### Solution
-- Remove `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
-- As an example, when using the [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in the following configuration:
-```python
-llm = init_chat_model("llama-3.3-70b-instruct", max_tokens="8000", model_provider="openai", base_url="https://api.scaleway.ai/v1", temperature=0.7)
-```
-- Use a model supporting higher `max_completion_tokens` value.
-- Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (your completion tokens amount will still be limited by the maximum context window supported by the model).
-
-## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
-
 ### Cause
 - You provided a value for `max_completion_tokens` which is too high, and not supported by the model you are using.
````
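
The duplicate block removed here contained the section's only code example; the second copy of the section is kept. As a minimal sketch of the recommended fix using the plain OpenAI Python client, assuming the `https://api.scaleway.ai/v1` endpoint and model name from the text above (the prompt, key placeholder, and token value are illustrative):

```python
from openai import OpenAI

# Placeholder key; replace with your own Scaleway secret key.
client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key="<YOUR_SCALEWAY_SECRET_KEY>",
)

response = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Explain HTTP status code 416."}],
    # Keep this at or below the maximum supported by the model,
    # or omit the field entirely to use the model's default.
    max_completion_tokens=4096,
)
print(response.choices[0].message.content)
```

Lowering `max_completion_tokens` (or dropping the field) is usually enough; the value only caps the output, which must still fit within the model's overall context window together with the input tokens.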
