pages/generative-apis/troubleshooting/fixing-common-issues.mdx
Below are common issues that you may encounter when using Generative APIs, their…
### Cause
- You provided an input exceeding the maximum context window (also known as context length) for the model you are using.
- You provided a long input and requested a long output (in the `max_completion_tokens` field) which, added together, exceed the maximum context window of the model you are using.
### Solution
- Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
- Use a model supporting a longer context window.
- Use [Managed Inference](/managed-inference/), where the context window can be increased for [several configurations with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, the `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
  - a `15k` token context window on `H100` instances
  - a `128k` token context window on `H100-2` instances
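As a rough illustration of how these limits combine, the following minimal sketch (not from the original document; it uses an approximate 4-characters-per-token estimate rather than a real tokenizer) checks whether a request's token budget fits a given context window:

```python
def fits_context_window(prompt: str, max_completion_tokens: int,
                        context_window: int) -> bool:
    """Return True if the estimated prompt tokens plus the requested
    completion tokens fit within the model's context window.

    Tokens are roughly estimated at 4 characters per token; use the
    model's own tokenizer for exact counts.
    """
    estimated_prompt_tokens = len(prompt) // 4
    return estimated_prompt_tokens + max_completion_tokens <= context_window

# A ~25k-token prompt requesting 8k completion tokens does not fit a
# 15k-token context window, but fits a 128k-token one.
print(fits_context_window("x" * 100_000, 8_000, 15_000))   # False
print(fits_context_window("x" * 100_000, 8_000, 128_000))  # True
```

If the check fails, either shorten the input, lower `max_completion_tokens`, or switch to a model (or Managed Inference configuration) with a larger context window.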
## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
### Cause
- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.
### Solution
- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
### Cause
- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.
### Solution
- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
- As an example, when using the [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in the following configuration:
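  The original configuration snippet is not reproduced here; as an illustrative sketch (the model name, endpoint URL, and API key below are placeholder assumptions, not values from this document), such a setup could look like:

  ```python
  # Illustrative only: initialize a Langchain chat model against an
  # OpenAI-compatible endpoint. The model name, base_url, and api_key
  # are placeholder assumptions.
  from langchain.chat_models import init_chat_model

  llm = init_chat_model(
      "llama-3.3-70b-instruct",               # model name (assumed)
      model_provider="openai",                 # OpenAI-compatible API
      base_url="https://api.scaleway.ai/v1",   # endpoint (assumed)
      api_key="YOUR_SECRET_KEY",               # placeholder credential
      max_tokens=4096,                         # reduce below the model's supported limit
  )
  ```

  Lowering `max_tokens` here is what caps the `max_completion_tokens` value sent in the underlying request.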
- Use a model supporting a higher `max_completion_tokens` value.
- Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (the number of completion tokens will still be limited by the maximum context window supported by the model).
## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute