
Commit 971829d

bene2k1 and RoRoJ authored
Apply suggestions from code review
Co-authored-by: Rowena Jones <[email protected]>
1 parent 038f890 commit 971829d

File tree

1 file changed: +5 -5 lines changed


pages/generative-apis/troubleshooting/fixing-common-issues.mdx

Lines changed: 5 additions & 5 deletions
@@ -17,12 +17,12 @@ Below are common issues that you may encounter when using Generative APIs, their

 ### Cause
 - You provided an input exceeding the maximum context window (also known as context length) for the model you are using.
-- You provided a long input and requested a long input (in `max_completion_tokens` field), which added, exceeds the maximum context window of the model you are using.
+- You provided a long input and requested a long input (in `max_completion_tokens` field), which added together, exceed the maximum context window of the model you are using.

 ### Solution
 - Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
 - Use a model supporting longer context window values.
-- Use [Managed Inference](/managed-inference/), where context window can be increased for [several configuration with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
+- Use [Managed Inference](/managed-inference/), where the context window can be increased for [several configurations with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
 - `15k` tokens context window on `H100` instances
 - `128k` tokens context window on `H100-2` instances.

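For the "reduce your input size" solution in the hunk above, a rough pre-flight token check can catch oversized requests before the API rejects them. The sketch below is illustrative only and is not part of this commit: it uses `tiktoken`'s `cl100k_base` encoding as an approximation of the model's own tokenizer, and the context window, completion budget, and file name are placeholder values.

```python
# Rough pre-flight estimate of the token budget before calling the API.
# Assumptions: cl100k_base only approximates the model's real tokenizer, and the
# context window below is a placeholder; check the supported-models page for your model.
import tiktoken

CONTEXT_WINDOW = 131_000        # placeholder value for the deployed model
MAX_COMPLETION_TOKENS = 4_000   # the completion size you plan to request

with open("my_long_document.txt") as f:   # placeholder input file
    prompt = f.read()

encoding = tiktoken.get_encoding("cl100k_base")
prompt_tokens = len(encoding.encode(prompt))

if prompt_tokens + MAX_COMPLETION_TOKENS > CONTEXT_WINDOW:
    print(
        f"~{prompt_tokens} prompt tokens + {MAX_COMPLETION_TOKENS} completion tokens "
        f"exceed the {CONTEXT_WINDOW}-token context window: shorten the input, lower "
        "max_completion_tokens, or use a deployment with a larger context window."
    )
```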
@@ -43,7 +43,7 @@ Below are common issues that you may encounter when using Generative APIs, their
 ## 416: Range Not Satisfiable - max_completion_tokens is limited for this model

 ### Cause
-- You provided `max_completion_tokens` value too high, that is not supported by the model you are using.
+- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.

 ### Solution
 - Remove `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
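As a companion to the solution above, here is a minimal sketch of capping (or dropping) `max_completion_tokens` with the OpenAI-compatible Python client against the Scaleway endpoint already referenced in this file. It is not part of this commit: it assumes a recent `openai` SDK that accepts the `max_completion_tokens` argument, and the environment variable name and the `4096` value are placeholders.

```python
# Minimal sketch: either drop max_completion_tokens entirely, or keep it at or below
# the limit documented for the model. The env var name and 4096 are assumed placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=os.environ["SCW_SECRET_KEY"],  # assumed environment variable name
)

response = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    max_completion_tokens=4096,  # lower this (or remove the argument) if you hit a 416 error
)
print(response.choices[0].message.content)
```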
@@ -60,12 +60,12 @@ Below are common issues that you may encounter when using Generative APIs, their
 - You provided `max_completion_tokens` value too high, that is not supported by the model you are using.

 ### Solution
-- Remove `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
+- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
 - As an example, when using the [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in the following configuration:
 ```python
 llm = init_chat_model("llama-3.3-70b-instruct", max_tokens="8000", model_provider="openai", base_url="https://api.scaleway.ai/v1", temperature=0.7)
 ```
-- Use a model supporting higher `max_completion_tokens` value.
+- Use a model supporting a higher `max_completion_tokens` value.
 - Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (your completion tokens amount will still be limited by the maximum context window supported by the model).

 ## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute
