pages/generative-apis/troubleshooting/fixing-common-issues.mdx (8 additions, 8 deletions)
@@ -3,7 +3,7 @@ title: Fixing common issues with Generative APIs
description: This page lists common issues that you may encounter while using Scaleway's Generative APIs, their causes and recommended solutions.
tags: generative-apis ai-data common-issues
dates:
-  validation: 2025-01-16
+  validation: 2025-07-21
posted: 2025-01-16
---
@@ -32,7 +32,7 @@ Below are common issues that you may encounter when using Generative APIs, their
- You can store your content in a file with the `.json` extension (e.g. named `file.json`), and open it with an IDE such as VSCode or Zed. Syntax errors should display if there are any.
- You can copy your content into a JSON formatter tool or linter available online, which will identify errors.
- The most common errors include:
- - Missing or unecessary quotes `"`, `'` or commas `,` on properties name and string values.
+ - Missing or unnecessary quotes `"`, `'` or commas `,` on property names and string values.
- Special characters that are not escaped, such as line break `\n` or backslash `\\`
## 403: Forbidden - Insufficient permissions to access the resource
@@ -66,7 +66,7 @@ Below are common issues that you may encounter when using Generative APIs, their
## 416: Range Not Satisfiable - max_completion_tokens is limited for this model
### Cause
- - You provided a `max_completion_tokens` value too high, which is not supported by the model you are using.
+ - You provided a value for `max_completion_tokens` which is too high, and not supported by the model you are using.
### Solution
- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
@@ -80,11 +80,11 @@ Below are common issues that you may encounter when using Generative APIs, their
## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute
### Cause
- - You performed too many API requests over a given minute
+ - You performed too many API requests within a given minute
- You consumed too many tokens (input and output) with your API requests over a given minute
### Solution
- - Smooth out your API request rate by limiting the number of API requests you perform over a given minute so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
+ - Smooth out your API request rate by limiting the number of API requests you perform over a given minute, so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- [Add a payment method](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) and [validate your identity](/account/how-to/verify-identity/) to automatically increase your quotas [based on standard limits](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- Reduce the number of input and output tokens processed by your API requests.
- Use [Managed Inference](/managed-inference/), where these quotas do not apply (your throughput will only be limited by the number of Inference Deployments you provision)
@@ -97,7 +97,7 @@ Below are common issues that you may encounter when using Generative APIs, their
### Solution
- Smooth out your API request rate by limiting the number of API requests you perform at the same time (e.g. requests which did not receive a complete response and are still open) so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- - Use [Managed Inference](/managed-inference/), where concurrent request limits do not apply. Note that exceeding the number of concurrent requests your Inference Deployment can handle may impact performance metrics.
+ - Use [Managed Inference](/managed-inference/), where concurrent request limits do not apply. Note that exceeding the number of concurrent requests your Inference deployment can handle may impact performance metrics.
## 504: Gateway Timeout
@@ -117,7 +117,7 @@ For queries where the model enters an infinite loop (more frequent when using **
- Ensure the `top_p` parameter is not set too low (we recommend the default value of `1`).
- Add a `presence_penalty` value in your request (`0.5` is a good starting value). This option will help the model choose tokens different from the one it is looping on, although it might impact accuracy for tasks requiring multiple similar outputs.
- Use more recent models, which are usually more optimized to avoid loops, especially when using structured output.
- - Optimize the system prompt to provide clearer and simpler tasks. Currently, JSON output accuracy still relies on heuristics to constrain models to output only valid JSON tokens, and thus depends on the prompts given. As a counter-example, providing contradictory requirements to a model - such as `Never output JSON` in the system prompt and `response_format` as `json_schema" in the query - may lead to the model never outputting closing JSON brackets `}`.
+ - Optimize the system prompt to provide clearer and simpler tasks. Currently, JSON output accuracy still relies on heuristics to constrain models to output only valid JSON tokens, and thus depends on the prompts given. As a counter-example, providing contradictory requirements to a model - such as `Never output JSON` in the system prompt and `response_format` as `json_schema` in the query - may lead to the model never outputting closing JSON brackets `}`.
## Structured output (e.g., JSON) is not working correctly
@@ -181,7 +181,7 @@ For queries where the model enters an infinite loop (more frequent when using **
- Counter for **Tokens Processed** or **API Requests** should display a correct value (different from 0)
- Graph across time should be empty
- ## Embeddings vectors cannot be stored in a database or used with a third-party library
+ ## Embedding vectors cannot be stored in a database or used with a third-party library
### Cause
The embedding model you are using generates vector representations with a fixed number of dimensions, which is too high for your database or third-party library.
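When the model's output dimension exceeds what your vector store accepts, one common workaround is to truncate the vector and re-normalize it — note this is an assumption here, and only preserves accuracy well for models trained with Matryoshka-style embeddings:

```python
import math


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    # Guard against a zero vector to avoid division by zero
    return [x / norm for x in head] if norm else head


v = truncate_embedding([3.0, 4.0, 1.0, 2.0], dim=2)  # -> [0.6, 0.8]
```

Alternatively, check whether your database supports a higher dimension limit, or pick an embedding model whose native dimension fits your store.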
pages/load-balancer/concepts.mdx (1 addition, 1 deletion)
@@ -3,7 +3,7 @@ title: Load Balancers - Concepts
description: Learn the key concepts of Scaleway Load Balancer - optimize traffic distribution, ensure high availability, and enhance application performance.
pages/vpc/reference-content/use-case-basic.mdx (1 addition, 1 deletion)
@@ -3,7 +3,7 @@ title: VPC use case 1 - Basic infrastructure to leverage VPC isolation
description: Learn how to set up a basic infrastructure using VPC isolation for secure cloud environments. Step-by-step guidance on leveraging VPCs for optimal network isolation.