You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: ai-data/generative-apis/troubleshooting/fixing-common-issues.mdx
+16-3Lines changed: 16 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -13,11 +13,24 @@ dates:
13
13
14
14
Below are common issues that you may encounter when using Generative APIs, their causes, and recommended solutions.
15
15
16
-
## 504: Timeout
16
+
## 429: Too Many Request - You exceeded your current quota of requests/tokens per minute
17
17
18
18
### Cause
19
-
- The query is too long.
20
-
- The model goes into an infinite loop while processing the input.
19
+
- You performed too many API requests over a given minute
20
+
- You consumed too much tokens (input and output) with your API requests over a given minute
21
+
22
+
### Solution
23
+
-[Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota
24
+
- Smooth out your API requests rate by limiting the number of API requests you perform in parallel
25
+
- Reduce the size of the input or output tokens processed by your API requests
26
+
- Use [Managed Inference](/ai-data/managed-inference/), where these quota do not apply (your throughput will be only limited by the amount of Inference Deployment your provision)
27
+
28
+
29
+
## 504: Gateway Timeout
30
+
31
+
### Cause
32
+
- The query is too long to process (even if context-length stays [between supported context window and maximum tokens](https://www.scaleway.com/en/docs/ai-data/generative-apis/reference-content/supported-models/))
33
+
- The model goes into an infinite loop while processing the input (which is a known structural issue with several AI models)
21
34
22
35
### Solution
23
36
- Set a stricter **maximum token limit** to prevent overly long responses.
0 commit comments