Merged
Changes from 3 commits
@@ -0,0 +1,87 @@
---
meta:
  title: Fixing common issues with Generative APIs
  description: This page lists common issues that you may encounter while using Scaleway's Generative APIs, their causes and recommended solutions.
content:
  h1: Fixing common issues with Generative APIs
  paragraph: Generative APIs offer serverless AI models hosted at Scaleway - no need to configure hardware or deploy your own models
tags: generative-apis ai-data common-issues
dates:
  validation: 2025-01-16
  posted: 2025-01-16
---

Below are common issues that you may encounter when using Generative APIs, their causes, and recommended solutions.

## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute

### Cause
- You sent too many API requests within a given minute.
- Your API requests consumed too many tokens (input and output) within a given minute.

### Solution
- [Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota.
- Smooth out your API request rate by limiting the number of API requests you perform in parallel.
- Reduce the number of input or output tokens processed by your API requests.
- Use [Managed Inference](/ai-data/managed-inference/), where these quotas do not apply (your throughput is limited only by the number of Inference Deployments you provision).
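
Limiting parallel requests, as suggested above, can be sketched with a bounded thread pool. This is a minimal illustration, not an official client: `send` is a hypothetical callable that performs one API request, and the default of `max_parallel=2` is an arbitrary assumption to tune against your own quota.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_with_limited_parallelism(
    send: Callable[[T], R],
    requests: Sequence[T],
    max_parallel: int = 2,
) -> List[R]:
    """Apply `send` to every request, with at most `max_parallel` in flight.

    `send` is a hypothetical callable that performs one API request;
    it is not part of any Scaleway SDK.
    """
    # The pool size caps how many calls to `send` run at the same time.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in the same order as `requests`.
        return list(pool.map(send, requests))
```

Lowering `max_parallel` spreads the same number of requests over a longer period, which reduces the chance of exceeding the per-minute quota.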


## 504: Gateway Timeout

### Cause
- The query takes too long to process (even if the context length stays [within the supported context window and maximum tokens](https://www.scaleway.com/en/docs/ai-data/generative-apis/reference-content/supported-models/)).
- The model goes into an infinite loop while processing the input (a known structural issue with several AI models).

### Solution
- Set a stricter **maximum token limit** to prevent overly long responses.
- Reduce the number of input tokens, or split the input into multiple API requests.
- Use [Managed Inference](/ai-data/managed-inference/), where no query timeout is enforced.
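
Splitting the input and capping the response length could look like the sketch below. It uses character counts as a rough stand-in for tokens, which is an assumption: actual token counts depend on the model's tokenizer. The function names, the `max_chars` and `max_tokens` defaults, and the model name passed by the caller are all illustrative.

```python
from typing import Dict, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on blank lines into chunks of at most `max_chars` characters.

    A single paragraph longer than `max_chars` is cut at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # ignore empty paragraphs
        # Cut paragraphs that cannot fit in a chunk on their own.
        pieces = [paragraph[i:i + max_chars] for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def build_requests(text: str, model: str, max_chars: int = 4000, max_tokens: int = 512) -> List[Dict]:
    """Build one request body per chunk, each with a `max_tokens` cap."""
    return [
        {
            "model": model,
            "messages": [{"role": "user", "content": chunk}],
            "max_tokens": max_tokens,  # bounds the length of each response
        }
        for chunk in split_into_chunks(text, max_chars)
    ]
```

Each returned body can then be sent as a separate API request, so no single query has to process the whole input.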

## Structured output (e.g., JSON) is not working correctly

### Cause
- Incorrect field naming in the request, such as using `"format"` instead of the correct `"response_format"` field.
- Lack of a JSON schema, which can lead to ambiguity in the output structure.

### Solution
- Ensure the proper field `"response_format"` is used in the query.
- Provide a JSON schema in the request to guide the model's structured output.
- Refer to the [documentation on structured outputs](/ai-data/generative-apis/how-to/use-structured-outputs/) for examples and additional guidance.
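
As a rough illustration of the points above, the following sketch builds a request body that uses `"response_format"` with a JSON schema, then checks a reply against the required keys. The nested `"json_schema"` layout follows the common OpenAI-compatible convention and is an assumption here, as are the model name, the schema, and the helper `parse_structured_reply`; the linked structured outputs documentation remains the reference for the exact shape.

```python
import json
from typing import Dict, List

# Illustrative JSON schema describing the expected output.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "tags"],
}

payload = {
    "model": "example-model",  # hypothetical model name
    "messages": [{"role": "user", "content": "Summarize this article as JSON."}],
    # The field must be named "response_format", not "format".
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "article_summary", "schema": schema},
    },
}


def parse_structured_reply(content: str, required: List[str]) -> Dict:
    """Parse the model's reply as JSON and check that required keys exist."""
    data = json.loads(content)
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"Missing keys in structured output: {missing}")
    return data
```

Checking the reply on the client side makes a malformed structured output fail loudly instead of propagating silently.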


## Multiple successive `"role": "user"` messages

### Cause
- Successive messages with `"role": "user"` are sent in the API request instead of alternating between `"role": "user"` and `"role": "assistant"`.

### Solution
- Ensure the `"messages"` array alternates between `"role": "user"` and `"role": "assistant"`.
- If multiple `"role": "user"` messages need to be sent, concatenate them into one `"role": "user"` message or intersperse them with appropriate `"role": "assistant"` responses.
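
Concatenating successive user messages can be sketched as follows. This is a minimal example that assumes each message's `"content"` is a plain string; the function name and the blank-line separator are illustrative choices.

```python
from typing import Dict, List


def merge_consecutive_user_messages(
    messages: List[Dict[str, str]], separator: str = "\n\n"
) -> List[Dict[str, str]]:
    """Return a new list where successive "user" messages are merged into one.

    Assumes every message has string "role" and "content" values.
    """
    merged: List[Dict[str, str]] = []
    for message in messages:
        previous = merged[-1] if merged else None
        if previous and previous["role"] == "user" and message["role"] == "user":
            # Append to the previous user message instead of adding a new one.
            previous["content"] = previous["content"] + separator + message["content"]
        else:
            merged.append(dict(message))  # copy so the input is not modified
    return merged
```

Running the `"messages"` array through such a helper before sending the request keeps the roles alternating without dropping any user input.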

#### Example error message (for Mistral models)
```json
{
  "object": "error",
  "message": "After the optional system message, conversation roles must alternate user/assistant/user/assistant/...",
  "type": "BadRequestError",
  "param": null,
  "code": 400
}
```

## Best practices for optimizing model performance

### Input size management
- Avoid overly long input sequences; break them into smaller chunks if needed.
- Use summarization techniques for large inputs to reduce token count while maintaining relevance.

### Use proper parameter configuration
- Double-check parameters like `"temperature"`, `"max_tokens"`, and `"top_p"` to ensure they align with your use case.
- For structured output, always include a `"response_format"` and, if possible, a detailed JSON schema.

### Debugging silent errors
- For cases where no explicit error is returned:
  - Verify that all fields in the API request are correctly named and formatted.
  - Test the request with smaller and simpler inputs to isolate potential issues.
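
Verifying field names can be partly automated before a request is sent. The sketch below compares the top-level keys of a request body with a list of expected fields and suggests the closest match for anything unexpected. The `KNOWN_FIELDS` list only contains fields named on this page plus `"model"`, so it is an incomplete assumption to extend for your own use; the function name is illustrative.

```python
import difflib
from typing import Dict, List, Optional

# Fields named on this page, plus "model" (assumed). Not an exhaustive list.
KNOWN_FIELDS = ["model", "messages", "response_format", "temperature", "max_tokens", "top_p"]


def find_suspicious_fields(
    payload: Dict, known_fields: List[str] = KNOWN_FIELDS
) -> Dict[str, Optional[str]]:
    """Map each unexpected top-level field to its closest known name, if any."""
    suggestions: Dict[str, Optional[str]] = {}
    for field in payload:
        if field in known_fields:
            continue
        # cutoff=0.5 is looser than difflib's default so that "format"
        # is still matched to "response_format".
        close = difflib.get_close_matches(field, known_fields, n=1, cutoff=0.5)
        suggestions[field] = close[0] if close else None
    return suggestions
```

For example, a body that contains `"format"` instead of `"response_format"` is reported with `"response_format"` as the suggested name.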


8 changes: 8 additions & 0 deletions ai-data/generative-apis/troubleshooting/index.mdx
@@ -0,0 +1,8 @@
---
meta:
  title: Generative APIs - Troubleshooting
  description: Generative APIs - Troubleshooting
content:
  h1: Generative APIs - Troubleshooting
  paragraph: Generative APIs - Troubleshooting
---
10 changes: 10 additions & 0 deletions menu/navigation.json
@@ -838,6 +838,16 @@
],
"label": "Additional Content",
"slug": "reference-content"
},
{
"items": [
{
"label": "Fixing common issues",
"slug": "fixing-common-issues"
}
],
"label": "Troubleshooting",
"slug": "troubleshooting"
}
],
"label": "Generative APIs",