14 changes: 6 additions & 8 deletions pages/generative-apis/troubleshooting/fixing-common-issues.mdx
Below are common issues that you may encounter when using Generative APIs, their causes, and solutions.
- You consumed too many tokens (input and output) with your API requests over a given minute

### Solution
- Smooth out your API request rate by limiting the number of API requests you perform over a given minute so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- [Add a payment method](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) and [validate your identity](/account/how-to/verify-identity/) to automatically increase your quotas [based on standard limits](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- [Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota.
- Reduce the number of input and output tokens processed by your API requests.
- Use [Managed Inference](/managed-inference/), where these quotas do not apply (your throughput is only limited by the number of Inference Deployments you provision).
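The first solution above (smoothing your request rate) can be implemented client-side with a small sliding-window limiter. This is an illustrative sketch, not an official Scaleway client; the `max_per_minute` value is an assumption that you should replace with your Organization's actual quota.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Block before each request so that no more than `max_per_minute`
    calls happen in any sliding 60-second window (client-side smoothing)."""

    def __init__(self, max_per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self._clock = clock     # injectable for testing
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent requests

    def acquire(self) -> None:
        now = self._clock()
        # Forget requests that have left the 60 s window.
        while self._stamps and now - self._stamps[0] >= 60:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_per_minute:
            # Sleep until the oldest request ages out of the window.
            self._sleep(60 - (now - self._stamps[0]))
            now = self._clock()
            while self._stamps and now - self._stamps[0] >= 60:
                self._stamps.popleft()
        self._stamps.append(now)


# Usage: call limiter.acquire() before each API request.
limiter = MinuteRateLimiter(max_per_minute=300)  # assumed quota; check yours
```

Calling `acquire()` before every request keeps you under the per-minute quota without tracking server-side state.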
### Solution
- Ensure the proper field `"response_format"` is used in the query.
- Provide a JSON schema in the request to guide the model's structured output.
- Ensure the `max_tokens` value is higher than the response `completion_tokens` value. Otherwise, the model answer may be cut off before it can finish the proper JSON structure (and lack closing JSON brackets `}`, for example). Note that if the `max_tokens` value is not set in the query, [default values apply for each model](/generative-apis/reference-content/supported-models/).
- Ensure the `temperature` value is set in a lower range for the model. Otherwise, the model answer may output invalid JSON characters. Note that if the `temperature` value is not set in the query, [default values apply for each model](/generative-apis/reference-content/supported-models/). For example:
- for `llama-3.3-70b-instruct`, `temperature` should be set lower than `0.6`
- for `mistral-nemo-instruct-2407`, `temperature` should be set lower than `0.3`
- Refer to the [documentation on structured outputs](/generative-apis/how-to/use-structured-outputs/) for examples and additional guidance.
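As a sketch, the checklist above maps onto an OpenAI-compatible request body like the following. The endpoint shape and model name come from this page; the schema, field values, and the `SCW_SECRET_KEY` environment variable are illustrative assumptions — refer to the structured outputs documentation for the exact `response_format` envelope supported.

```python
import json
import os
import urllib.request


def build_structured_request(project_id: str) -> tuple[str, dict]:
    """Assemble an OpenAI-compatible chat completion payload that asks
    for schema-constrained JSON output. Values are illustrative."""
    url = f"https://api.scaleway.ai/{project_id}/v1/chat/completions"
    payload = {
        "model": "llama-3.3-70b-instruct",
        "messages": [
            {"role": "user", "content": "Extract the city from: 'I live in Paris.'"}
        ],
        # Provide a JSON schema to guide the structured output.
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_extraction",
                "schema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
        # Keep max_tokens above the expected completion length so the
        # JSON is not cut off before its closing bracket.
        "max_tokens": 512,
        # Below the 0.6 guideline for llama-3.3-70b-instruct.
        "temperature": 0.2,
    }
    return url, payload


def send(project_id: str) -> dict:
    """Send the request (requires a valid secret key in SCW_SECRET_KEY)."""
    url, payload = build_structured_request(project_id)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['SCW_SECRET_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Separating payload construction from sending makes it easy to inspect the request body when debugging malformed-output issues.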

### Causes
- Cockpit is isolated by `project_id` and only displays token consumption related to one Project.
- Cockpit `Tokens Processed` graphs over time can take up to an hour to update (to provide more accurate average consumption over time). The overall `Tokens Processed` counter is updated in real time.

### Solution
- Ensure you are connecting to the Cockpit corresponding to your Project. Cockpits are currently isolated by `project_id`, which you can see in their URL: `https://PROJECT_ID.dashboard.obs.fr-par.scw.cloud/`. This Project should correspond to the one used in the URL you used to perform Generative APIs requests, such as `https://api.scaleway.ai/{PROJECT_ID}/v1/chat/completions`. You can list your projects and their IDs in your [Organization dashboard](https://console.scaleway.com/organization/projects).
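To double-check that the Cockpit and the API endpoint refer to the same Project, you can compare the Project ID embedded in each URL. A minimal helper (illustrative, standard library only), based on the URL shapes shown above:

```python
from urllib.parse import urlparse


def project_id_from_api_url(api_url: str) -> str:
    """First path segment of https://api.scaleway.ai/<PROJECT_ID>/v1/..."""
    return urlparse(api_url).path.strip("/").split("/")[0]


def project_id_from_cockpit_url(cockpit_url: str) -> str:
    """Leftmost subdomain of https://<PROJECT_ID>.dashboard.obs.fr-par.scw.cloud/"""
    return urlparse(cockpit_url).hostname.split(".")[0]


def same_project(api_url: str, cockpit_url: str) -> bool:
    """True when both URLs point at the same Project."""
    return project_id_from_api_url(api_url) == project_id_from_cockpit_url(cockpit_url)
```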
### Debugging silent errors
- For cases where no explicit error is returned:
- Verify all fields in the API request are correctly named and formatted.
- Test the request with smaller and simpler inputs to isolate potential issues.
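The first verification step can be automated by linting the request body locally before sending anything. The field lists below are illustrative assumptions, not the API's exhaustive schema; extend them to match the fields you actually use.

```python
REQUIRED_FIELDS = {"model", "messages"}
# Illustrative subset of accepted fields; extend for your use case.
KNOWN_FIELDS = REQUIRED_FIELDS | {
    "max_tokens", "temperature", "top_p", "stream", "stop", "response_format",
}


def lint_payload(payload: dict) -> list[str]:
    """Return likely problems (missing or misnamed fields) so silent
    errors can be caught before the request is sent."""
    problems = [
        f"missing required field: {name}"
        for name in sorted(REQUIRED_FIELDS - payload.keys())
    ]
    problems += [
        f"unknown field (typo?): {name}"
        for name in sorted(payload.keys() - KNOWN_FIELDS)
    ]
    return problems
```

An empty result means the payload at least uses recognized field names; a non-empty result points at the most common cause of silently ignored options.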