Commit 91838f7

fpagny, bene2k1, ldecarvalho-doc, and RoRoJ authored
fix(genapi): update quota documentation (#5056)
* fix(genapi): update quota documentation
* Apply suggestions from code review
* fix(genapi): update lifecycle faq
* Apply suggestions from code review

  Co-authored-by: ldecarvalho-doc <[email protected]>
* Apply suggestions from code review

  Co-authored-by: Rowena Jones <[email protected]>

---------

Co-authored-by: Benedikt Rollik <[email protected]>
Co-authored-by: ldecarvalho-doc <[email protected]>
Co-authored-by: Rowena Jones <[email protected]>

1 parent 117bb5d commit 91838f7

2 files changed

+6
-3
lines changed

pages/generative-apis/faq.mdx

Lines changed: 5 additions & 2 deletions
@@ -114,11 +114,14 @@ Yes, Scaleway's Generative APIs are designed to be compatible with OpenAI librar
 To get started, explore the [Generative APIs Playground](/generative-apis/quickstart/#start-with-the-generative-apis-playground) in the Scaleway console. For application integration, refer to our [Quickstart guide](/generative-apis/quickstart/), which provides step-by-step instructions on accessing, configuring, and using a Generative APIs endpoint.

 ## Are there any rate limits for API usage?
-Yes, API rate limits define the maximum number of requests a user can make within a specific time frame to ensure fair access and resource allocation between users. If you require increased rate limits (by a factor from 2 to 5 times), you can request them by [creating a ticket](https://console.scaleway.com/support/tickets/create). If you require even higher rate limits, especially to absorb infrequent peak loads, we recommend using [Managed Inference](https://console.scaleway.com/inference/deployments) instead with dedicated provisioned capacity.
+Yes, API rate limits define the maximum number of requests a user can make within a specific time frame to ensure fair access and resource allocation between users. If you require increased rate limits we recommend either:
+- Using [Managed Inference](https://console.scaleway.com/inference/deployments), which provides dedicated capacity and doesn't enforce rate limits (you remain limited by the total provisioned capacity)
+- Contacting your existing Scaleway account manager or our Sales team to discuss volume commitment for specific models that will allow us to increase your quota proportionally.
+
 Refer to our dedicated [documentation](/generative-apis/reference-content/rate-limits/) for more information on rate limits.

 ## What is the model lifecycle for Generative APIs?
-Scaleway is dedicated to updating and offering the latest versions of generative AI models, ensuring improvements in capabilities, accuracy, and safety. As new versions of models are introduced, you can explore them through the Scaleway console. Learn more in our dedicated [documentation](/generative-apis/reference-content/model-lifecycle/).
+Scaleway is dedicated to updating and offering the latest versions of generative AI models, while ensuring older models remain accessible for a significant time, and also ensuring the reliability of your production applications. Learn more in our [model lifecycle policy](/generative-apis/reference-content/model-lifecycle/).

 ## What are the SLAs applicable to Generative APIs?
 We are currently working on defining our SLAs for Generative APIs. We will provide more information on this topic soon.

pages/generative-apis/troubleshooting/fixing-common-issues.mdx

Lines changed: 1 addition & 1 deletion
@@ -90,9 +90,9 @@ Below are common issues that you may encounter when using Generative APIs, their
 ### Solution
 - Smooth out your API requests rate by limiting the number of API requests you perform over a given minute so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
 - [Add a payment method](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) and [validate your identity](/account/how-to/verify-identity/) to increase automatically your quotas [based on standard limits](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
-- [Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota.
 - Reduce the size of the input or output tokens processed by your API requests.
 - Use [Managed Inference](/managed-inference/), where these quotas do not apply (your throughput will be only limited by the amount of Inference Deployment your provision)
+- Contact your assigned Scaleway account manager or [our Sales team](https://www.scaleway.com/en/contact-sales/) to discuss volume commitments for specific models, which will enable us to increase your quota proportionally.

 ## 429: Too Many Requests - You exceeded your current threshold of concurrent requests

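The first solution in the troubleshooting list (smoothing out the request rate so that it stays below a per-minute quota) can be sketched on the client side as a sliding-window limiter. This is only an illustrative sketch and not part of the commit or of any Scaleway library: the class name `MinuteRateLimiter`, its parameters, and the limit used in the usage note are hypothetical, and the real quota values are the ones listed in the Organization quotas documentation.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Hypothetical client-side limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests still inside the window

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget requests that have left the sliding window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest recorded request expires.
            self._sleep(self.window - (now - self._sent[0]))
```

A caller would create one limiter per quota, for example `limiter = MinuteRateLimiter(max_requests=100)` (an arbitrary illustrative value), and call `limiter.acquire()` immediately before each API request. This only spaces out requests from a single process; it does not replace the server-side quota and does not account for token-based limits.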