From 1dd48c91ebb0da30322cb8c5e9b8aff130c2792b Mon Sep 17 00:00:00 2001
From: fpagny
Date: Thu, 5 Jun 2025 11:52:06 +0200
Subject: [PATCH 1/2] feat(genapi): Update faq about maximum output tokens

---
 pages/generative-apis/faq.mdx | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/pages/generative-apis/faq.mdx b/pages/generative-apis/faq.mdx
index bcaf657655..b702c9624d 100644
--- a/pages/generative-apis/faq.mdx
+++ b/pages/generative-apis/faq.mdx
@@ -117,6 +117,13 @@ To get started, explore the [Generative APIs Playground](/generative-apis/quicks
 Yes, API rate limits define the maximum number of requests a user can make within a specific time frame to ensure fair access and resource allocation between users. If you require increased rate limits (by a factor of 2 to 5), you can request them by [creating a ticket](https://console.scaleway.com/support/tickets/create). If you require even higher rate limits, especially to absorb infrequent peak loads, we recommend using [Managed Inference](https://console.scaleway.com/inference/deployments) instead, with dedicated provisioned capacity. Refer to our dedicated [documentation](/generative-apis/reference-content/rate-limits/) for more information on rate limits.
 
+## Can I increase maximum output (completion) tokens for a model?
+No, you cannot increase maximum output tokens above the [limits for each model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) in Generative APIs.
+These limits are in place to protect you against:
+- Long generations, which may be cut off by an HTTP timeout. Limits are designed to ensure a model sends its HTTP response in less than 5 minutes.
+- Uncontrolled billing, as several models are known to enter infinite generation loops (specific prompts can make the model repeat the same sentence indefinitely).
+If you require higher maximum output tokens, you can use [Managed Inference](https://console.scaleway.com/inference/deployments) where these limits to not apply (as your bill will be limited by the size of your deployment).
+
 ## What is the model lifecycle for Generative APIs?
 Scaleway is dedicated to updating and offering the latest versions of generative AI models, ensuring improvements in capabilities, accuracy, and safety. As new versions of models are introduced, you can explore them through the Scaleway console. Learn more in our dedicated [documentation](/generative-apis/reference-content/model-lifecycle/).

From 53e8025d35cd605e754a359308f8de8f07fc9d36 Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Thu, 5 Jun 2025 15:30:41 +0200
Subject: [PATCH 2/2] Update pages/generative-apis/faq.mdx

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 pages/generative-apis/faq.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/generative-apis/faq.mdx b/pages/generative-apis/faq.mdx
index b702c9624d..2228ebbdeb 100644
--- a/pages/generative-apis/faq.mdx
+++ b/pages/generative-apis/faq.mdx
@@ -122,7 +122,7 @@ No, you cannot increase maximum output tokens above the [limits for each model](htt
 These limits are in place to protect you against:
 - Long generations, which may be cut off by an HTTP timeout. Limits are designed to ensure a model sends its HTTP response in less than 5 minutes.
 - Uncontrolled billing, as several models are known to enter infinite generation loops (specific prompts can make the model repeat the same sentence indefinitely).
-If you require higher maximum output tokens, you can use [Managed Inference](https://console.scaleway.com/inference/deployments) where these limits to not apply (as your bill will be limited by the size of your deployment).
+If you require higher maximum output tokens, you can use [Managed Inference](https://console.scaleway.com/inference/deployments) where these limits do not apply (as your bill will be limited by the size of your deployment).
 
 ## What is the model lifecycle for Generative APIs?
 Scaleway is dedicated to updating and offering the latest versions of generative AI models, ensuring improvements in capabilities, accuracy, and safety. As new versions of models are introduced, you can explore them through the Scaleway console. Learn more in our dedicated [documentation](/generative-apis/reference-content/model-lifecycle/).
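The output-token caps discussed in the patched FAQ apply to the `max_tokens` parameter of a chat-completion request. As a minimal sketch of how a client might respect them (the model name and the 4096-token limit below are illustrative placeholders, not official Scaleway figures), the requested output size can be clamped to the documented cap before the payload is sent:

```python
# Minimal sketch: clamp a requested completion size to a model's documented
# maximum output tokens before building a chat-completion payload.
# NOTE: the model name and the 4096-token limit are illustrative placeholders;
# check the supported-models documentation for each model's real cap.

MODEL_MAX_OUTPUT_TOKENS = {
    "example-model": 4096,  # hypothetical cap, for illustration only
}

def clamp_max_tokens(model: str, requested: int) -> int:
    """Return a max_tokens value that respects the model's documented cap."""
    limit = MODEL_MAX_OUTPUT_TOKENS.get(model)
    return requested if limit is None else min(requested, limit)

# Example payload for an OpenAI-compatible chat-completions endpoint.
payload = {
    "model": "example-model",
    "messages": [{"role": "user", "content": "Summarize this document."}],
    "max_tokens": clamp_max_tokens("example-model", 10_000),  # clamped to 4096
}
```

Clamping client-side keeps the behavior explicit rather than relying on how the API handles an over-limit `max_tokens` value.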