Skip to content

Commit 36cda4f

Browse files
authored
Update generative-apis.mdx
1 parent 59ced01 commit 36cda4f

File tree

1 file changed

+18
-6
lines changed

1 file changed

+18
-6
lines changed

faq/generative-apis.mdx

Lines changed: 18 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -15,11 +15,19 @@ Scaleway's Generative APIs provide access to pre-configured, serverless endpoint
1515

1616
## Which models are supported by Generative APIs?
1717
Our Generative APIs support a range of popular models, including:
18-
- Instruct models: Refer to our dedicated [documentation](/generative-apis/reference-content/supported-models/#chat-models) for a list of supported chat models.
18+
- Chat / Text Generation models: Refer to our dedicated [documentation](/generative-apis/reference-content/supported-models/#chat-models) for a list of supported chat models.
19+
- Vision models: Refer to our dedicated [documentation](/generative-apis/reference-content/supported-models/#vision-models) for a list of supported vision models.
1920
- Embedding models: Refer to our dedicated [documentation](/generative-apis/reference-content/supported-models/#embedding-models) for a list of supported embedding models.
2021

2122
## How does the free tier work?
22-
The free tier allows you to process up to 1,000,000 tokens without incurring any costs. After reaching this limit, you will be charged per million tokens processed. For more information, refer to our [pricing page](https://www.scaleway.com/en/pricing/model-as-a-service/#generative-apis).
23+
The free tier allows you to process up to 1,000,000 tokens without incurring any costs. After reaching this limit, you will be charged per million tokens processed. Free tier usage is calculated by adding all input and output tokens consumed from all models used.
24+
For more information, refer to our [pricing page](https://www.scaleway.com/en/pricing/model-as-a-service/#generative-apis).
25+
26+
## How can I monitor my token consumption?
27+
You can see your token consumption in Scaleway Cockpit. You can access it from Scaleway Console under the [Metrics tab](https://console.scaleway.com/generative-api/metrics).
28+
Note that:
29+
- Cockpits are isolated by Projects, hence you first need to select the right project in Scaleway Console before accessing Cockpit to see your token consumption for this project (you can see the project_id in Cockpit URL: `https://{project_id}.dashboard.obs.fr-par.scw.cloud/`.
30+
- Cockpit graphs can take up to 1 hour to update token consumption, see [Troubleshooting](https://www.scaleway.com/en/docs/generative-apis/troubleshooting/fixing-common-issues/#tokens-consumption-is-not-displayed-in-cockpit-metrics) for further details.
2331

2432
## How can I access and use the Generative APIs?
2533
Access is open to all Scaleway customers. You can start by using the Generative APIs Playground in the Scaleway console to experiment with different models. For integration into applications, you can use the OpenAI-compatible APIs provided by Scaleway. Detailed instructions are available in our [Quickstart guide](/generative-apis/quickstart/).
@@ -34,14 +42,15 @@ You can find the privacy policy applicable to all use of Generative APIs [here](
3442
Yes, Scaleway's Generative APIs are designed to be compatible with OpenAI libraries and SDKs, including the OpenAI Python client library and LangChain SDKs. This allows for seamless integration with existing workflows.
3543

3644
## What is the difference between Generative APIs and Managed Inference?
37-
- **Generative APIs**: A serverless service providing access to pre-configured AI models via API, billed per token usage.
38-
- **Managed Inference**: Allows deployment of curated or custom models with chosen quantization and instances, offering predictable throughput and enhanced security features like private network isolation and access control.
45+
- **Generative APIs**: A serverless service providing access to pre-configured AI models via API, billed per token usage.
46+
- **Managed Inference**: Allows deployment of curated or custom models with chosen quantization and instances, offering predictable throughput and enhanced security features like private network isolation and access control. Managed Inference is billed by hourly usage, whether provisioned capacity is receiving traffic or not.
3947

4048
## How do I get started with Generative APIs?
4149
To get started, explore the [Generative APIs Playground](/generative-apis/quickstart/#start-with-the-generative-apis-playground) in the Scaleway console. For application integration, refer to our [Quickstart guide](/generative-apis/quickstart/), which provides step-by-step instructions on accessing, configuring, and using a Generative APIs endpoint.
4250

4351
## Are there any rate limits for API usage?
44-
Yes, API rate limits define the maximum number of requests a user can make within a specific time frame to ensure fair access and resource allocation. Refer to our dedicated [documentation](/generative-apis/reference-content/rate-limits/) for more information on rate limits.
52+
Yes, API rate limits define the maximum number of requests a user can make within a specific time frame to ensure fair access and resource allocation between users. If you require increased rate limits (by a factor from 2 to 5 times), you can request them by [creating a ticket](https://console.scaleway.com/support/tickets/create). If you require even higher rate limits, especially to absorb infrequent peak loads, we recommend to use [Managed Inference](https://console.scaleway.com/inference/deployments) instead with dedicated provisioned capacity.
53+
Refer to our dedicated [documentation](/generative-apis/reference-content/rate-limits/) for more information on rate limits.
4554

4655
## What is the model lifecycle for Generative APIs?
4756
Scaleway is dedicated to updating and offering the latest versions of generative AI models, ensuring improvements in capabilities, accuracy, and safety. As new versions of models are introduced, you can explore them through the Scaleway console. Learn more in our dedicated [documentation](/generative-apis/reference-content/model-lifecycle/).
@@ -50,4 +59,7 @@ Scaleway is dedicated to updating and offering the latest versions of generative
5059
We are currently working on defining our SLAs for Generative APIs. We will provide more information on this topic soon.
5160

5261
## What are the Performance guarantees (vs Managed Inference)?
53-
We are currently working on defining our performance guarantees for Generative APIs. We will provide more information on this topic soon.
62+
We are currently working on defining our performance guarantees for Generative APIs. We will provide more information on this topic soon.
63+
64+
## Do model license apply when using Generative APIs?
65+
Yes, you need to comply with model licenses when using Generative APIs. Applicable licenses are available for [each model in our documentation](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/#vision-models) and in Console Playground.

0 commit comments

Comments
 (0)