Commit f84134c
fpagny authored and bene2k1 committed
Update rate-limits.mdx (#4516)
1 parent 1749b46 commit f84134c

1 file changed: +12 -12 lines changed
pages/generative-apis/reference-content/rate-limits.mdx

Lines changed: 12 additions & 12 deletions
@@ -21,28 +21,28 @@ Any model served through Scaleway Generative APIs gets limited by:
 These limits only apply if you created a Scaleway Account and registered a valid payment method. Otherwise, stricter limits apply to ensure usage stays within Free Tier only.
 </Message>
 
+## How can I increase the rate limits?
+
+We actively monitor usage and will improve rates based on feedback.
+If you need to increase your rate limits, [contact our support team](https://console.scaleway.com/support/create), providing details on the model used and specific use case.
+Note that for increases of up to x5 or x10 volumes, we highly recommend using dedicated deployments with [Managed Inference](https://console.scaleway.com/inference/deployments), which provides exactly the same features and API compatibility.
+
 ### Chat models
 
 | Model string | Requests per minute | Total tokens per minute |
 |-----------------|-----------------|-----------------|
-| `llama-3.1-8b-instruct` | 300 | 100K |
-| `llama-3.1-70b-instruct` | 300 | 100K |
-| `mistral-nemo-instruct-2407`| 300 | 100K |
-| `pixtral-12b-2409`| 300 | 100K |
-| `qwen2.5-32b-instruct`| 300 | 100K |
+| `llama-3.1-8b-instruct` | 300 | 200K |
+| `llama-3.1-70b-instruct` | 300 | 200K |
+| `mistral-nemo-instruct-2407`| 300 | 200K |
+| `pixtral-12b-2409`| 300 | 200K |
+| `qwen2.5-32b-instruct`| 300 | 200K |
 
 ### Embedding models
 
 | Model string | Requests per minute | Input tokens per minute |
 |-----------------|-----------------|-----------------|
-| `sentence-t5-xxl` | 100 | 200K |
-| `bge-multilingual-gemma2` | 100 | 200K |
+| `bge-multilingual-gemma2` | 300 | 400K |
 
 ## Why do we set rate limits?
 
 These limits safeguard against abuse or misuse of Scaleway Generative APIs, helping to ensure fair access to the API with consistent performance.
-
-## How can I increase the rate limits?
-
-We actively monitor usage and will improve rates based on feedback.
-If you need to increase your rate limits, contact us via the support team, providing details on the model used and specific use case.
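
For readers who want to stay under the per-minute limits in the updated tables, a client-side guard can be sketched as below. This is only an illustration, not part of the commit or of any Scaleway SDK: the class name `MinuteWindowLimiter`, its methods, and the sliding 60-second window are assumptions (the page does not say how the service measures a minute). The two constants are the chat-model values from the new table.

```python
from collections import deque

# Values taken from the updated chat-model table; adjust per model.
REQUESTS_PER_MINUTE = 300
TOKENS_PER_MINUTE = 200_000


class MinuteWindowLimiter:
    """Hypothetical client-side guard tracking requests and tokens over a sliding window."""

    def __init__(self, max_requests, max_tokens, window=60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window = window          # assumed window length in seconds
        self.events = deque()         # (timestamp, tokens) per accepted request
        self.tokens_in_window = 0

    def _expire(self, now):
        # Drop events that have left the sliding window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def wait_time(self, tokens, now):
        """Return the seconds to wait before a request using `tokens` fits both budgets."""
        self._expire(now)
        if tokens > self.max_tokens:
            raise ValueError("request exceeds the per-minute token budget")
        wait = 0.0
        requests = len(self.events)
        used = self.tokens_in_window
        for ts, spent in self.events:
            if requests < self.max_requests and used + tokens <= self.max_tokens:
                break
            # Waiting for this event to expire frees one request slot and its tokens.
            wait = ts + self.window - now
            requests -= 1
            used -= spent
        return max(wait, 0.0)

    def record(self, tokens, now):
        """Register an accepted request made at time `now`."""
        self._expire(now)
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
```

In use, a caller would create `MinuteWindowLimiter(REQUESTS_PER_MINUTE, TOKENS_PER_MINUTE)`, pass `time.monotonic()` as `now`, sleep for the value returned by `wait_time`, and then call `record` once the request is sent. Taking the clock as a parameter keeps the sketch deterministic and easy to test.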
