docs/my-website/docs/proxy/dynamic_rate_limit.md
8 additions & 4 deletions
@@ -1,10 +1,12 @@

# Dynamic TPM/RPM Allocation

-Prevent projects from gobbling too much tpm/rpm. You should use this feature when you want to reserve tpm/rpm capacity for specific projects. For example, a realtime use case should get higher priority than a different use case.
+Prevent projects from gobbling too much tpm/rpm.

Dynamically allocate TPM/RPM quota to api keys, based on active keys in that minute. [**See Code**](https://github.com/BerriAI/litellm/blob/9bffa9a48e610cc6886fc2dce5c1815aeae2ad46/litellm/proxy/hooks/dynamic_rate_limiter.py#L125)
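
As a rough illustration of the allocation described above, the sketch below assumes an even split of a model's per-minute limit across the keys seen in the current minute. The names `allocate_quota`, `model_limit`, and `active_keys` are hypothetical and are not LiteLLM's API; the linked `dynamic_rate_limiter.py` remains the authoritative implementation and may differ in detail.

```python
def allocate_quota(model_limit: int, active_keys: list[str]) -> dict[str, int]:
    """Split a per-minute limit evenly across the keys active this minute.

    Illustrative assumption only: every active key receives the same share.
    """
    if model_limit < 0:
        raise ValueError("model_limit must be non-negative")
    if not active_keys:
        return {}
    # De-duplicate while keeping first-seen order.
    unique_keys = list(dict.fromkeys(active_keys))
    # Integer division: any remainder is left unallocated in this sketch.
    share = model_limit // len(unique_keys)
    return {key: share for key in unique_keys}


# One active key gets the whole limit; a second active key halves each share.
print(allocate_quota(100, ["key-a"]))
print(allocate_quota(100, ["key-a", "key-b"]))
```

Under this assumption, a key's share shrinks as more keys become active in the same minute, which is what keeps any single project from consuming the whole tpm/rpm budget.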

+## Quick Start Usage
+
1. Setup config.yaml

```yaml showLineNumbers title="config.yaml"
@@ -97,15 +99,17 @@ This was rate limited b/c - Error code: 429 - {'error': {'message': {'error': 'K
```


-#### ✨ [BETA] Set Priority / Reserve Quota
+## [BETA] Set Priority / Reserve Quota
+
+Reserve tpm/rpm capacity for projects in prod. You should use this feature when you want to reserve tpm/rpm capacity for specific projects. For example, a realtime use case should get higher priority than a different use case.

-Reserve tpm/rpm capacity for projects in prod.

:::tip
Reserving tpm/rpm on keys based on priority is a premium feature. Please [get an enterprise license](./enterprise.md) for it.