Commit ab92861

julien-c, Vaibhavs10, and Wauplin authored
Apply suggestions from code review
Co-authored-by: vb <[email protected]> Co-authored-by: Lucain <[email protected]>
1 parent 58c1e5e commit ab92861

1 file changed

+16
-13
lines changed

docs/hub/rate-limits.md

Lines changed: 16 additions & 13 deletions
@@ -1,16 +1,17 @@
 # Hub Rate limits
 
-To protect our platform's integrity and make sure we are able to scale our service to as many AI community members as possible, we enforce rate limits on all requests made to the HF Hub.
+To protect our platform's integrity and ensure availability to as many AI community members as possible, we enforce rate limits on all requests made to the Hugging Face Hub.
 
 We define different rate limits for distinct classes of requests. We distinguish three main buckets:
 
 - **Hub APIs**
 - e.g. model or dataset search, repo creation, user management, etc. Note: all those endpoints are documented in [Hub API Endpoints](./api).
 - **Resolvers**
 - They're all the URLs that contain a `/resolve/` segment in their path, which serve user-generated content from the Hub. Concretely, those are the URLs that are constructed by open source libraries (transformers, datasets, vLLM, llama.cpp, …) or AI applications (LM Studio, Jan, ollama, …) to download model/dataset files from HF.
-- Because they are very heavily used by the community, and because we optimize our infrastructure to serve them with high efficiency, rate limits for Resolvers are the highest ones.
+- Resolve requests are heavily used by the community, and since we optimize our infrastructure to serve them with maximum efficiency, the rate limits for Resolvers are the highest.
 - **Pages**
-- All the Web pages we host on huggingface.co. Usually Web browsing is browsing made by humans hence rate limits don't need to be as high as programmatic endpoints like the two former buckets.
+- All the Web pages we host on huggingface.co.
+- Usually Web browsing requests are made by humans, hence rate limits don't need to be as high as the above mentioned programmatic endpoints.
 
 > [!TIP]
 > All values are defined over 5-minute windows, which allows for some level of "burstiness" from an application or developer's point of view.
@@ -19,15 +20,15 @@ If you, your organization, or your application need higher rate limits, we encou
 
 ## Billing dashboard
 
-At any point in time, you can check your rate limit status on your (or your org’s) Billing page: https://huggingface.co/settings/billing
+At any point, you can check your rate limit status on your (or your org’s) Billing page: https://huggingface.co/settings/billing
 
 ![dashboard for rate limits](https://cdn-uploads.huggingface.co/production/uploads/5dd96eb166059660ed1ee413/0pzQQyuVG3c9tWjCqrX9Y.png)
 
-On the right side, you will see three gauges, one for each bucket of Rate limiting.
+On the right side, you will see three gauges, one for each bucket of Requests.
 
 Each bucket presents the number of current (last 5 minutes) requests, and the number of allowed requests based on your user account or organization plan.
 
-Whenever you are above the limit in the past 5 minutes (the view is updated in real-time), the bar will turn red.
+Whenever you exceed the limit in the past 5 minutes (the view is updated in real-time), the bar will turn red.
 
 Note: You can use the context switcher to easily switch between your user account and your orgs.
 
@@ -43,14 +44,14 @@ Precisely, we implement the following headers:
 
 | Header | Purpose / Meaning |
 | ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
-| **`RateLimit-Policy`** | Carries the rate limit policy itself (e.g. “100 requests per 5 minutes”). It’s informative; shows what policy the client is subject to. |
-| **`RateLimit-Limit`** | The total allowed rate limit for the current window. “How many requests (of this type) you’re allowed.” |
-| **`RateLimit-Remaining`** | How many requests of this type you have left in the current window. |
-| **`RateLimit-Reset`** | Number of seconds until the rate limit window resets (or until quota is refreshed). Uses a “delta-seconds” format to reduce clock sync issues. |
+| **`RateLimitPolicy`** | Carries the rate limit policy itself (e.g. “100 requests per 5 minutes”). It’s informative; shows what policy the client is subject to. |
+| **`RateLimitLimit`** | The total allowed rate limit for the current window. “How many requests (of this type) you’re allowed.” |
+| **`RateLimitRemaining`** | How many requests of this type you have left in the current window. |
+| **`RateLimitReset`** | Number of seconds until the rate limit window resets (or until quota is refreshed). Uses a “delta-seconds” format to reduce clock sync issues. |
 
 ## Rate limit Tiers
 
-Here are the current rate limiting values (in September '25) based on your plan:
+Here are the current rate limits (in September '25) based on your plan:
 
 | Plan | API | Resolvers | Pages |
 | ------------------------------------------------------------------------- | -------- | --------- | ------ |
@@ -65,13 +66,15 @@ Here are the current rate limiting values (in September '25) based on your plan:
 
 \* Anonymous and Free users are subject to change over time depending on platform health 🤞
 
+Note: For organizations, rate limits are applied individually to each member.
+
 ## What if I get rate-limited
 
-First, make sure you are always passing a `HF_TOKEN`, and it gets passed downstream to all libraries or applications you might be using that downloads _stuff_ from the Hub.
+First, make sure you always pass a `HF_TOKEN`, and it is passed downstream to all libraries or applications that downloads _stuff_ from the Hub.
 
 This is the number one reason users get rate limited and is a very easy fix.
 
-If you're sure you're passing a `HF_TOKEN`, you can:
+Despite passing `HF_TOKEN` if you are still rate limited, you can:
 
 - spread out your requests over longer periods of time
 - replace Hub API calls with Resolver calls, whenever possible (Resolver rate limits are much higher and much more optimized).
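
As an illustration of the advice to spread out requests, here is a minimal Python sketch of how a client might use the "remaining" and "reset" headers described in the table in this diff. It is not code from this commit or from any Hugging Face library. The helper names (`seconds_to_wait`, `min_interval`) are hypothetical, and because the diff shows the header names both with and without hyphens, the sketch accepts either spelling rather than assuming which one the Hub sends.

```python
from typing import Mapping, Optional, Sequence

# Assumption: both spellings seen in the diff are tried; the actual
# header names sent by the Hub are not confirmed by this commit.
REMAINING_NAMES = ("RateLimit-Remaining", "RateLimitRemaining")
RESET_NAMES = ("RateLimit-Reset", "RateLimitReset")


def _first_int(headers: Mapping[str, str], names: Sequence[str]) -> Optional[int]:
    """Return the first header in `names` that parses as an integer, else None."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in names:
        value = lowered.get(name.lower())
        if value is None:
            continue
        try:
            return int(str(value).strip())
        except ValueError:
            return None
    return None


def seconds_to_wait(headers: Mapping[str, str]) -> float:
    """Seconds to pause before the next request, based on response headers.

    Waits only when no requests remain in the current window; the reset
    value is treated as delta-seconds, as the table describes.
    """
    remaining = _first_int(headers, REMAINING_NAMES)
    reset = _first_int(headers, RESET_NAMES)
    if remaining is None or reset is None:
        return 0.0  # headers missing or unparsable: nothing to act on
    if remaining > 0:
        return 0.0
    return float(max(reset, 0))


def min_interval(limit: int, window_seconds: int = 300) -> float:
    """Even spacing (in seconds) that keeps `limit` requests inside one window."""
    return window_seconds / limit
```

For the table's example policy of 100 requests per 5 minutes, `min_interval(100)` gives 3.0 seconds between requests. A caller could sleep for `seconds_to_wait(response.headers)` after each response, which trades some burstiness for fewer rejected requests.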
