Document rate limits #1944
Merged
Changes from 9 commits
14 commits
- ef5068a unrelated (julien-c)
- 0dfcc56 "now" is old by now (julien-c)
- ab5f938 First, a few links (julien-c)
- 5d94182 more links + reorder (julien-c)
- 66f0984 lesssgo (julien-c)
- 6ad429d Tweaks (julien-c)
- 58c1e5e Ah i forgot: add a line about E+ with defined IP Range (julien-c)
- ab92861 Apply suggestions from code review (julien-c)
- f873987 Update docs/hub/rate-limits.md (julien-c)
- 83abbcd add this info block (julien-c)
- 733af34 explicitly link to the Resolver endpoint (julien-c)
- cdb96e1 Update docs/hub/rate-limits.md (julien-c)
- 732cede Update rate-limits.md (julien-c)
- fac03f2 Update headers spec (julien-c)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| # Hub Rate limits | ||
|
|
||
| To protect our platform's integrity and ensure availability to as many AI community members as possible, we enforce rate limits on all requests made to the Hugging Face Hub. | ||
|
|
||
| We define different rate limits for distinct classes of requests. We distinguish three main buckets: | ||
|
|
||
| - **Hub APIs** | ||
| - e.g. model or dataset search, repo creation, user management, etc. All endpoints that belong to this bucket are documented in [Hub API Endpoints](./api). | ||
| - **Resolvers** | ||
| - They're all the URLs that contain a `/resolve/` segment in their path, which serve user-generated content from the Hub. Concretely, those are the URLs that are constructed by open source libraries (transformers, datasets, vLLM, llama.cpp, …) or AI applications (LM Studio, Jan, ollama, …) to download model/dataset files from HF. | ||
| - Resolve requests are heavily used by the community, and since we optimize our infrastructure to serve them with maximum efficiency, the rate limits for Resolvers are the highest. | ||
| - **Pages** | ||
| - All the Web pages we host on huggingface.co. | ||
| - Web browsing requests are usually made by humans, so the rate limits don't need to be as high as for the above-mentioned programmatic endpoints. | ||
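As a rough illustration of the three buckets described above, here is a minimal Python sketch that guesses which bucket a huggingface.co URL would count against. The `/resolve/` rule comes from the text; treating every path under `/api/` as a Hub API call is an assumption, and the function name `classify_bucket` is hypothetical.

```python
from urllib.parse import urlparse


def classify_bucket(url: str) -> str:
    """Guess which rate-limit bucket a huggingface.co URL falls into."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Resolvers: any URL whose path contains a `/resolve/` segment.
    if "resolve" in segments:
        return "resolvers"
    # Assumption: Hub API endpoints live under the /api/ path prefix.
    if segments and segments[0] == "api":
        return "api"
    # Everything else is treated as a regular Web page.
    return "pages"
```

This is only a client-side heuristic for reasoning about your own traffic; the server's classification is what actually counts.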
|
|
||
| > [!TIP] | ||
| > All values are defined over 5-minute windows, which allows for some level of "burstiness" from an application or developer's point of view. | ||
|
||
| If you, your organization, or your application need higher rate limits, we encourage you to upgrade your account to PRO, Team, or Enterprise. We prioritize support requests from PRO, Team, and Enterprise customers – see built-in limits in [Rate limit Tiers](#rate-limit-tiers). | ||
|
|
||
| ## Billing dashboard | ||
|
|
||
| At any point, you can check your rate limit status on your (or your org’s) Billing page: https://huggingface.co/settings/billing | ||
|
|
||
|  | ||
|
|
||
| On the right side, you will see three gauges, one for each bucket of Requests. | ||
|
|
||
| Each bucket presents the number of current (last 5 minutes) requests, and the number of allowed requests based on your user account or organization plan. | ||
|
|
||
| Whenever you have exceeded the limit within the past 5 minutes (the view is updated in real time), the bar turns red. | ||
|
|
||
| Note: You can use the context switcher to easily switch between your user account and your orgs. | ||
|
|
||
| ## HTTP Headers | ||
|
|
||
| Whenever you or your organization hits a rate limit, you will receive a **429** `Too Many Requests` HTTP error. | ||
|
|
||
| We implement the mechanism described in the [IETF draft](https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/) titled “RateLimit HTTP header fields for HTTP” (also known as `draft-ietf-httpapi-ratelimit-headers`). | ||
|
|
||
| The goal is to define standardized HTTP headers that servers can use to advertise quota / rate-limit policies and communicate current usage / limits to clients so that they can avoid being throttled. | ||
|
|
||
| Specifically, we implement the following headers: | ||
|
|
||
| | Header | Purpose / Meaning | | ||
| | ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | **`RateLimit-Policy`**    | Carries the rate limit policy itself (e.g. “100 requests per 5 minutes”). It’s informative; shows what policy the client is subject to.        | | ||
| | **`RateLimit-Limit`**     | The total allowed rate limit for the current window. “How many requests (of this type) you’re allowed.”                                        | | ||
| | **`RateLimit-Remaining`** | How many requests of this type you have left in the current window.                                                                            | | ||
| | **`RateLimit-Reset`**     | Number of seconds until the rate limit window resets (or until quota is refreshed). Uses a “delta-seconds” format to reduce clock sync issues. | | ||
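A client can use these headers to decide how long to pause before retrying. The sketch below is a minimal illustration, assuming the `RateLimit-Limit`, `RateLimit-Remaining` and `RateLimit-Reset` values arrive as bare integers (the exact syntax may differ between draft versions). The names `RateLimitStatus`, `read_rate_limit` and `wait_before_retry`, and the 60-second fallback, are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: Optional[int]
    remaining: Optional[int]
    reset_seconds: Optional[int]


def _to_int(value: Optional[str]) -> Optional[int]:
    # Assumption: the value is a bare integer; anything else is ignored.
    try:
        return int(value.strip()) if value is not None else None
    except ValueError:
        return None


def read_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=_to_int(lowered.get("ratelimit-limit")),
        remaining=_to_int(lowered.get("ratelimit-remaining")),
        reset_seconds=_to_int(lowered.get("ratelimit-reset")),
    )


def wait_before_retry(status_code: int, headers: Mapping[str, str], default: int = 60) -> int:
    """Seconds to sleep before the next request; 0 if no wait is needed."""
    info = read_rate_limit(headers)
    throttled = status_code == 429 or info.remaining == 0
    if not throttled:
        return 0
    # Fall back to an arbitrary default when no reset value is advertised.
    return info.reset_seconds if info.reset_seconds is not None else default
```

Pausing as soon as `RateLimit-Remaining` reaches zero, rather than waiting for a 429, lets a client avoid the error altogether.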
|
|
||
| ## Rate limit Tiers | ||
|
|
||
| Here are the current rate limits (as of September 2025) based on your plan, expressed as requests per 5-minute window: | ||
|
|
||
| | Plan | API | Resolvers | Pages | | ||
| | ------------------------------------------------------------------------- | -------- | --------- | ------ | | ||
| | Anonymous user (per IP address) | 500 \* | 3,000 \* | 100 \* | | ||
| | Free user | 1,000 \* | 5,000 \* | 200 \* | | ||
| | PRO user | 2,500 | 12,000 | 400 | | ||
| | Team organization | 3,000 | 15,000 | 400 | | ||
| | Enterprise organization | 6,000 | 30,000 | 600 | | ||
| | Enterprise Plus organization | 10,000 | 50,000 | 1,000 | | ||
| | Enterprise Plus organization <br> When Organization IP Ranges are defined | 100,000 | 500,000 | 10,000 | | ||
| | Academia Hub organization | 2,500 | 12,000 | 400 | | ||
|
|
||
| \* Limits for Anonymous and Free users are subject to change over time depending on platform health 🤞 | ||
|
|
||
| Note: For organizations, rate limits are applied individually to each member. | ||
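To make the table easier to reason about, the sketch below converts a per-window limit into the average spacing between requests that stays within it. The numbers are copied from the table above and may change; the dictionary keys and the `min_interval` helper are hypothetical names chosen for illustration.

```python
WINDOW_SECONDS = 5 * 60  # all limits are defined over 5-minute windows

# Requests per 5-minute window, copied from the table above.
LIMITS = {
    "anonymous": {"api": 500, "resolvers": 3_000, "pages": 100},
    "free": {"api": 1_000, "resolvers": 5_000, "pages": 200},
    "pro": {"api": 2_500, "resolvers": 12_000, "pages": 400},
    "team": {"api": 3_000, "resolvers": 15_000, "pages": 400},
    "enterprise": {"api": 6_000, "resolvers": 30_000, "pages": 600},
    "enterprise_plus": {"api": 10_000, "resolvers": 50_000, "pages": 1_000},
    "enterprise_plus_ip_ranges": {"api": 100_000, "resolvers": 500_000, "pages": 10_000},
    "academia_hub": {"api": 2_500, "resolvers": 12_000, "pages": 400},
}


def min_interval(plan: str, bucket: str) -> float:
    """Average seconds between requests that keeps you within the limit."""
    return WINDOW_SECONDS / LIMITS[plan][bucket]
```

For example, a Free user making Hub API calls can average one request every 0.3 seconds (1,000 requests over 300 seconds). Because the window is 5 minutes long, short bursts above that average are still fine as long as the window total stays under the limit.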
|
|
||
| ## What if I get rate-limited | ||
|
|
||
| First, make sure you always pass a `HF_TOKEN`, and that it is passed downstream to all libraries or applications that download _stuff_ from the Hub. | ||
|
|
||
| This is the number one reason users get rate limited and is a very easy fix. | ||
|
|
||
| If you are still rate limited despite passing `HF_TOKEN`, you can: | ||
|
|
||
| - spread out your requests over longer periods of time. | ||
| - replace Hub API calls with Resolver calls whenever possible (Resolver rate limits are much higher, and Resolver endpoints are much more optimized). | ||
| - upgrade to PRO, Team, or Enterprise. | ||
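The first two pieces of advice (always authenticate, and spread requests out) can be combined in a small download helper. This is a minimal sketch using only the Python standard library; it assumes the token is read from the `HF_TOKEN` environment variable and sent as a Bearer token, and that `RateLimit-Reset` carries a bare number of seconds. The helper names and the retry parameters are hypothetical.

```python
import os
import time
import urllib.error
import urllib.request


def auth_headers() -> dict:
    """Build an Authorization header from HF_TOKEN, if it is set."""
    token = os.environ.get("HF_TOKEN")
    return {"Authorization": f"Bearer {token}"} if token else {}


def fetch(url: str, max_attempts: int = 3, fallback_wait: int = 60) -> bytes:
    """Download a URL, sleeping and retrying when the Hub answers 429."""
    for attempt in range(max_attempts):
        request = urllib.request.Request(url, headers=auth_headers())
        try:
            with urllib.request.urlopen(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, or the final failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Assumption: RateLimit-Reset is a plain number of seconds.
            reset = err.headers.get("RateLimit-Reset", "")
            time.sleep(int(reset) if reset.isdigit() else fallback_wait)
    raise ValueError("max_attempts must be at least 1")
```

Libraries such as `huggingface_hub` already handle authentication for you when `HF_TOKEN` is set, so a helper like this is mainly relevant for custom HTTP clients.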
|
|
||
| ## Granular user action Rate limits | ||
|
|
||
| In addition to those main classes of rate limits, we enforce limits on certain specific kinds of user actions, like: | ||
|
|
||
| - repo creation | ||
| - repo commits | ||
| - discussions and comments | ||
| - moderation actions | ||
| - etc. | ||
|
|
||
| We don't currently document the rate limits for those specific actions, given they tend to change over time more often. If you get quota errors, we encourage you to upgrade your account to PRO, Team, or Enterprise. | ||
| Feel free to get in touch with us via the support team. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -242,7 +242,7 @@ This can be helpful if accessing the HTTP headers of the request is complicated | |
|
|
||
| Each Webhook is limited to 1,000 triggers per 24 hours. You can view your usage in the Webhook settings page in the "Activity" tab. | ||
|
|
||
| If you need to increase the number of triggers for your Webhook, contact us at [email protected]. | ||
| If you need to increase the number of triggers for your Webhook, upgrade to PRO, Team or Enterprise and contact us at [email protected]. | ||
|
|
||
| ## Developing your Webhooks | ||
|
|
||
|
|
||