update HF-Inference to inference providers only #1809
Merged
Changes from 4 commits
11 commits
fe88e1d
update page to inference providers only
burtenshaw 94a0018
update ToC
burtenshaw 111aec5
Update docs/hub/models-inference.md
burtenshaw bef1e90
Update docs/hub/models-inference.md
burtenshaw 391920e
Update docs/hub/models-inference.md
burtenshaw ff74e98
add datastudio
burtenshaw 2e2401e
Merge branch 'hub-docs-on-inference-providers' of https://github.com/…
burtenshaw 0ce3f92
add openai and image generation
burtenshaw 0596118
drop intro paragraph
burtenshaw 5c7625f
fix tip notes with mdx
burtenshaw 798f456
use openai client
burtenshaw
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,30 +1,76 @@ | ||
| # Inference Providers | ||
|
|
||
| Please refer to the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers) for detailed information. | ||
| Hugging Face's model pages offer free inference for thousands of models, so you can try them out right in the browser. This in-browser inference is also powered by Inference Providers. | ||
|
|
||
| ## What is HF-Inference API? | ||
| Inference Providers give developers streamlined, unified access to hundreds of machine learning models, powered by the best serverless inference partners. 👉 **For complete documentation, visit the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers)**. | ||
|
|
||
| HF-Inference API is one of the many providers available on the Hugging Face Hub. | ||
| It is deployed by Hugging Face itself, using text-generation-inference for LLMs, for instance. This service used to be called “Inference API (serverless)” prior to Inference Providers. | ||
| ## Inference Providers on the Hub | ||
|
|
||
| For more details about the HF-Inference API, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference). | ||
| Inference Providers is deeply integrated with the Hugging Face Hub, and you can use it in a few different ways: | ||
|
|
||
| ## What technology do you use to power the HF-Inference API? | ||
| - **Interactive Widgets** - Test models directly on model pages with interactive widgets that use Inference Providers under the hood. Check out the [DeepSeek-R1-0528 model page](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) for an example. | ||
| - **Inference Playground** - Easily test and compare chat completion models with your prompts. Check out the [Inference Playground](https://huggingface.co/playground) to get started. | ||
| - **Search** - Filter models by inference provider on the [models page](https://huggingface.co/models?inference_provider=all) to find models available through specific providers. | ||
|
|
||
|
||
| The HF-Inference API is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood. | ||
| ## Build with Inference Providers | ||
|
|
||
| ## Why don't I see an inference widget, or why can't I use the API? | ||
| You can integrate Inference Providers into your own applications using our SDKs or HTTP clients. Here's a quick start with Python and JavaScript. For more details, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). | ||
|
|
||
| For some tasks, there might not be support by any Inference Provider, and hence, there is no widget. | ||
| <hfoptions id="inference-providers-quick-start"> | ||
|
|
||
| ## How can I see my usage? | ||
| <hfoption id="python"> | ||
|
|
||
| To check usage across all providers, check out your [billing page](https://huggingface.co/settings/billing). | ||
| ```python | ||
| from huggingface_hub import InferenceClient | ||
|
||
|
|
||
| To check your HF-Inference usage specifically, check out the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage. | ||
| import os | ||
|
|
||
| ## Is there programmatic access to Inference Providers? | ||
| client = InferenceClient( | ||
| api_key=os.environ["HF_TOKEN"], | ||
| provider="auto", # Automatically selects best provider | ||
| ) | ||
|
|
||
| Yes! We provide client wrappers in both JS and Python: | ||
| - [JS (`@huggingface/inference`)](https://huggingface.co/docs/huggingface.js/inference/classes/InferenceClient) | ||
| - [Python (`huggingface_hub`)](https://huggingface.co/docs/huggingface_hub/guides/inference) | ||
| # Chat completion | ||
| completion = client.chat.completions.create( | ||
| model="deepseek-ai/DeepSeek-V3-0324", | ||
| messages=[{"role": "user", "content": "Hello!"}] | ||
| ) | ||
| ``` | ||
|
|
||
| </hfoption> | ||
|
|
||
| <hfoption id="javascript"> | ||
|
|
||
| ```javascript | ||
|
||
| import { InferenceClient } from "@huggingface/inference"; | ||
|
|
||
| const client = new InferenceClient(process.env.HF_TOKEN); | ||
|
|
||
| const chatCompletion = await client.chatCompletion({ | ||
| provider: "auto", // Automatically selects best provider | ||
| model: "deepseek-ai/DeepSeek-V3-0324", | ||
| messages: [{ role: "user", content: "Hello!" }] | ||
| }); | ||
| ``` | ||
|
|
||
| </hfoption> | ||
|
|
||
| </hfoptions> | ||
|
|
||
|
||
| You'll need a Hugging Face token with inference permissions. Create one at [Settings > Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | ||
|
|
||
| ### How Inference Providers works | ||
|
|
||
| Hugging Face’s Inference Providers give developers unified access to hundreds of machine learning models, powered by our serverless inference partners. This new approach builds on our previous Serverless Inference API, offering more models, improved performance, and greater reliability thanks to world-class providers. | ||
|
||
|
|
||
| To dive deeper into Inference Providers, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). Here are some key resources: | ||
|
|
||
| - **[Quick Start](https://huggingface.co/docs/inference-providers)** | ||
| - **[Pricing & Billing Guide](https://huggingface.co/docs/inference-providers/pricing)** | ||
| - **[Hub Integration Details](https://huggingface.co/docs/inference-providers/hub-integration)** | ||
|
|
||
| ### What was the HF-Inference API? | ||
|
|
||
| HF-Inference API is one of the providers available through Inference Providers. It was previously called "Inference API (serverless)" and is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood. | ||
|
|
||
| For more details about the HF-Inference provider specifically, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference). | ||
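
The new page says Inference Providers can be used through "our SDKs or HTTP clients", but the quick start in this diff only shows the SDK path. As a rough illustration of the HTTP path, here is a minimal sketch using only the Python standard library. It assumes the OpenAI-compatible route `https://router.huggingface.co/v1/chat/completions` and an OpenAI-style response shape; neither appears in this diff, so check both against the Inference Providers documentation. The helper names (`build_chat_request`, `extract_reply`, `send_chat_request`) are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: OpenAI-compatible chat completions route of the Inference
# Providers router. This URL is not part of the diff above.
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"


def build_chat_request(token, model, user_message):
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The token needs inference permissions, as noted in the page.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text, assuming an OpenAI-style response dict."""
    return response_body["choices"][0]["message"]["content"]


def send_chat_request(request):
    """Send the request over the network and return the assistant text."""
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

With `HF_TOKEN` set in the environment, calling `send_chat_request(build_chat_request(os.environ["HF_TOKEN"], "deepseek-ai/DeepSeek-V3-0324", "Hello!"))` would perform the same chat completion as the SDK examples, under the assumptions stated above. Building the request separately from sending it keeps the payload and headers easy to inspect without network access.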