update HF-Inference to inference providers only #1809
Merged
11 commits
- fe88e1d: update page to inference providers only (burtenshaw)
- 94a0018: update ToC (burtenshaw)
- 111aec5: Update docs/hub/models-inference.md (burtenshaw)
- bef1e90: Update docs/hub/models-inference.md (burtenshaw)
- 391920e: Update docs/hub/models-inference.md (burtenshaw)
- ff74e98: add datastudio (burtenshaw)
- 2e2401e: Merge branch 'hub-docs-on-inference-providers' of https://github.com/… (burtenshaw)
- 0ce3f92: add openai and image generation (burtenshaw)
- 0596118: drop intro paragraph (burtenshaw)
- 5c7625f: fix tip notes with mdx (burtenshaw)
- 798f456: use openai client (burtenshaw)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,30 +1,142 @@ | ||
| # Inference Providers | ||
|
|
||
| Please refer to the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers) for detailed information. | ||
| Hugging Face's model pages offer pay-as-you-go inference for thousands of models, so you can try them all out right in the browser. The service is powered by Inference Providers and includes a free tier. | ||
|
|
||
| ## What is HF-Inference API? | ||
| Inference Providers give developers streamlined, unified access to hundreds of machine learning models, powered by the best serverless inference partners. 👉 **For complete documentation, visit the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers)**. | ||
|
|
||
| HF-Inference API is one of the many providers available on the Hugging Face Hub. | ||
| It is deployed by Hugging Face ourselves, using text-generation-inference for LLMs for instance. This service used to be called “Inference API (serverless)” prior to Inference Providers. | ||
| ## Inference Providers on the Hub | ||
|
|
||
| For more details about the HF-Inference API, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference). | ||
| Inference Providers is deeply integrated with the Hugging Face Hub, and you can use it in a few different ways: | ||
|
|
||
| ## What technology do you use to power the HF-Inference API? | ||
| - **Interactive Widgets** - Test models directly on model pages with interactive widgets that use Inference Providers under the hood. Check out the [DeepSeek-R1-0528 model page](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) for an example. | ||
| - **Inference Playground** - Easily test and compare chat completion models with your prompts. Check out the [Inference Playground](https://huggingface.co/playground) to get started. | ||
| - **Search** - Filter models by inference provider on the [models page](https://huggingface.co/models?inference_provider=all) to find models available through specific providers. | ||
| - **Data Studio** - Use AI to explore datasets on the Hub. Check out [Data Studio](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/viewer?views%5B%5D=train) on your favorite dataset. | ||
|
|
||
| The HF-Inference API is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood. | ||
| ## Build with Inference Providers | ||
|
|
||
| ## Why don't I see an inference widget, or why can't I use the API? | ||
| You can integrate Inference Providers into your own applications using our SDKs or HTTP clients. Here's a quick start with Python and JavaScript; for more details, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). | ||
|
|
||
| For some tasks, there might not be support by any Inference Provider, and hence, there is no widget. | ||
| <hfoptions id="inference-providers-quick-start"> | ||
|
|
||
| ## How can I see my usage? | ||
| <hfoption id="python"> | ||
|
|
||
| To check usage across all providers, check out your [billing page](https://huggingface.co/settings/billing). | ||
| You can use our Python SDK to interact with Inference Providers. | ||
|
|
||
| To check your HF-Inference usage specifically, check out the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage. | ||
| ```python | ||
| from huggingface_hub import InferenceClient | ||
|
||
|
|
||
| ## Is there programmatic access to Inference Providers? | ||
| import os | ||
|
|
||
| Yes! We provide client wrappers in both JS and Python: | ||
| - [JS (`@huggingface/inference`)](https://huggingface.co/docs/huggingface.js/inference/classes/InferenceClient) | ||
| - [Python (`huggingface_hub`)](https://huggingface.co/docs/huggingface_hub/guides/inference) | ||
| client = InferenceClient( | ||
| api_key=os.environ["HF_TOKEN"], | ||
| provider="auto", # Automatically selects best provider | ||
| ) | ||
|
|
||
| # Chat completion | ||
| completion = client.chat.completions.create( | ||
| model="deepseek-ai/DeepSeek-V3-0324", | ||
| messages=[{"role": "user", "content": "A story about hiking in the mountains"}] | ||
| ) | ||
|
|
||
| # Image generation | ||
| image = client.text_to_image( | ||
| prompt="A serene lake surrounded by mountains at sunset, photorealistic style", | ||
| model="black-forest-labs/FLUX.1-dev" | ||
| ) | ||
|
|
||
| ``` | ||
|
|
||
| Alternatively, you can use the OpenAI-compatible client. | ||
|
|
||
| ```python | ||
|
||
| import os | ||
| from openai import OpenAI | ||
|
|
||
| client = OpenAI( | ||
| base_url="https://router.huggingface.co/v1", | ||
| api_key=os.environ["HF_TOKEN"], | ||
| ) | ||
|
|
||
| completion = client.chat.completions.create( | ||
| model="deepseek-ai/DeepSeek-V3-0324", | ||
| messages=[ | ||
| { | ||
| "role": "user", | ||
| "content": "A story about hiking in the mountains" | ||
| } | ||
| ], | ||
| ) | ||
| ``` | ||
|
|
||
| <Tip warning={true}> | ||
|
|
||
| The OpenAI-compatible client does not support image generation. | ||
|
|
||
| </Tip> | ||
|
|
||
| </hfoption> | ||
|
|
||
| <hfoption id="javascript"> | ||
|
|
||
| You can use our JavaScript SDK to interact with Inference Providers. | ||
|
|
||
| ```javascript | ||
|
||
| import { InferenceClient } from "@huggingface/inference"; | ||
|
|
||
| const client = new InferenceClient(process.env.HF_TOKEN); | ||
|
|
||
| const chatCompletion = await client.chatCompletion({ | ||
| provider: "auto", // Automatically selects best provider | ||
| model: "deepseek-ai/DeepSeek-V3-0324", | ||
| messages: [{ role: "user", content: "Hello!" }] | ||
| }); | ||
|
|
||
| const imageBlob = await client.textToImage({ | ||
| model: "black-forest-labs/FLUX.1-dev", | ||
| inputs: | ||
| "A serene lake surrounded by mountains at sunset, photorealistic style", | ||
| }); | ||
| ``` | ||
|
|
||
| Alternatively, you can use the OpenAI-compatible client. | ||
|
|
||
| ```javascript | ||
| import { OpenAI } from "openai"; | ||
|
|
||
| const client = new OpenAI({ | ||
| baseURL: "https://router.huggingface.co/v1", | ||
| apiKey: process.env.HF_TOKEN, | ||
| }); | ||
|
|
||
| const completion = await client.chat.completions.create({ | ||
| model: "meta-llama/Llama-3.1-8B-Instruct", | ||
| messages: [{ role: "user", content: "A story about hiking in the mountains" }], | ||
| }); | ||
|
|
||
| ``` | ||
|
|
||
| <Tip warning={true}> | ||
|
|
||
| The OpenAI-compatible client does not support image generation. | ||
|
|
||
| </Tip> | ||
|
|
||
| </hfoption> | ||
|
|
||
| </hfoptions> | ||
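Beyond the SDKs, any HTTP client can talk to the router directly, since it exposes an OpenAI-compatible REST endpoint. Here is a minimal sketch of such a request, built (but not sent) with Python's standard library; the `/chat/completions` path is assumed from the OpenAI API convention, and the base URL is the same one the JavaScript OpenAI client uses above:

```python
import json
import os
import urllib.request

# OpenAI-compatible chat completions endpoint on the Hugging Face router.
url = "https://router.huggingface.co/v1/chat/completions"

# Standard OpenAI-style request body.
payload = {
    "model": "deepseek-ai/DeepSeek-V3-0324",
    "messages": [
        {"role": "user", "content": "A story about hiking in the mountains"}
    ],
}

request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# The request is only constructed here, not sent;
# urllib.request.urlopen(request) would return the JSON completion.
print(request.full_url)
print(request.get_method())
```

This is also roughly what the SDK calls above translate to on the wire.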
|
|
||
|
||
| You'll need a Hugging Face token with inference permissions. Create one at [Settings > Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | ||
|
|
||
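The code samples above read the token from the `HF_TOKEN` environment variable. One way to set it, assuming a POSIX shell (the `hf_xxx` value is a placeholder, not a real token):

```shell
# Export the token so the SDK examples can read it from the environment.
# "hf_xxx" is a placeholder; paste your real fine-grained token instead.
export HF_TOKEN="hf_xxx"

# Verify it is set.
echo "$HF_TOKEN"
```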
| ### How Inference Providers works | ||
|
|
||
| To dive deeper into Inference Providers, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). Here are some key resources: | ||
|
|
||
| - **[Quick Start](https://huggingface.co/docs/inference-providers)** | ||
| - **[Pricing & Billing Guide](https://huggingface.co/docs/inference-providers/pricing)** | ||
| - **[Hub Integration Details](https://huggingface.co/docs/inference-providers/hub-integration)** | ||
|
|
||
| ### What was the HF-Inference API? | ||
|
|
||
| HF-Inference API is one of the providers available through Inference Providers. It was previously called "Inference API (serverless)" and is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood. | ||
|
|
||
| For more details about the HF-Inference provider specifically, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference). | ||