diff --git a/docs/hub/_toctree.yml b/docs/hub/_toctree.yml
index 16fcf1ad6..5ba9ead00 100644
--- a/docs/hub/_toctree.yml
+++ b/docs/hub/_toctree.yml
@@ -127,7 +127,7 @@
 - local: models-widgets-examples
   title: Widget Examples
 - local: models-inference
-  title: Inference API docs
+  title: Model Inference
 - local: models-download-stats
   title: Models Download Stats
 - local: models-faq
diff --git a/docs/hub/models-inference.md b/docs/hub/models-inference.md
index 4a8f81465..a6d85060e 100644
--- a/docs/hub/models-inference.md
+++ b/docs/hub/models-inference.md
@@ -1,30 +1,143 @@
 # Inference Providers
 
-Please refer to the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers) for detailed information.
+Hugging Face's model pages offer pay-as-you-go inference for thousands of models, so you can try them all out right in the browser. The service is powered by Inference Providers and includes a free tier.
+
+Inference Providers gives developers streamlined, unified access to hundreds of machine learning models, powered by the best serverless inference partners. 👉 **For complete documentation, visit the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers)**.
 
-## What is HF-Inference API?
-
-HF-Inference API is one of the many providers available on the Hugging Face Hub.
-It is deployed by Hugging Face ourselves, using text-generation-inference for LLMs for instance. This service used to be called “Inference API (serverless)” prior to Inference Providers.
-
-For more details about the HF-Inference API, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference).
-
-## What technology do you use to power the HF-Inference API?
-
-The HF-Inference API is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood.
-
-## Why don't I see an inference widget, or why can't I use the API?
-
-For some tasks, there might not be support by any Inference Provider, and hence, there is no widget.
-
-## How can I see my usage?
-
-To check usage across all providers, check out your [billing page](https://huggingface.co/settings/billing).
-To check your HF-Inference usage specifically, check out the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage.
-
-## Is there programmatic access to Inference Providers?
-
-Yes! We provide client wrappers in both JS and Python:
-- [JS (`@huggingface/inference`)](https://huggingface.co/docs/huggingface.js/inference/classes/InferenceClient)
-- [Python (`huggingface_hub`)](https://huggingface.co/docs/huggingface_hub/guides/inference)
+## Inference Providers on the Hub
+
+Inference Providers is deeply integrated with the Hugging Face Hub, and you can use it in a few different ways:
+
+- **Interactive Widgets** - Test models directly on model pages with interactive widgets that use Inference Providers under the hood. Check out the [DeepSeek-R1-0528 model page](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) for an example.
+- **Inference Playground** - Easily test and compare chat completion models with your prompts. Check out the [Inference Playground](https://huggingface.co/playground) to get started.
+- **Search** - Filter models by inference provider on the [models page](https://huggingface.co/models?inference_provider=all) to find models available through specific providers.
+- **Data Studio** - Use AI to explore datasets on the Hub. Check out [Data Studio](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/viewer?views%5B%5D=train) on your favorite dataset.
+
+## Build with Inference Providers
+
+You can integrate Inference Providers into your own applications using our SDKs or HTTP clients. Here's a quick start with Python and JavaScript; for more details, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers).
+
+<hfoptions id="inference-providers">
+
+<hfoption id="python">
+
+You can use our Python SDK to interact with Inference Providers.
+
+```python
+import os
+
+from huggingface_hub import InferenceClient
+
+client = InferenceClient(
+    api_key=os.environ["HF_TOKEN"],
+    provider="auto",  # Automatically selects the best provider
+)
+
+# Chat completion
+completion = client.chat.completions.create(
+    model="deepseek-ai/DeepSeek-V3-0324",
+    messages=[{"role": "user", "content": "A story about hiking in the mountains"}],
+)
+
+# Image generation
+image = client.text_to_image(
+    prompt="A serene lake surrounded by mountains at sunset, photorealistic style",
+    model="black-forest-labs/FLUX.1-dev",
+)
+```
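+
+If you want to pin a specific provider instead of letting `"auto"` choose for you, pass its slug directly. A minimal sketch, assuming `"together"` is one of the provider slugs currently serving this model (check the provider list in the Inference Providers docs):
+
+```python
+import os
+
+from huggingface_hub import InferenceClient
+
+# Assumption: "together" is an available provider slug for this model;
+# see the Inference Providers docs for the current list.
+client = InferenceClient(
+    api_key=os.environ["HF_TOKEN"],
+    provider="together",
+)
+
+completion = client.chat.completions.create(
+    model="deepseek-ai/DeepSeek-V3-0324",
+    messages=[{"role": "user", "content": "A story about hiking in the mountains"}],
+)
+print(completion.choices[0].message.content)
+```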
+
+Or, you can use the OpenAI API-compatible client.
+
+```python
+import os
+
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="https://router.huggingface.co/v1",
+    api_key=os.environ["HF_TOKEN"],
+)
+
+completion = client.chat.completions.create(
+    model="deepseek-ai/DeepSeek-V3-0324",
+    messages=[
+        {
+            "role": "user",
+            "content": "A story about hiking in the mountains",
+        }
+    ],
+)
+```
+
+<Tip>
+
+The OpenAI API-compatible client does not support image generation.
+
+</Tip>
+
+</hfoption>
+
+<hfoption id="javascript">
+
+You can use our JavaScript SDK to interact with Inference Providers.
+
+```javascript
+import { InferenceClient } from "@huggingface/inference";
+
+const client = new InferenceClient(process.env.HF_TOKEN);
+
+// Chat completion
+const chatCompletion = await client.chatCompletion({
+  provider: "auto", // Automatically selects the best provider
+  model: "deepseek-ai/DeepSeek-V3-0324",
+  messages: [{ role: "user", content: "Hello!" }],
+});
+
+// Image generation
+const imageBlob = await client.textToImage({
+  model: "black-forest-labs/FLUX.1-dev",
+  inputs:
+    "A serene lake surrounded by mountains at sunset, photorealistic style",
+});
+```
+
+Or, you can use the OpenAI API-compatible client.
+
+```javascript
+import { OpenAI } from "openai";
+
+const client = new OpenAI({
+  baseURL: "https://router.huggingface.co/v1",
+  apiKey: process.env.HF_TOKEN,
+});
+
+const completion = await client.chat.completions.create({
+  model: "meta-llama/Llama-3.1-8B-Instruct",
+  messages: [{ role: "user", content: "A story about hiking in the mountains" }],
+});
+```
+
+<Tip>
+
+The OpenAI API-compatible client does not support image generation.
+
+</Tip>
+
+</hfoption>
+
+</hfoptions>
+
+You'll need a Hugging Face token with inference permissions. Create one at [Settings > Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
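+
+Both clients above talk to the same router over plain HTTPS, so any HTTP client works too. A minimal sketch with `requests`, assuming the router exposes the standard OpenAI-style `/chat/completions` route:
+
+```python
+import os
+
+import requests
+
+# Plain HTTP call to the OpenAI-compatible router endpoint.
+response = requests.post(
+    "https://router.huggingface.co/v1/chat/completions",
+    headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
+    json={
+        "model": "deepseek-ai/DeepSeek-V3-0324",
+        "messages": [{"role": "user", "content": "A story about hiking in the mountains"}],
+    },
+)
+response.raise_for_status()
+print(response.json()["choices"][0]["message"]["content"])
+```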
+
+### How Inference Providers works
+
+To dive deeper into Inference Providers, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). Here are some key resources:
+
+- **[Quick Start](https://huggingface.co/docs/inference-providers)**
+- **[Pricing & Billing Guide](https://huggingface.co/docs/inference-providers/pricing)**
+- **[Hub Integration Details](https://huggingface.co/docs/inference-providers/hub-integration)**
+
+### What was the HF-Inference API?
+
+The HF-Inference API is one of the providers available through Inference Providers. It was previously called "Inference API (serverless)" and is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood.
+
+For more details about the HF-Inference provider specifically, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference).
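+
+Since HF-Inference is just another provider, you can also target it explicitly. A minimal sketch, assuming the model below is currently deployed on HF-Inference (availability varies over time):
+
+```python
+import os
+
+from huggingface_hub import InferenceClient
+
+# Route the request through the HF-Inference provider explicitly.
+client = InferenceClient(
+    api_key=os.environ["HF_TOKEN"],
+    provider="hf-inference",
+)
+
+result = client.text_classification(
+    "I like you. I love you.",
+    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
+)
+print(result)
+```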