2 changes: 1 addition & 1 deletion docs/hub/_toctree.yml
@@ -127,7 +127,7 @@
 - local: models-widgets-examples
   title: Widget Examples
 - local: models-inference
-  title: Inference API docs
+  title: Model Inference
 - local: models-download-stats
   title: Models Download Stats
 - local: models-faq
144 changes: 128 additions & 16 deletions docs/hub/models-inference.md
@@ -1,30 +1,142 @@
# Inference Providers

Hugging Face's model pages offer pay-as-you-go inference for thousands of models, so you can try them all out right in the browser. The service is powered by Inference Providers and includes a free tier.

Inference Providers give developers streamlined, unified access to hundreds of machine learning models, powered by the best serverless inference partners. 👉 **For complete documentation, visit the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers)**.

## Inference Providers on the Hub

Inference Providers is deeply integrated with the Hugging Face Hub, and you can use it in a few different ways:

- **Interactive Widgets** - Test models directly on model pages with interactive widgets that use Inference Providers under the hood. Check out the [DeepSeek-R1-0528 model page](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) for an example.
- **Inference Playground** - Test and compare chat completion models against your own prompts. Check out the [Inference Playground](https://huggingface.co/playground) to get started.
- **Search** - Filter models by inference provider on the [models page](https://huggingface.co/models?inference_provider=all) to find models available through specific providers (see the sketch after this list).
- **Data Studio** - Use AI to explore datasets on the Hub. Try [Data Studio](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/viewer?views%5B%5D=train) on your favorite dataset.
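
The provider filter is also available from code. Here's a minimal sketch, assuming a recent `huggingface_hub` release in which `list_models()` accepts an `inference_provider` filter (mirroring the `?inference_provider=` query on the models page); if your version lacks it, the web filter above works the same way.

```python
from huggingface_hub import list_models

# List a few text-generation models served by at least one inference provider.
# `inference_provider` mirrors the models-page query parameter: "all" matches
# any provider, and a specific name (e.g. "together") narrows the results.
for model in list_models(
    inference_provider="all",
    pipeline_tag="text-generation",
    limit=5,
):
    print(model.id)
```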

## Build with Inference Providers

You can integrate Inference Providers into your own applications using our SDKs or HTTP clients. Here's a quick start with Python and JavaScript; for more details, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers).

<hfoptions id="inference-providers-quick-start">

<hfoption id="python">

You can use our Python SDK to interact with Inference Providers.

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],
    provider="auto",  # Automatically selects the best provider
)

# Chat completion
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "A story about hiking in the mountains"}],
)

# Image generation
image = client.text_to_image(
    prompt="A serene lake surrounded by mountains at sunset, photorealistic style",
    model="black-forest-labs/FLUX.1-dev",
)
```
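
You don't have to rely on automatic selection: the same client accepts a specific provider name. A minimal sketch; `"together"` below is just an example, and the call only works if the model is actually served by that provider.

```python
import os

from huggingface_hub import InferenceClient

# Pin a specific provider instead of letting "auto" choose one.
# "together" is an example provider name; the model must be deployed there.
client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],
    provider="together",
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```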

Or, you can just use the OpenAI API compatible client.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "A story about hiking in the mountains"}],
)
```

<Tip warning={true}>

The OpenAI-compatible client does not support image generation.

</Tip>

</hfoption>

<hfoption id="javascript">

You can use our JavaScript SDK to interact with Inference Providers.

```javascript
import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
  provider: "auto", // Automatically selects the best provider
  model: "deepseek-ai/DeepSeek-V3-0324",
  messages: [{ role: "user", content: "Hello!" }],
});

const imageBlob = await client.textToImage({
  model: "black-forest-labs/FLUX.1-dev",
  inputs: "A serene lake surrounded by mountains at sunset, photorealistic style",
});
```

Or, you can just use the OpenAI API compatible client.

```javascript
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://router.huggingface.co/v1",
  apiKey: process.env.HF_TOKEN,
});

const completion = await client.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "A story about hiking in the mountains" }],
});
```

<Tip warning={true}>

The OpenAI-compatible client does not support image generation.

</Tip>

</hfoption>

</hfoptions>

You'll need a Hugging Face token with inference permissions. Create one at [Settings > Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
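
If a request fails with an authentication error, a quick way to confirm your token is set up is `whoami()`. A minimal sketch; note that it validates the token itself, not the inference permission specifically.

```python
import os

from huggingface_hub import whoami

# Raises if HF_TOKEN is missing or rejected; otherwise prints the account name.
# This checks the token itself, not the fine-grained inference permission.
info = whoami(token=os.environ["HF_TOKEN"])
print(f"Authenticated as: {info['name']}")
```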

### How Inference Providers works

To dive deeper into Inference Providers, check out the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers). Here are some key resources:

- **[Quick Start](https://huggingface.co/docs/inference-providers)**
- **[Pricing & Billing Guide](https://huggingface.co/docs/inference-providers/pricing)**
- **[Hub Integration Details](https://huggingface.co/docs/inference-providers/hub-integration)**

### What was the HF-Inference API?

The HF-Inference API is one of the providers available through Inference Providers. It was previously called “Inference API (serverless)” and is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood.

For more details about the HF-Inference provider specifically, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference).
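
Because HF-Inference is just another provider, you can select it explicitly instead of using `"auto"`. A minimal sketch; the model below is an illustrative example and must be available on HF-Inference for the call to succeed.

```python
import os

from huggingface_hub import InferenceClient

# Route requests to the HF-Inference provider explicitly.
client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],
    provider="hf-inference",
)

# Example task; the model is an illustrative choice, not a guaranteed deployment.
result = client.text_classification(
    "I love hiking in the mountains!",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)
print(result)
```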