Commit 09f1bc5

(vibe-coded) wording pass
1 parent 6a91bab commit 09f1bc5

1 file changed

docs/inference-providers/index.md

Lines changed: 46 additions & 48 deletions
@@ -11,7 +11,7 @@ To learn more about the launch of Inference Providers, check out our [announceme

 ## Partners

-Here is the complete list of partners integrated with Inference Providers, and the supported tasks for each of them:
+Our platform integrates with leading AI infrastructure providers, giving you access to their specialized capabilities through a single, consistent API. Here's what each partner supports:

 | Provider | Chat completion (LLM) | Chat completion (VLM) | Feature Extraction | Text to Image | Text to video |
 | -------------------------------------------- | :-------------------: | :-------------------: | :----------------: | :-----------: | :-----------: |
@@ -30,16 +30,24 @@ Here is the complete list of partners integrated with Inference Providers, and t
 | [SambaNova](./providers/sambanova) || || | |
 | [Together](./providers/together) ||| || |

-## Why use Inference Providers?
+## Why Choose Inference Providers?

-Inference Providers offers a fast and simple way to explore thousands of models for a variety of tasks. Whether you're experimenting with ML capabilities or building a new application, this API gives you instant access to high-performing models across multiple domains:
+If you're building AI-powered applications, you've likely experienced the pain points of managing multiple provider APIs, comparing model performance, and dealing with varying reliability. Inference Providers solves these challenges by offering:

-- **Text Generation:** Including large language models and tool-calling prompts, generate and experiment with high-quality responses.
-- **Image and Video Generation:** Easily create customized images, including LoRAs for your own styles.
-- **Document Embeddings:** Build search and retrieval systems with SOTA embeddings.
-- **Classical AI Tasks:** Ready-to-use models for text classification, image classification, speech recognition, and more.
+**Instant Access to Cutting-Edge Models**: Go beyond mainstream providers to access thousands of specialized models across multiple AI tasks. Whether you need the latest language models, state-of-the-art image generators, or domain-specific embeddings, you'll find them here.

-**Fast and Free to Get Started**: Inference Providers comes with a free-tier and additional included credits for [PRO users](https://hf.co/subscribe/pro), as well as [Enterprise Hub organizations](https://huggingface.co/enterprise).
+**Zero Vendor Lock-in**: Unlike being tied to a single provider's model catalog, you get access to models from Cerebras, Groq, Together AI, Replicate, and more — all through one consistent interface.
+
+**Production-Ready Performance**: Built for enterprise workloads with automatic failover, intelligent routing, and the reliability your applications demand.
+
+Here's what you can build:
+
+- **Text Generation**: Use large language models with tool-calling capabilities for chatbots, content generation, and code assistance
+- **Image and Video Generation**: Create custom images and videos, including support for LoRAs and style customization
+- **Search & Retrieval**: State-of-the-art embeddings for semantic search, RAG systems, and recommendation engines
+- **Traditional ML Tasks**: Ready-to-use models for classification, NER, summarization, and speech recognition
+
+**Get Started for Free**: Inference Providers includes a generous free tier, with additional credits for [PRO users](https://hf.co/subscribe/pro) and [Enterprise Hub organizations](https://huggingface.co/enterprise).

 ## Key Features

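To make the "one consistent interface" point concrete, here is a minimal sketch (an editor's illustration, not part of the committed page) using `huggingface_hub`'s [`InferenceClient`](https://huggingface.co/docs/huggingface_hub/guides/inference): switching partners is a one-argument change. The provider names below are illustrative and assume those partners serve this model.

```python
import os

from huggingface_hub import InferenceClient

# Same client and same call shape for every partner: only `provider` changes.
# "auto" lets Hugging Face pick a suitable provider for the model.
for provider in ("auto", "together", "sambanova"):
    client = InferenceClient(provider=provider, api_key=os.environ["HF_TOKEN"])
    completion = client.chat_completion(
        messages=[{"role": "user", "content": "Say hello in one word."}],
        model="deepseek-ai/DeepSeek-V3-0324",
        max_tokens=16,
    )
    print(f"{provider}: {completion.choices[0].message.content}")
```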
@@ -50,40 +58,39 @@ Inference Providers offers a fast and simple way to explore thousands of models
 - **👷 Easy to integrate**: Drop-in replacement for the OpenAI chat completions API.
 - **💰 Cost-Effective**: No extra markup on provider rates.

-## Get Started
+## Getting Started

-You can use Inference Providers with your preferred tools, such as Python, JavaScript, or cURL. To simplify integration, we offer both a Python SDK (`huggingface_hub`) and a JavaScript SDK (`huggingface.js`).
+Inference Providers works with your existing development workflow. Whether you prefer Python, JavaScript, or direct HTTP calls, we provide native SDKs and OpenAI-compatible APIs to get you up and running quickly.

-In this section, we will demonstrate a simple example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a conversational Large Language Model.
+We'll walk through a practical example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a state-of-the-art open-weights conversational model.

 ### Inference Playground

-To get started quickly with [Chat Completion models](http://huggingface.co/models?inference_provider=all&sort=trending&other=conversational), use the [Inference Playground](https://huggingface.co/playground) to easily test and compare models with your prompts.
+Before diving into integration, explore models interactively with our [Inference Playground](https://huggingface.co/playground). Test different [chat completion models](http://huggingface.co/models?inference_provider=all&sort=trending&other=conversational) with your prompts and compare responses to find the perfect fit for your use case.

 <a href="https://huggingface.co/playground" target="blank"><img src="https://cdn-uploads.huggingface.co/production/uploads/5f17f0a0925b9863e28ad517/9_Tgf0Tv65srhBirZQMTp.png" style="max-width: 550px; width: 100%;"/></a>

 ### Authentication

-Inference Providers requires passing a user token in the request headers. You can generate a token by signing up on the Hugging Face website and going to the [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). We recommend creating a `fine-grained` token with the scope to `Make calls to Inference Providers`.
+You'll need a Hugging Face token to authenticate your requests. Create one by visiting your [token settings](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) and generating a `fine-grained` token with `Make calls to Inference Providers` permissions.

-For more details about user tokens, check out [this guide](https://huggingface.co/docs/hub/en/security-tokens).
+For complete token management details, see our [security tokens guide](https://huggingface.co/docs/hub/en/security-tokens).
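
Once the token is created, a quick way to confirm it authenticates is a `whoami` call through `huggingface_hub`. This is a minimal sketch (an editor's illustration) assuming the token is exported as the conventional `HF_TOKEN` environment variable; it verifies Hub authentication generally, while the inference permission itself is only exercised by your first chat completion request.

```python
import os

from huggingface_hub import whoami

# Assumes the fine-grained token was exported first, e.g.:
#   export HF_TOKEN=hf_xxx
token = os.environ["HF_TOKEN"]

# whoami() performs an authenticated call against the Hub and raises if the
# token is missing or invalid, so a bad token fails fast here.
user = whoami(token=token)
print(f"Authenticated as: {user['name']}")
```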

 ### Quick Start - LLM

 TODO : add blurb explaining what we're doing here (quick inference with LLM and chat completions)

 #### Python

-This section explains how to use the Inference Providers API to run inference requests with [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) in Python.
+Here are three ways to integrate Inference Providers into your Python applications, from high-level convenience to low-level control:

 <hfoptions id="python-clients">

 <hfoption id="huggingface_hub">

-For convenience, the Python library `huggingface_hub` provides an [`InferenceClient`](https://huggingface.co/docs/huggingface_hub/guides/inference) that handles inference for you.
-The most suitable provider is automatically selected by the client library.
+For convenience, the `huggingface_hub` library provides an [`InferenceClient`](https://huggingface.co/docs/huggingface_hub/guides/inference) that automatically handles provider selection and request routing.

-Make sure to install it with `pip install huggingface_hub`.
+Install with `pip install huggingface_hub`:

 ```python
 import os
@@ -110,10 +117,9 @@ print(completion.choices[0].message)

 <hfoption id="openai">

-The Inference Providers API can be used as a drop-in replacement for the OpenAI API (or any chat completions compatible API) for your preferred client.
-Just replace the chat completion base URL with `https://router.huggingface.co/v1`.
-The most suitable provider for the model is automatically selected by the hugging face server.
-For example, with the OpenAI Python client:
+**Drop-in OpenAI Replacement**: Already using OpenAI's Python client? Just change the base URL to instantly access hundreds of additional open-weights models through our provider network.
+
+Our system automatically routes your request to the optimal provider for the specified model:

 ```python
 import os
@@ -125,7 +131,7 @@ client = OpenAI(
 )

 completion = client.chat.completions.create(
-    model="deepseek-ai/DeepSeek-V3-024",
+    model="deepseek-ai/DeepSeek-V3-0324",
     messages=[
         {
             "role": "user",
@@ -141,9 +147,9 @@ print(completion.choices[0].message)

 <hfoption id="requests">

-If you would rather implement a lower-level integration, you can request the Inference Provider API with HTTP.
-The Inference Providers API will automatically select the most suitable provider for the requested model.
-For example with the `requests` library:
+**Direct HTTP Integration**: For maximum control or integration with custom frameworks, use our OpenAI-compatible REST API directly.
+
+Our routing system automatically selects the best available provider for your chosen model:

 ```python
 import os
@@ -158,7 +164,7 @@ payload = {
             "content": "How many 'G's in 'huggingface'?"
         }
     ],
-    "model": "deepseek/deepseek-v3-0324",
+    "model": "deepseek-ai/DeepSeek-V3-0324",
 }

 response = requests.post(API_URL, headers=headers, json=payload)
@@ -171,16 +177,15 @@ print(response.json()["choices"][0]["message"])

 #### JavaScript

-This section explains how to use the Inference Providers API to run inference requests with [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) in Javascript.
+Integrate Inference Providers into your JavaScript applications with these flexible approaches:

 <hfoptions id="javascript-clients">

 <hfoption id="huggingface.js">

-For convenience, the JS library `@huggingface/inference` provides an [`InferenceClient`](https://huggingface.co/docs/huggingface.js/inference/classes/InferenceClient) that handles inference for you.
-The most suitable provider is automatically selected by the client library.
+Our JavaScript SDK provides a convenient interface with automatic provider selection and TypeScript support.

-You can install it with `npm install @huggingface/inference`.
+Install with `npm install @huggingface/inference`:

 ```js
 import { InferenceClient } from "@huggingface/inference";
@@ -204,10 +209,7 @@ console.log(chatCompletion.choices[0].message);

 <hfoption id="openai">

-The Inference Providers API can be used as a drop-in replacement for the OpenAI API (or any chat completions compatible API) for your preferred client.
-Just replace the chat completion base URL with `https://router.huggingface.co/v1`.
-The most suitable provider for the model is automatically selected by the hugging face server.
-For example, with the OpenAI JS client:
+**OpenAI JavaScript Client Compatible**: Migrate your existing OpenAI integration seamlessly by updating just the base URL:

 ```javascript
 import OpenAI from "openai";
@@ -234,9 +236,7 @@ console.log(completion.choices[0].message.content);

 <hfoption id="fetch">

-If you would rather implement a lower-level integration, you can request the Inference Provider API with HTTP.
-The Inference Providers API will automatically select the most suitable provider for the requested model.
-For example, using `fetch`:
+**Native Fetch Integration**: For lightweight applications or custom implementations, use our REST API directly with standard fetch:

 ```js
 import fetch from "node-fetch";
@@ -269,9 +269,7 @@ console.log(await response.json());

 #### HTTP / cURL

-The following cURL command highlighting the raw HTTP request. You can adapt this request to be run with the tool of your choice.
-
-The most suitable provider for the requested model will be automatically selected by the server.
+For testing, debugging, or integrating with any HTTP client, here's the raw REST API format. Our intelligent routing automatically selects the optimal provider for your requested model:

 ```bash
 curl https://router.huggingface.co/v1/chat/completions \
@@ -284,7 +282,7 @@ curl https://router.huggingface.co/v1/chat/completions \
             "content": "How many G in huggingface?"
         }
     ],
-    "model": "deepseek/deepseek-v3-0324",
+    "model": "deepseek-ai/DeepSeek-V3-0324",
     "stream": false
 }'
 ```
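
The substantive fix in this commit is the model identifier: Hub model IDs take the `namespace/RepoName` form, so `deepseek/deepseek-v3-0324` is replaced with `deepseek-ai/DeepSeek-V3-0324` throughout. As a minimal sketch (an editor's illustration, assuming `huggingface_hub` is installed), an ID can be sanity-checked against the Hub with `model_info` before it is wired into requests:

```python
from huggingface_hub import model_info
from huggingface_hub.utils import RepositoryNotFoundError

# model_info() resolves a repo ID against the Hub, so a typo like the old
# "deepseek/deepseek-v3-0324" fails loudly here rather than at request time.
for repo_id in ("deepseek-ai/DeepSeek-V3-0324", "deepseek/deepseek-v3-0324"):
    try:
        info = model_info(repo_id)
        print(f"{repo_id}: OK (pipeline: {info.pipeline_tag})")
    except RepositoryNotFoundError:
        print(f"{repo_id}: not found on the Hub")
```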
@@ -309,10 +307,10 @@ TODO: explain implementation details? (no URL rewrite, just proxy)

 ## Next Steps

-In this introduction, we've covered the basics of Inference Providers. To learn more about this service, check out our guides and API Reference:
+Now that you understand the basics, explore these resources to make the most of Inference Providers:

-- [Pricing and Billing](./pricing): everything you need to know about billing.
-- [Hub integration](./hub-integration): how is Inference Providers integrated with the Hub?
-- [Register as an Inference Provider](./register-as-a-provider): everything about how to become an official partner.
-- [Hub API](./hub-api): high-level API for Inference Providers.
-- [API Reference](./tasks/index): learn more about the parameters and task-specific settings.
+- **[Pricing and Billing](./pricing)**: Understand costs and billing of Inference Providers
+- **[Hub Integration](./hub-integration)**: Learn how Inference Providers is integrated with the Hugging Face Hub
+- **[Register as a Provider](./register-as-a-provider)**: Requirements for joining our partner network
+- **[Hub API](./hub-api)**: Advanced API features and configuration
+- **[API Reference](./tasks/index)**: Complete parameter documentation for all supported tasks
