Commit 95f0171

Wauplin and burtenshaw authored
Apply suggestions from code review
Co-authored-by: burtenshaw <[email protected]>
1 parent 4d7fd47 commit 95f0171

File tree

4 files changed: +17 -17 lines changed

docs/api-inference/hub-api.md

Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Hub API

-The Hub provides a few API to deal with Inference Providers. Here is a list of them.
+The Hub provides a few APIs to interact with Inference Providers. Here is a list of them:

 ## List models

@@ -16,7 +16,7 @@ To list models powered by a provider, use the `inference_provider` query parameter
 ...
 ```

-It can be combined with other filters to e.g. select only text-to-image models:
+It can be combined with other filters to e.g. select only `text-to-image` models:

 ```sh
 # List text-to-image models served by Fal AI
@@ -28,7 +28,7 @@ It can be combined with other filters to e.g. select only text-to-image models:
 ...
 ```

-Pass a comma-separated list to select from multiple providers:
+Pass a comma-separated list of providers to select multiple:

 ```sh
 # List image-text-to-text models served by Novita or Sambanova
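
The `inference_provider` query parameter exercised in these hunks can be called directly against the Hub's `/api/models` endpoint. A minimal sketch in Python, assuming only the `requests` library and the query parameters shown in the curl examples above:

```python
import requests

# List text-to-image models served by Fal AI
response = requests.get(
    "https://huggingface.co/api/models",
    params={"inference_provider": "fal-ai", "pipeline_tag": "text-to-image"},
)
response.raise_for_status()
for model in response.json():
    print(model["id"])

# Pass a comma-separated list of providers to select multiple
response = requests.get(
    "https://huggingface.co/api/models",
    params={
        "inference_provider": "novita,sambanova",
        "pipeline_tag": "image-text-to-text",
    },
)
```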
@@ -54,7 +54,7 @@ Finally, you can select all models served by at least one inference provider:

 ## Get model status

-If you are interested by a specific model and want to check if at least 1 provider serves it, you can request the `inference` attribute in the model info endpoint:
+To find an inference provider for a specific model, request the `inference` attribute in the model info endpoint:

 <inferencesnippet>

@@ -170,4 +170,4 @@ In the `huggingface_hub`, use `model_info` with the expand parameter:
 </inferencesnippet>


-For each provider, you get the status (`staging` or `live`), the related task (here, `conversational`) and the providerId. In practice, this information is mostly relevant for the JS and Python clients. The relevant part is to know that the listed providers are the ones serving the model.
+Each provider serving the model shows a status (`staging` or `live`), the related task (here, `conversational`) and the providerId. In practice, this information is relevant for the JS and Python clients.
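
For reference, the `model_info` approach named in the hunk header above looks roughly like this; a sketch assuming `huggingface_hub` is installed and that `"inference"` is an accepted value for its `expand` parameter:

```python
from huggingface_hub import model_info

# Request only the `inference` attribute for a specific model
info = model_info("deepseek-ai/DeepSeek-V3-0324", expand=["inference"])
print(info.inference)  # e.g. "warm" when at least one provider serves the model
```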

docs/api-inference/hub-integration.md

Lines changed: 4 additions & 4 deletions
@@ -1,17 +1,17 @@
 # Hub Integration

-The Inference Providers is tightly integrated with the Hugging Face Hub. No matter in which service you use it, the usage and billing will be centralized on your Hugging Face account.
+The Inference Providers is tightly integrated with the Hugging Face Hub. No matter which provider you use, the usage and billing will be centralized in your Hugging Face account.

 ## Model search

-When listing models on the Hub, you can filter to select models deployed on the inference provider for your choice. For example, to list all models deployed on Fireworks AI infra: https://huggingface.co/models?inference_provider=fireworks-ai.
+When listing models on the Hub, you can filter to select models deployed on the inference provider of your choice. For example, to list all models deployed on Fireworks AI infra: https://huggingface.co/models?inference_provider=fireworks-ai.

 <div class="flex justify-center">
 <img class="block light:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/models-filter-by-provider-light.png"/>
 <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/models-filter-by-provider-dark.png"/>
 </div>

-It is also possible to select multiple providers or even all of them to filter all models that are available on at least 1 provider: https://huggingface.co/models?inference_provider=all.
+It is also possible to select all or multiple providers and filter their available models: https://huggingface.co/models?inference_provider=all.

 <div class="flex justify-center">
 <img class="block light:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/models-filter-any-provider-light.png"/>
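
The `inference_provider=all` filter shown above should also be usable programmatically; a minimal sketch, assuming `huggingface_hub.list_models` accepts an `inference_provider` argument mirroring the URL query parameter (an assumption, not confirmed by this diff):

```python
from huggingface_hub import list_models

# Filter models available on at least one inference provider
for model in list_models(inference_provider="all", limit=10):
    print(model.id)
```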
@@ -20,7 +20,7 @@ It is also possible to select multiple providers or even all of them to filter all models that are available on at least 1 provider: https://huggingface.co/models?inference_provider=all.

 ## Features using Inference Providers

-Several Hugging Face features utilize the Inference Providers and count towards your monthly credits. The included monthly credits for PRO and Enterprise should cover moderate usage of these features for most users.
+Several Hugging Face features utilize Inference Providers and count towards your monthly credits. The included monthly credits for PRO and Enterprise should cover moderate usage of these features for most users.

 - [Inference Widgets](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324): Interactive widgets available on model pages. This is the entry point to quickly test a model on the Hub.


docs/api-inference/index.md

Lines changed: 3 additions & 3 deletions
@@ -35,7 +35,7 @@ To get started quickly with [Chat Completion models](http://huggingface.co/model

 You can call the Inference Providers with your preferred tools, such as Python, JavaScript, or cURL. To simplify integration, we offer both a Python SDK (`huggingface_hub`) and a JavaScript SDK (`huggingface.js`).

-In this section, we will demonstrate a simple example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a conversational Large Language Model. For the example, we will use [Novita AI](https://novita.ai/) as Inference Provider with routed requests. You will learn what that means in the next chapters.
+In this section, we will demonstrate a simple example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a conversational Large Language Model. For the example, we will use [Novita AI](https://novita.ai/) as Inference Provider.

 ### Authentication

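
The example this hunk refers to boils down to a routed chat-completion call; a minimal Python sketch, assuming `huggingface_hub`'s `InferenceClient` with provider selection and an `HF_TOKEN` environment variable as placeholder authentication:

```python
import os
from huggingface_hub import InferenceClient

# Route the request through Hugging Face to Novita AI
client = InferenceClient(provider="novita", api_key=os.environ["HF_TOKEN"])

completion = client.chat_completion(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(completion.choices[0].message)
```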
@@ -164,9 +164,9 @@ console.log(chatCompletion.choices[0].message);

 ## Next Steps

-In this introduction, we've covered the basics of Inference Provider. To learn more about this service, check out our guides and API Reference:
+In this introduction, we've covered the basics of Inference Providers. To learn more about this service, check out our guides and API Reference:
 - [Pricing and Billing](./pricing): everything you need to know about billing
-- [Hub integration](./hub-integration): how the Inference Providers is integrated with the Hub?
+- [Hub integration](./hub-integration): how Inference Providers is integrated with the Hub?
 - [External Providers](./providers): everything about providers and how to become an official partner
 - [Hub API](./hub-api): high level API for inference providers
 - [API Reference](./tasks/index): learn more about the parameters and task-specific settings.

docs/api-inference/pricing.md

Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Pricing and Billing

-The Inference Providers is a production-ready service involving external partners and is therefore a paid-product. However, as an Hugging Face user you get monthly credits to run experiments. The amount of credits you get depends on your type of account:
+Inference Providers is a production-ready service involving external partners and is therefore a paid-product. However, as a Hugging Face user you get monthly credits to run experiments. The amount of credits you get depends on your type of account:

 | User Tier                | Included monthly credits           |
 | ------------------------ | ---------------------------------- |
@@ -11,7 +11,7 @@ The Inference Providers is a production-ready service involving external partners

 **PRO and Enterprise Hub users** can continue using the API once their monthly included credits are exhausted. This billing model, known as "Pay-as-you-Go" (PAYG), is charged on top of the monthly subscription. PAYG is only available for providers that are integrated with our billing system. We're actively working to integrate all providers, but in the meantime, any providers that are not yet integrated will be blocked once the free-tier limit is reached.

-If you haven't used up your included credits yet, we estimate costs for providers that aren’t fully integrated with our billing system. These estimates are usually higher than the actual cost to prevent abuse, which is why PAYG is currently disabled for those providers.
+If you have remaining credits, we estimate costs for providers that aren’t fully integrated with our billing system. These estimates are usually higher than the actual cost to prevent abuse, which is why PAYG is currently disabled for those providers.

 You can track your spending on your [billing page](https://huggingface.co/settings/billing).

@@ -25,7 +25,7 @@ Hugging Face charges you the same rates as the provider, with no additional fees

 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:

-- **Routed Request**: This is the default method for using the Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.
+- **Routed Request**: This is the default method for using Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.

 - **Routed Request with Custom Key**: In your [settings page](https://huggingface.co/settings/inference-providers) on the Hub, you can configure a custom key for each provider. To use this option, you'll need to create an account on the provider's platform, and billing will be handled directly by that provider. Hugging Face won't charge you for the call. This method gives you more control over billing when experimenting with models on the Hub. When making a routed request with a custom key, your code remains unchanged—you'll still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.

@@ -41,15 +41,15 @@ Here is a table that sums up what we've seen so far:

 ## HF-Inference cost

-As you may have noticed, you can select to work with `"hf-inference"` provider. This is what used to be the "Inference API (serverless)" prior to the Inference Providers integration. From a user point of view, working with HF Inference is the same as with any other providers. Past the free-tier credits, you get charged for every inference request based on the compute time x price of the underlying hardware.
+As you may have noticed, you can select to work with `"hf-inference"` provider. This service used to be "Inference API (serverless)" prior to Inference Providers. From a user point of view, working with HF Inference is the same as with any other provider. Past the free-tier credits, you get charged for every inference request based on the compute time x price of the underlying hardware.

 For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) that takes 10 seconds to complete on a GPU machine that costs $0.00012 per second to run, will be billed $0.0012.

 The `"hf-inference"` provider is currently the default provider when working with the JavaScript and Python SDKs. Note that this default might change in the future.

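
The billing rule in this hunk (compute time x price of the underlying hardware) is easy to sanity-check with the numbers from the FLUX.1-dev example; a quick illustrative calculation:

```python
# hf-inference billing: cost = compute time x hardware price
compute_time_seconds = 10        # duration of the example request
price_per_second_usd = 0.00012   # example GPU rate from the docs
cost_usd = compute_time_seconds * price_per_second_usd
print(f"${cost_usd:.4f}")        # $0.0012, matching the documented example
```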
 ## Organization billing

-For Enterprise Hub organizations, it is possible to centralize billing for all your users. Each user still use their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as header in your HTTP requests.
+For Enterprise Hub organizations, it is possible to centralize billing for all your users. Each user still uses their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as header in your HTTP requests.

 If you are using the JavaScript `InferenceClient`, you can set the `billTo` attribute at a client level:

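
For the raw-HTTP path, the `X-HF-Bill-To` header usage would look roughly like this; a sketch in which the router URL and `my-org-name` are illustrative placeholders, not values taken from this diff:

```python
import os
import requests

# Bill a routed chat-completion request to an organization instead of the user
response = requests.post(
    "https://router.huggingface.co/v1/chat/completions",  # placeholder endpoint
    headers={
        "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
        "X-HF-Bill-To": "my-org-name",  # Enterprise Hub organization name
    },
    json={
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```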
