From 588daf9c643be4b48cc2cedbaaba22d5e746502a Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Thu, 26 Jun 2025 14:20:41 +0200
Subject: [PATCH 1/6] add more detail to the billing page

---
 docs/inference-providers/pricing.md | 55 ++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 12 deletions(-)
diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index 3199c74bc..bcc853371 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -1,31 +1,62 @@
 # Pricing and Billing
 
-Inference Providers is a production-ready service involving external partners and is therefore a paid product. However, as a Hugging Face user, you get monthly credits to run experiments. The amount of credits you get depends on your type of account:
+Access 200+ models from leading AI inference providers with centralized, transparent, pay-as-you-go pricing. No infrastructure management required—just pay for what you use, with no markup from Hugging Face.
 
-| Tier                         | Included monthly credits             |
-| ---------------------------- | ------------------------------------ |
-| Free Users                   | subject to change, less than $0.10   |
-| PRO Users                    | $2.00                                |
-| Enterprise Hub Organizations | $2.00 per seat, shared among members |
+## Free Credits to Get Started
+
+Every Hugging Face user receives monthly credits to experiment with Inference Providers:
+
+| Account Type                 | Monthly Credits | Value    |
+| ---------------------------- | --------------- | -------- |
+| Free Users                   | Limited credits | ~$0.10   |
+| PRO Users                    | Full access     | **$2.00** |
+| Enterprise Hub Organizations | Per-seat access | **$2.00 per seat** |
+
+<Tip>
+
+Your monthly credits automatically apply when you route requests through Hugging Face. For Enterprise organizations, credits are shared among all members.
+
+</Tip>
+
+## How Billing Works: Choose Your Approach
+
+Inference Providers offers flexibility in how you're billed. Understanding these options upfront helps you choose the best approach for your needs:
+
+| Feature | **Routed by Hugging Face** | **Custom Provider Key** | **Direct Calls** |
+| :--- | :--- | :--- | :--- |
+| **How it Works** | Your request routes through HF to the provider | You set a custom provider key in HF settings | You provide the provider key directly in your code |
+| **Billing** | Pay-as-you-go on your HF account | Billed directly by the provider | Billed directly by the provider |
+| **Monthly Credits** | **✅ Yes** - Credits apply to eligible providers | **❌ No** - Credits don't apply | **❌ No** - Credits don't apply |
+| **Provider Account Needed** | **❌ No** - We handle everything | **✅ Yes** - You need provider accounts | **✅ Yes** - You need provider accounts |
+| **Best For** | Simplicity, experimentation, consolidated billing | More billing control, using non-integrated providers | Full control, bypassing HF routing |
+| **Integration** | SDKs, Playground, widgets, Data AI Studio | SDKs, Playground, widgets, Data AI Studio | SDKs only |
+
+### Which Option Should I Choose?
+
+- **Start with Routed by Hugging Face** if you want simplicity and to use your monthly credits
+- **Use Custom Provider Key** if you need specific provider features or you're consistently using the same provider
+- **Use Direct Calls** if you want to bypass Hugging Face routing entirely
+
+## Pay-as-you-Go Details
 
 To benefit from Enterprise Hub included credits, you need to explicitly specify the organization to be billed when performing the inference requests.
 See the [Organization Billing section](#organization-billing) below for more details.
 
-## Pay-as-you-Go
+**PRO users and Enterprise Hub organizations** can continue using the API after exhausting their monthly credits. This ensures uninterrupted access to models for production workloads.
 
-**PRO users and Enterprise Hub organizations** can continue using the API once their monthly included credits are exhausted. This billing model, known as "Pay-as-you-Go" (PAYG), is charged on top of the monthly subscription. PAYG is only available for providers that are integrated with our billing system. We're actively working to integrate all providers, but in the meantime, any providers that are not yet integrated will be blocked once the free-tier limit is reached.
+**Current Status**: Pay-as-you-Go is available for providers integrated with our billing system. We're actively integrating remaining providers—those not yet integrated will be blocked once free credits are exhausted.
 
 If you have remaining credits, we estimate costs for providers that aren’t fully integrated with our billing system. These estimates are usually higher than the actual cost to prevent abuse, which is why PAYG is currently disabled for those providers.
 
-You can track your spending on your [billing page](https://huggingface.co/settings/billing).
-
 <Tip>
 
 Hugging Face charges you the same rates as the provider, with no additional fees. We just pass through the provider costs directly.
 
 </Tip>
 
-## Routed requests vs direct calls
+You can track your spending anytime on your [billing page](https://huggingface.co/settings/billing).
+
+## Routed Requests vs Direct Calls (Detailed Comparison)
 
 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:
 
@@ -51,7 +82,7 @@ For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co
 
 The `"hf-inference"` provider is currently the default provider when working with the JavaScript and Python SDKs. Note that this default might change in the future.
 
-## Organization billing
+## Billing forEnterprise Hub organizations
 
 For Enterprise Hub organizations, it is possible to centralize billing for all of your users. Each user still uses their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as a header in your HTTP requests.
 

From ce0ce2d6c801bd9137abfd8278d0a341568a1ba6 Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Mon, 30 Jun 2025 14:41:01 +0200
Subject: [PATCH 2/6] Update docs/inference-providers/pricing.md

Co-authored-by: Simon Brandeis <33657802+SBrandeis@users.noreply.github.com>
---
 docs/inference-providers/pricing.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index bcc853371..918a32b62 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -44,7 +44,6 @@ See the [Organization Billing section](#organization-billing) below for more det
 
 **PRO users and Enterprise Hub organizations** can continue using the API after exhausting their monthly credits. This ensures uninterrupted access to models for production workloads.
 
-**Current Status**: Pay-as-you-Go is available for providers integrated with our billing system. We're actively integrating remaining providers—those not yet integrated will be blocked once free credits are exhausted.
 
 If you have remaining credits, we estimate costs for providers that aren’t fully integrated with our billing system. These estimates are usually higher than the actual cost to prevent abuse, which is why PAYG is currently disabled for those providers.
 

From 1cfa117d1e6ca72bcecc529a4fa4aa1890593548 Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Mon, 30 Jun 2025 14:44:01 +0200
Subject: [PATCH 3/6] remove mention of 'direct calls'

---
 docs/inference-providers/pricing.md | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index bcc853371..0590fa544 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -22,20 +22,19 @@ Your monthly credits automatically apply when you route requests through Hugging
 
 Inference Providers offers flexibility in how you're billed. Understanding these options upfront helps you choose the best approach for your needs:
 
-| Feature | **Routed by Hugging Face** | **Custom Provider Key** | **Direct Calls** |
-| :--- | :--- | :--- | :--- |
-| **How it Works** | Your request routes through HF to the provider | You set a custom provider key in HF settings | You provide the provider key directly in your code |
-| **Billing** | Pay-as-you-go on your HF account | Billed directly by the provider | Billed directly by the provider |
-| **Monthly Credits** | **✅ Yes** - Credits apply to eligible providers | **❌ No** - Credits don't apply | **❌ No** - Credits don't apply |
-| **Provider Account Needed** | **❌ No** - We handle everything | **✅ Yes** - You need provider accounts | **✅ Yes** - You need provider accounts |
-| **Best For** | Simplicity, experimentation, consolidated billing | More billing control, using non-integrated providers | Full control, bypassing HF routing |
-| **Integration** | SDKs, Playground, widgets, Data AI Studio | SDKs, Playground, widgets, Data AI Studio | SDKs only |
+| Feature | **Routed by Hugging Face** | **Custom Provider Key** |
+| :--- | :--- | :--- |
+| **How it Works** | Your request routes through HF to the provider | You set a custom provider key in HF settings |
+| **Billing** | Pay-as-you-go on your HF account | Billed directly by the provider |
+| **Monthly Credits** | **✅ Yes** - Credits apply to eligible providers | **❌ No** - Credits don't apply |
+| **Provider Account Needed** | **❌ No** - We handle everything | **✅ Yes** - You need provider accounts |
+| **Best For** | Simplicity, experimentation, consolidated billing | More billing control, using non-integrated providers |
+| **Integration** | SDKs, Playground, widgets, Data AI Studio | SDKs, Playground, widgets, Data AI Studio |
 
 ### Which Option Should I Choose?
 
 - **Start with Routed by Hugging Face** if you want simplicity and to use your monthly credits
 - **Use Custom Provider Key** if you need specific provider features or you're consistently using the same provider
-- **Use Direct Calls** if you want to bypass Hugging Face routing entirely
 
 ## Pay-as-you-Go Details
 
@@ -56,7 +55,7 @@ Hugging Face charges you the same rates as the provider, with no additional fees
 
 You can track your spending anytime on your [billing page](https://huggingface.co/settings/billing).
 
-## Routed Requests vs Direct Calls (Detailed Comparison)
+## Hugging Face Billing vs Custom Provider Key (Detailed Comparison)
 
 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:
 
@@ -64,15 +63,12 @@ The documentation above assumes you are making routed requests to external provi
 
 - **Routed Request with Custom Key**: In your [settings page](https://huggingface.co/settings/inference-providers) on the Hub, you can configure a custom key for each provider. To use this option, you'll need to create an account on the provider's platform, and billing will be handled directly by that provider. Hugging Face won't charge you for the call. This method gives you more control over billing when experimenting with models on the Hub. When making a routed request with a custom key, your code remains unchanged—you'll still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.
 
-- **Direct Calls**: If you provide a custom key when using the JavaScript or Python `InferenceClient`, the call will be made directly to the provider's platform. Billing is managed by the provider, and Hugging Face is not notified of the request. This option is ideal if you want to use the unified `InferenceClient` interface without routing through Hugging Face.
-
 Here is a table that sums up what we've seen so far:
 
 |                                    | HF routing | Billed by    | Free-tier included | Pay-as-you-go                                   | Integration                               |
 | ---------------------------------- | ---------- | ------------ | ------------------ | ----------------------------------------------- | ----------------------------------------- |
 | **Routed request**                 | Yes        | Hugging Face | Yes                | Only for PRO users and for integrated providers | SDKs, Playground, widgets, Data AI Studio |
 | **Routed request with custom key** | Yes        | Provider     | No                 | Yes                                             | SDKs, Playground, widgets, Data AI Studio |
-| **Direct call**                    | No         | Provider     | No                 | Yes                                             | SDKs only                                 |
 
 ## HF-Inference cost
 
@@ -82,7 +78,7 @@ For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co
 
 The `"hf-inference"` provider is currently the default provider when working with the JavaScript and Python SDKs. Note that this default might change in the future.
 
-## Billing forEnterprise Hub organizations
+## Billing for Enterprise Hub organizations
 
 For Enterprise Hub organizations, it is possible to centralize billing for all of your users. Each user still uses their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as a header in your HTTP requests.
 

From 5d37dc60f55a2e9cbe71c0618a6895a5492371e7 Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Mon, 30 Jun 2025 14:45:27 +0200
Subject: [PATCH 4/6] improving price table

Co-authored-by: SBrandeis <simon@huggingface.co>
---
 docs/inference-providers/pricing.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index 1913dc01e..0483859bd 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -6,11 +6,11 @@ Access 200+ models from leading AI inference providers with centralized, transpa
 
 Every Hugging Face user receives monthly credits to experiment with Inference Providers:
 
-| Account Type                 | Monthly Credits | Value    |
+| Account Type                 | Monthly Credits | Extra usage (pay-as-you-go)    |
 | ---------------------------- | --------------- | -------- |
-| Free Users                   | Limited credits | ~$0.10   |
-| PRO Users                    | Full access     | **$2.00** |
-| Enterprise Hub Organizations | Per-seat access | **$2.00 per seat** |
+| Free Users                   | $0.10, subject to change | no   |
+| PRO Users                    | $2.00     | yes |
+| Enterprise Hub Organizations | $2.00 per seat | yes |
 
 <Tip>
 

From d7d2e8849ed0ebd90c09064a9ce5c3f3ea1d456e Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Mon, 30 Jun 2025 14:55:53 +0200
Subject: [PATCH 5/6] attempt to improve key explanation

---
 docs/inference-providers/pricing.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index 0483859bd..e5c855950 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -58,16 +58,22 @@ You can track your spending anytime on your [billing page](https://huggingface.c
 
 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:
 
-- **Routed Request**: This is the default method for using Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.
+- **Hugging Face Routed Requests**: This is the default method for using Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.
 
-- **Routed Request with Custom Key**: In your [settings page](https://huggingface.co/settings/inference-providers) on the Hub, you can configure a custom key for each provider. To use this option, you'll need to create an account on the provider's platform, and billing will be handled directly by that provider. Hugging Face won't charge you for the call. This method gives you more control over billing when experimenting with models on the Hub. When making a routed request with a custom key, your code remains unchanged—you'll still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.
+- **Custom Provider Key**: You can bring your own provider key to use with the Inference Providers. This is useful if you already have an account with a provider and you want to use it with the Inference Providers. Hugging Face won't charge you for the call. 
 
 Here is a table that sums up what we've seen so far:
 
 |                                    | HF routing | Billed by    | Free-tier included | Pay-as-you-go                                   | Integration                               |
 | ---------------------------------- | ---------- | ------------ | ------------------ | ----------------------------------------------- | ----------------------------------------- |
-| **Routed request**                 | Yes        | Hugging Face | Yes                | Only for PRO users and for integrated providers | SDKs, Playground, widgets, Data AI Studio |
-| **Routed request with custom key** | Yes        | Provider     | No                 | Yes                                             | SDKs, Playground, widgets, Data AI Studio |
+| **Routed Requests**                 | Yes        | Hugging Face | Yes                | Only for PRO users and for integrated providers | SDKs, Playground, widgets, Data AI Studio |
+| **Custom Provider Key** | Yes        | Provider     | No                 | Yes                                             | SDKs, Playground, widgets, Data AI Studio |
+
+<Tip>
+
+You can set your custom provider key in the [settings page](https://huggingface.co/settings/inference-providers) on the Hub, or in the `InferenceClient` when using the JavaScript or Python SDKs. When making a routed request with a custom key, your code remains unchanged—you can still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.
+
+</Tip>
 
 ## HF-Inference cost
 

From e538e229f3d63d53831e0245decc87c6ece7e9c7 Mon Sep 17 00:00:00 2001
From: burtenshaw <ben.burtenshaw@gmail.com>
Date: Tue, 1 Jul 2025 14:28:26 +0200
Subject: [PATCH 6/6] Update docs/inference-providers/pricing.md

Co-authored-by: vb <vaibhavs10@gmail.com>
---
 docs/inference-providers/pricing.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/inference-providers/pricing.md b/docs/inference-providers/pricing.md
index e5c855950..ab06bf5a7 100644
--- a/docs/inference-providers/pricing.md
+++ b/docs/inference-providers/pricing.md
@@ -83,7 +83,7 @@ For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co
 
 The `"hf-inference"` provider is currently the default provider when working with the JavaScript and Python SDKs. Note that this default might change in the future.
 
-## Billing for Enterprise Hub organizations
+## Billing for Team and Enterprise organizations
 
 For Enterprise Hub organizations, it is possible to centralize billing for all of your users. Each user still uses their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as a header in your HTTP requests.