docs/inference-providers/pricing.md
Lines changed: 10 additions & 14 deletions
@@ -22,20 +22,19 @@ Your monthly credits automatically apply when you route requests through Hugging
 Inference Providers offers flexibility in how you're billed. Understanding these options upfront helps you choose the best approach for your needs:

-| Feature | **Routed by Hugging Face** | **Custom Provider Key** | **Direct Calls** |
-| :--- | :--- | :--- | :--- |
-| **How it Works** | Your request routes through HF to the provider | You set a custom provider key in HF settings | You provide the provider key directly in your code |
-| **Billing** | Pay-as-you-go on your HF account | Billed directly by the provider | Billed directly by the provider |
+| **Provider Account Needed** | **❌ No** - We handle everything | **✅ Yes** - You need provider accounts |
+| **Best For** | Simplicity, experimentation, consolidated billing | More billing control, using non-integrated providers |
+| **Integration** | SDKs, Playground, widgets, Data AI Studio | SDKs, Playground, widgets, Data AI Studio |

 ### Which Option Should I Choose?

 - **Start with Routed by Hugging Face** if you want simplicity and to use your monthly credits
 - **Use Custom Provider Key** if you need specific provider features or you're consistently using the same provider
-- **Use Direct Calls** if you want to bypass Hugging Face routing entirely

 ## Pay-as-you-Go Details
@@ -56,23 +55,20 @@ Hugging Face charges you the same rates as the provider, with no additional fees
 You can track your spending anytime on your [billing page](https://huggingface.co/settings/billing).

-## Routed Requests vs Direct Calls (Detailed Comparison)
+## Hugging Face Billing vs Custom Provider Key (Detailed Comparison)

 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:

 - **Routed Request**: This is the default method for using Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.

 - **Routed Request with Custom Key**: In your [settings page](https://huggingface.co/settings/inference-providers) on the Hub, you can configure a custom key for each provider. To use this option, you'll need to create an account on the provider's platform, and billing will be handled directly by that provider. Hugging Face won't charge you for the call. This method gives you more control over billing when experimenting with models on the Hub. When making a routed request with a custom key, your code remains unchanged—you'll still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.

-- **Direct Calls**: If you provide a custom key when using the JavaScript or Python `InferenceClient`, the call will be made directly to the provider's platform. Billing is managed by the provider, and Hugging Face is not notified of the request. This option is ideal if you want to use the unified `InferenceClient` interface without routing through Hugging Face.
-
 Here is a table that sums up what we've seen so far:

 |  | HF routing | Billed by | Free-tier included | Pay-as-you-go | Integration |
 | --- | --- | --- | --- | --- | --- |
 | **Routed request** | Yes | Hugging Face | Yes | Only for PRO users and for integrated providers | SDKs, Playground, widgets, Data AI Studio |
 | **Routed request with custom key** | Yes | Provider | No | Yes | SDKs, Playground, widgets, Data AI Studio |
-| **Direct call** | No | Provider | No | Yes | SDKs only |

 ## HF-Inference cost
@@ -82,7 +78,7 @@ For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co
 The `"hf-inference"` provider is currently the default provider when working with the JavaScript and Python SDKs. Note that this default might change in the future.

-## Billing forEnterprise Hub organizations
+## Billing for Enterprise Hub organizations

 For Enterprise Hub organizations, it is possible to centralize billing for all of your users. Each user still uses their own User Access Token but the requests are billed to your organization. This can be done by passing `"X-HF-Bill-To: my-org-name"` as a header in your HTTP requests.
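As an illustration of the organization billing header described above, here is a minimal Python sketch that only builds the headers for a raw HTTP request and sends nothing. The helper name `build_headers` and the placeholder token are hypothetical, and the `Authorization: Bearer` scheme for the User Access Token is an assumption not stated in this file. Only the `X-HF-Bill-To` header name and the `my-org-name` value come from the text.

```python
from typing import Dict, Optional


def build_headers(hf_token: str, bill_to: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP headers for a routed request (hypothetical helper).

    hf_token: the user's own Hugging Face User Access Token.
    bill_to:  optional organization name to bill the request to.
    """
    # Assumption: the token is sent as a standard Bearer credential.
    headers = {"Authorization": f"Bearer {hf_token}"}
    if bill_to:
        # Header name taken from the documentation text above.
        headers["X-HF-Bill-To"] = bill_to
    return headers


# Each user keeps their own token; only the billing target changes.
org_headers = build_headers("hf_example_token", bill_to="my-org-name")
personal_headers = build_headers("hf_example_token")
```

The resulting dictionary can be passed to whichever HTTP client you already use. Omitting `bill_to` leaves the request billed to the user's own account.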