
Commit a1a1f6f (1 parent: c8c370b)

    mention the current focus of hf-inference

2 files changed (+3, -0 lines)

docs/inference-providers/pricing.md (1 addition, 0 deletions)

@@ -79,6 +79,7 @@ As you may have noticed, you can select to work with `"hf-inference"` provider.
 
 For instance, a request to [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) that takes 10 seconds to complete on a GPU machine that costs $0.00012 per second to run, will be billed $0.0012.
 
+As of July 2025, hf-inference focuses mostly on CPU inference (e.g. embedding, text-ranking, text-classification, or smaller LLMs that have historical importance like BERT or GPT-2).
 
 ## Billing for Team and Enterprise organizations
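The pricing rule quoted in this hunk is a simple product of request duration and per-second hardware cost. A minimal sketch of that arithmetic, assuming a hypothetical helper name (`estimate_cost` is not part of any Hugging Face API):

```python
def estimate_cost(duration_seconds: float, price_per_second: float) -> float:
    """Billed amount for one request: duration x per-second hardware price.

    Illustrative only; mirrors the example in pricing.md, not a real API.
    """
    return duration_seconds * price_per_second


# Example from the doc: 10 s on a GPU priced at $0.00012/s is billed about $0.0012.
cost = estimate_cost(10, 0.00012)
print(f"${cost:.4f}")
```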

scripts/inference-providers/templates/providers/hf-inference.handlebars (2 additions, 0 deletions)

@@ -13,4 +13,6 @@ All supported HF Inference models can be found [here](https://huggingface.co/mod
 HF Inference is the serverless Inference API powered by Hugging Face. This service used to be called "Inference API (serverless)" prior to Inference Providers.
 If you are interested in deploying models to a dedicated and autoscaling infrastructure managed by Hugging Face, check out [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) instead.
 
+As of July 2025, hf-inference focuses mostly on CPU inference (e.g. embedding, text-ranking, text-classification, or smaller LLMs that have historical importance like BERT or GPT-2).
+
 {{{tasksSection}}}
