Please refer to the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers) for detailed information.

## What is HF-Inference API?

HF-Inference API is one of the many providers available on the Hugging Face Hub.
It is deployed by Hugging Face itself, using [text-generation-inference](https://github.com/huggingface/text-generation-inference) to serve LLMs, for instance. This service used to be called “Inference API (serverless)” prior to the launch of Inference Providers.

For more details about the HF-Inference API, check out its [dedicated page](https://huggingface.co/docs/inference-providers/providers/hf-inference).
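
As a quick illustration, here is a minimal sketch of calling a model through the HF-Inference provider from Python. It assumes a recent `huggingface_hub` release that supports the `provider` argument; the model name is just an example:

```python
from huggingface_hub import InferenceClient

# Route the request through the HF-Inference provider explicitly.
# The client picks up your token from the HF_TOKEN environment variable.
client = InferenceClient(provider="hf-inference")

# Illustrative model; any text-generation model served by HF-Inference works.
output = client.text_generation(
    "The Hugging Face Hub is",
    model="HuggingFaceH4/zephyr-7b-beta",
    max_new_tokens=30,
)
print(output)
```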

## What technology do you use to power the HF-Inference API?

The HF-Inference API is powered by [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) under the hood.

## Why don't I see an inference widget, or why can't I use the API?

Some tasks may not be supported by any Inference Provider, in which case there is no widget.

## How can I see my usage?

To check your usage across all providers, check out your [billing page](https://huggingface.co/settings/billing).

To check your HF-Inference usage specifically, check out the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage.

## Is there programmatic access to Inference Providers?

Yes! We provide client wrappers in both JS and Python:
- [JS (`@huggingface/inference`)](https://huggingface.co/docs/huggingface.js/inference/classes/InferenceClient)
- [Python (`huggingface_hub`)](https://huggingface.co/docs/huggingface_hub/guides/inference)
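
For example, here is a minimal chat-completion sketch using the Python client; the model name is illustrative, and the token is read from the `HF_TOKEN` environment variable:

```python
from huggingface_hub import InferenceClient

# The client reads HF_TOKEN from the environment by default;
# you can also pass token="hf_..." explicitly.
client = InferenceClient()

# Illustrative model; any chat model available through a provider works.
response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```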