
Commit cc6522c

hanouticelina authored and mintyleaf committed
[Docs] Remove Inference API references in docs (huggingface#3197)
* remove inference api references
* better
* better wording
1 parent 5841172 commit cc6522c

File tree

8 files changed, +13 -17 lines changed


README.md

Lines changed: 0 additions & 1 deletion
@@ -146,7 +146,6 @@ The advantages are:
 
 - Free model or dataset hosting for libraries and their users.
 - Built-in file versioning, even with very large files, thanks to a git-based approach.
-- Serverless inference API for all models publicly available.
 - In-browser widgets to play with the uploaded models.
 - Anyone can upload a new model for your library, they just need to add the corresponding tag for the model to be discoverable.
 - Fast downloads! We use Cloudfront (a CDN) to geo-replicate downloads so they're blazing fast from anywhere on the globe.

docs/source/de/guides/integrations.md

Lines changed: 1 addition & 2 deletions
@@ -11,8 +11,7 @@ There are four main ways to integrate a library with the Hub:
 This includes the model weights as well as [the model card](https://huggingface.co/docs/huggingface_hub/how-to-model-cards) and any other relevant information or data required to run the model (for example, training logs). This method is often called `push_to_hub()`.
 2. **Download from Hub**: implement a method to load a model from the Hub.
 The method should download the model configuration/weights and load the model. This method is often called `from_pretrained` or `load_from_hub()`.
-3. **Inference API**: use our servers to run inference on models supported by your library for free.
-4. **Widgets**: display a widget on the landing page of your models on the Hub.
+3. **Widgets**: display a widget on the landing page of your models on the Hub.
 This lets users quickly try a model directly from the browser.
 
 In this guide, we focus on the first two topics. We present the two main approaches you can use to integrate a library, with their advantages and drawbacks. Everything is summarized at the end of the guide to help you choose between the two. Please note that these are only guidelines, which you can adapt to your requirements.

docs/source/en/guides/integrations.md

Lines changed: 1 addition & 2 deletions
@@ -15,8 +15,7 @@ There are four main ways to integrate a library with the Hub:
 or data necessary to run the model (for example, training logs). This method is often called `push_to_hub()`.
 2. **Download from Hub:** implement a method to load a model from the Hub. The method should download the model
 configuration/weights and load the model. This method is often called `from_pretrained` or `load_from_hub()`.
-3. **Inference API:** use our servers to run inference on models supported by your library for free.
-4. **Widgets:** display a widget on the landing page of your models on the Hub. It allows users to quickly try a model
+3. **Widgets:** display a widget on the landing page of your models on the Hub. It allows users to quickly try a model
 from the browser.
 
 In this guide, we will focus on the first two topics. We will present the two main approaches you can use to integrate
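
For context, the two remaining approaches map directly onto `huggingface_hub` utilities. Below is a minimal sketch, not part of this commit, assuming a hypothetical `MyModel` class with its own `save`/`load` methods:

```python
import json

from huggingface_hub import create_repo, hf_hub_download, upload_file

def push_to_hub(model, repo_id: str):
    # Serialize locally, then upload the weights to a (possibly new) Hub repo.
    create_repo(repo_id, exist_ok=True)
    model.save("model.bin")  # hypothetical serialization method
    upload_file(path_or_fileobj="model.bin", path_in_repo="model.bin", repo_id=repo_id)

def from_pretrained(repo_id: str):
    # Download config and weights from the Hub, then instantiate the model.
    config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
    weights_path = hf_hub_download(repo_id=repo_id, filename="model.bin")
    with open(config_path) as f:
        config = json.load(f)
    return MyModel.load(weights_path, config)  # MyModel is a hypothetical class
```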

docs/source/en/guides/overview.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ Take a look at these guides to learn how to use huggingface_hub to solve real-wo
 <div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
 Inference
 </div><p class="text-gray-700">
-How to make predictions using the HF Inference API and other Inference Providers?
+How to make predictions using Hugging Face Inference Providers?
 </p>
 </a>

docs/source/en/index.md

Lines changed: 1 addition & 2 deletions
@@ -14,8 +14,7 @@ do all these things with Python.
 Read the [quick start guide](quick-start) to get up and running with the
 `huggingface_hub` library. You will learn how to download files from the Hub, create a
 repository, and upload files to the Hub. Keep reading to learn more about how to manage
-your repositories on the 🤗 Hub, how to interact in discussions or even how to access
-the Inference API.
+your repositories on the 🤗 Hub, how to interact in discussions or even how to run inference.
 
 <div class="mt-10">
 <div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
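
The quick-start workflow summarized here takes only a few calls. A short sketch, assuming you are already authenticated (e.g. via `huggingface-cli login`) and with `my-username/my-test-model` as a placeholder repo id:

```python
from huggingface_hub import create_repo, hf_hub_download, upload_file

# Download a single file from an existing public repo
config_path = hf_hub_download(repo_id="gpt2", filename="config.json")

# Create a repository under your namespace (placeholder id)
create_repo("my-username/my-test-model", exist_ok=True)

# Upload a local file into that repository
upload_file(
    path_or_fileobj=config_path,
    path_in_repo="config.json",
    repo_id="my-username/my-test-model",
)
```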

docs/source/en/package_reference/inference_client.md

Lines changed: 6 additions & 5 deletions
@@ -4,11 +4,12 @@ rendered properly in your Markdown viewer.
 
 # Inference
 
-Inference is the process of using a trained model to make predictions on new data. Because this process can be compute-intensive, running on a dedicated or external service can be an interesting option.
-The `huggingface_hub` library provides a unified interface to run inference across multiple services for models hosted on the Hugging Face Hub:
-1. [Inference API](https://huggingface.co/docs/api-inference/index): a serverless solution that allows you to run accelerated inference on Hugging Face's infrastructure for free. This service is a fast way to get started, test different models, and prototype AI products.
-2. Third-party providers: various serverless solutions provided by external providers (Together, Sambanova, etc.). These providers offer production-ready APIs on a pay-as-you-go model. This is the fastest way to integrate AI in your products with a maintenance-free and scalable solution. Refer to the [Supported providers and tasks](../guides/inference#supported-providers-and-tasks) section for a list of supported providers.
-3. [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index): a product to easily deploy models to production. Inference is run by Hugging Face in a dedicated, fully managed infrastructure on a cloud provider of your choice.
+Inference is the process of using a trained model to make predictions on new data. Because this process can be compute-intensive, running on a dedicated or external service can be an interesting option.
+The `huggingface_hub` library provides a unified interface to run inference across multiple services for models hosted on the Hugging Face Hub:
+
+1. [Inference Providers](https://huggingface.co/docs/inference-providers/index): streamlined, unified access to hundreds of machine learning models, powered by our serverless inference partners. This new approach builds on our previous Serverless Inference API, offering more models, improved performance, and greater reliability thanks to world-class providers. Refer to the [documentation](https://huggingface.co/docs/inference-providers/index#partners) for a list of supported providers.
+2. [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index): a product to easily deploy models to production. Inference is run by Hugging Face in a dedicated, fully managed infrastructure on a cloud provider of your choice.
+3. Local endpoints: you can also run inference with local inference servers like [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://ollama.com/), [vLLM](https://github.com/vllm-project/vllm), [LiteLLM](https://docs.litellm.ai/docs/simple_proxy), or [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) by connecting the client to these local endpoints.
 
 These services can be called with the [`InferenceClient`] object. Please refer to [this guide](../guides/inference)
 for more information on how to use it.
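
To make the unified interface concrete, here is a brief sketch of calling [`InferenceClient`] against a serverless provider and against a local server; the model id, provider name, and port are illustrative only:

```python
from huggingface_hub import InferenceClient

# Inference Providers: route the request through a serverless partner
client = InferenceClient(provider="together")
response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is inference?"}],
    max_tokens=100,
)
print(response.choices[0].message.content)

# Local endpoint: point the same client at a local TGI/vLLM/llama.cpp server
local_client = InferenceClient(base_url="http://localhost:8080")
```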

docs/source/en/quick-start.md

Lines changed: 1 addition & 1 deletion
@@ -197,4 +197,4 @@ Hub, we recommend reading our [how-to guides](./guides/overview) to:
 - [Download](./guides/download) files from the Hub.
 - [Upload](./guides/upload) files to the Hub.
 - [Search the Hub](./guides/search) for your desired model or dataset.
-- [Access the Inference API](./guides/inference) for fast inference.
+- [Run Inference](./guides/inference) across multiple services for models hosted on the Hugging Face Hub.
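
The search guide linked above relies on the `HfApi` listing endpoints. A quick sketch, where the filter values are just one common combination:

```python
from huggingface_hub import HfApi

api = HfApi()
# List the five most-downloaded text-classification models
for model in api.list_models(filter="text-classification", sort="downloads", limit=5):
    print(model.id)
```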

docs/source/fr/guides/integrations.md

Lines changed: 2 additions & 3 deletions
@@ -10,12 +10,11 @@ Des [dizaines de librairies](https://huggingface.co/docs/hub/models-libraries) s
 There are four main ways to integrate a library with the Hub:
 1. **Push to Hub** implements a method to upload a model to the Hub. This includes the model parameters, its descriptive card (called a [Model Card](https://huggingface.co/docs/huggingface_hub/how-to-model-cards)) and any other relevant information related to the model (for example, training logs). This method is often called `push_to_hub()`.
 2. **Download from Hub** implements a method to load a model from the Hub. The method should download the model's configuration and weights, then instantiate the model. This method is often called `from_pretrained` or `load_from_hub()`.
-3. **Inference API** uses our servers to run inference for free on models supported by your library.
-4. **Widgets** displays a widget on your model's landing page on the Hub. Widgets let users quickly test a model from the browser.
+3. **Widgets** displays a widget on your model's landing page on the Hub. Widgets let users quickly test a model from the browser.
 
 In this guide, we will focus on the first two topics. We will present the two main approaches you can use to integrate a library, with their advantages and drawbacks. Everything is summarized at the end of the guide to help you choose between the two. Please keep in mind that these are only guidelines, which you are free to adapt to your use case.
 
-If the Inference API and Widgets interest you, you can follow [this guide](https://huggingface.co/docs/hub/models-adding-libraries#set-up-the-inference-api). In both cases, you can contact us if you are integrating a library with the Hub and want to be listed [in the official documentation](https://huggingface.co/docs/hub/models-libraries).
+If Widgets interest you, you can follow [this guide](https://huggingface.co/docs/hub/models-adding-libraries#set-up-the-inference-api). In both cases, you can contact us if you are integrating a library with the Hub and want to be listed [in the official documentation](https://huggingface.co/docs/hub/models-libraries).
 
 ## A flexible approach: helpers