Commit b18eddd

rename Inference Provider API to Inference Providers
1 parent 4e6ccd4 commit b18eddd

File tree

4 files changed: +10 −10 lines changed

docs/api-inference/_toctree.yml

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 - title: Get Started
   sections:
     - local: index
-      title: Inference Providers API
+      title: Inference Providers
     - local: pricing
       title: Pricing and Billing
     - local: hub-integration

docs/api-inference/hub-integration.md

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 # Hub Integration
 
-The Inference Providers API is tightly integrated with the Hugging Face Hub. No matter in which service you use it, the usage and billing will be centralized on your Hugging Face account.
+The Inference Providers is tightly integrated with the Hugging Face Hub. No matter in which service you use it, the usage and billing will be centralized on your Hugging Face account.
 
 ## Model search
 
@@ -20,7 +20,7 @@ It is also possible to select multiple providers or even all of them to filter a
 
 ## Features using Inference Providers
 
-Several Hugging Face features utilize the Inference Providers API and count towards your monthly credits. The included monthly credits for PRO and Enterprise should cover moderate usage of these features for most users.
+Several Hugging Face features utilize the Inference Providers and count towards your monthly credits. The included monthly credits for PRO and Enterprise should cover moderate usage of these features for most users.
 
 - [Inference Widgets](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324): Interactive widgets available on model pages. This is the entry point to quickly test a model on the Hub.

docs/api-inference/index.md

Lines changed: 5 additions & 5 deletions

@@ -1,8 +1,8 @@
-# Inference Providers API
+# Inference Providers
 
-The Hugging Face Inference Providers API revolutionizes how developers access and run machine learning models by offering a unified, flexible interface to multiple serverless inference providers. This new approach extends our previous Serverless Inference API, providing more models, increased performances and better reliability thanks to our awesome partners.
+The Hugging Face Inference Providers revolutionizes how developers access and run machine learning models by offering a unified, flexible interface to multiple serverless inference providers. This new approach extends our previous Serverless Inference API, providing more models, increased performances and better reliability thanks to our awesome partners.
 
-To learn more about the launch of the Inference Providers API, check out our [announcement blog post](https://huggingface.co/blog/inference-providers).
+To learn more about the launch of the Inference Providers, check out our [announcement blog post](https://huggingface.co/blog/inference-providers).
 
 ## Why use the Inference Provider API?
 
@@ -33,13 +33,13 @@ To get started quickly with [Chat Completion models](http://huggingface.co/model
 
 ## Get Started
 
-You can call the Inference Providers API with your preferred tools, such as Python, JavaScript, or cURL. To simplify integration, we offer both a Python SDK (`huggingface_hub`) and a JavaScript SDK (`huggingface.js`).
+You can call the Inference Providers with your preferred tools, such as Python, JavaScript, or cURL. To simplify integration, we offer both a Python SDK (`huggingface_hub`) and a JavaScript SDK (`huggingface.js`).
 
 In this section, we will demonstrate a simple example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a conversational Large Language Model. For the example, we will use [Novita AI](https://novita.ai/) as Inference Provider with routed requests. You will learn what that means in the next chapters.
 
 ### Authentication
 
-The Inference Providers API requires passing a user token in the request headers. You can generate a token by signing up on the Hugging Face website and going to the [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). We recommend creating a `fine-grained` token with the scope to `Make calls to Inference Providers`.
+The Inference Providers requires passing a user token in the request headers. You can generate a token by signing up on the Hugging Face website and going to the [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). We recommend creating a `fine-grained` token with the scope to `Make calls to Inference Providers`.
 
 For more details about user tokens, check out [this guide](https://huggingface.co/docs/hub/en/security-tokens).
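The Get Started and Authentication steps described in the diff above can be sketched with the `huggingface_hub` Python SDK. A minimal sketch, not part of the commit: it assumes a fine-grained token is exported as `HF_TOKEN` and that Novita AI serves `deepseek-ai/DeepSeek-V3-0324` as a routed provider; the network call is skipped when no token is set.

```python
import os

# Fine-grained token with the "Make calls to Inference Providers" scope,
# read from the environment rather than hard-coded.
token = os.environ.get("HF_TOKEN")

# Conversation payload in the chat-completion format the SDK expects.
messages = [{"role": "user", "content": "What is the capital of France?"}]

if token:
    # Imported here so the payload above can be inspected without the SDK.
    from huggingface_hub import InferenceClient

    # Routed request: Hugging Face forwards the call to Novita AI and
    # centralizes usage and billing on your Hugging Face account.
    client = InferenceClient(provider="novita", api_key=token)
    completion = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V3-0324",
        messages=messages,
    )
    print(completion.choices[0].message.content)
else:
    print("Set HF_TOKEN to run this example.")
```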

docs/api-inference/pricing.md

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 # Pricing and Billing
 
-The Inference Providers API is a production-ready service involving external partners and is therefore a paid-product. However, as an Hugging Face user you get monthly credits to run experiments. The amount of credits you get depends on your type of account:
+The Inference Providers is a production-ready service involving external partners and is therefore a paid-product. However, as an Hugging Face user you get monthly credits to run experiments. The amount of credits you get depends on your type of account:
 
 | User Tier | Included monthly credits |
 | ------------------------ | ---------------------------------- |
@@ -25,7 +25,7 @@ Hugging Face charges you the same rates as the provider, with no additional fees
 
 The documentation above assumes you are making routed requests to external providers. In practice, there are 3 different ways to run inference, each with unique billing implications:
 
-- **Routed Request**: This is the default method for using the Inference Providers API. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.
+- **Routed Request**: This is the default method for using the Inference Providers. Simply use the JavaScript or Python `InferenceClient`, or make raw HTTP requests with your Hugging Face User Access Token. Your request is automatically routed through Hugging Face to the provider's platform. No separate provider account is required, and billing is managed directly by Hugging Face. This approach lets you seamlessly switch between providers without additional setup.
 
 - **Routed Request with Custom Key**: In your [settings page](https://huggingface.co/settings/inference-providers) on the Hub, you can configure a custom key for each provider. To use this option, you'll need to create an account on the provider's platform, and billing will be handled directly by that provider. Hugging Face won't charge you for the call. This method gives you more control over billing when experimenting with models on the Hub. When making a routed request with a custom key, your code remains unchanged—you'll still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.
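The billing modes described in the diff above differ in who is charged, not in the calling code. A minimal sketch of that point, not part of the commit: it assumes the `huggingface_hub` SDK and two illustrative provider names; the same token and code issue a routed request to either provider, and if a custom key is configured for a provider on the Hub, Hugging Face swaps the authentication server-side.

```python
import os

token = os.environ.get("HF_TOKEN")

# Switching providers on a routed request is a one-word change; the
# provider names here are illustrative examples, not an exhaustive list.
providers = ["novita", "together"]

if token:
    from huggingface_hub import InferenceClient

    for provider in providers:
        # The Hugging Face token is passed in both cases; with a custom
        # key configured, the provider bills you directly instead.
        client = InferenceClient(provider=provider, api_key=token)
        completion = client.chat.completions.create(
            model="deepseek-ai/DeepSeek-V3-0324",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(provider, "->", completion.choices[0].message.content)
else:
    print("Set HF_TOKEN to compare providers.")
```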
