21 changes: 6 additions & 15 deletions docs/inference-providers/guides/building-first-app.md
@@ -55,11 +55,8 @@ export HF_TOKEN="your_token_here"
const HF_TOKEN = process.env.HF_TOKEN;
```

<Tip warning={true}>

When we deploy our app to Hugging Face Spaces, we'll need to add our token as a secret. This is a secure way to handle the token and avoid exposing it in the code.

</Tip>
> [!WARNING]
> When we deploy our app to Hugging Face Spaces, we'll need to add our token as a secret. This is a secure way to handle the token and avoid exposing it in the code.

</hfoption>
</hfoptions>
@@ -179,11 +176,8 @@ We'll also need to implement the `transcribe` and `summarize` functions.

Now let's implement the transcription using OpenAI's `whisper-large-v3` model for fast, reliable speech processing.

<Tip>

We'll use the `auto` provider to automatically select the first available provider for the model. You can define your own priority list of providers in the [Inference Providers](https://huggingface.co/settings/inference-providers) page.

</Tip>
> [!TIP]
> We'll use the `auto` provider to automatically select the first available provider for the model. You can define your own priority list of providers in the [Inference Providers](https://huggingface.co/settings/inference-providers) page.

```python
def transcribe_audio(audio_file_path):
@@ -205,11 +199,8 @@ def transcribe_audio(audio_file_path):

Now let's implement the transcription using OpenAI's `whisper-large-v3` model for fast, reliable speech processing.

<Tip>

We'll use the `auto` provider to automatically select the first available provider for the model. You can define your own priority list of providers in the [Inference Providers](https://huggingface.co/settings/inference-providers) page.

</Tip>
> [!TIP]
> We'll use the `auto` provider to automatically select the first available provider for the model. You can define your own priority list of providers in the [Inference Providers](https://huggingface.co/settings/inference-providers) page.

```javascript
import { InferenceClient } from 'https://esm.sh/@huggingface/inference';
21 changes: 6 additions & 15 deletions docs/inference-providers/guides/first-api-call.md
@@ -11,11 +11,8 @@ Many developers avoid using open source AI models because they assume deployment

We're going to use the [FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) model, which is a powerful text-to-image model.

<Tip>

This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

</Tip>
> [!TIP]
> This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

## Step 1: Find a Model on the Hub

@@ -39,11 +36,8 @@ Here, you can test the model directly in the browser from any of the available p

This widget uses the same endpoint you're about to implement in code.

<Tip warning={true}>

You'll need a Hugging Face account (free at [huggingface.co](https://huggingface.co)) and remaining credits to use the model.

</Tip>
> [!WARNING]
> You'll need a Hugging Face account (free at [huggingface.co](https://huggingface.co)) and remaining credits to use the model.

## Step 3: From Clicks to Code

@@ -59,11 +53,8 @@ Set your token as an environment variable:
export HF_TOKEN="your_token_here"
```

<Tip>

You can add this line to your `.bash_profile` or a similar file so the token is automatically available in all your terminal sessions.

</Tip>
> [!TIP]
> You can add this line to your `.bash_profile` or a similar file so the token is automatically available in all your terminal sessions.

The Python or TypeScript code snippet will use the token from the environment variable.
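
For reference, here is a minimal Python sketch of what such a snippet can look like with `huggingface_hub` (the prompt and output filename are illustrative):

```python
import os
from huggingface_hub import InferenceClient

# Authenticates with the HF_TOKEN environment variable set above
client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"])

# Generate an image with FLUX.1-schnell and save it locally
image = client.text_to_image(
    "Astronaut riding a horse on the moon",
    model="black-forest-labs/FLUX.1-schnell",
)
image.save("output.png")
```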

49 changes: 14 additions & 35 deletions docs/inference-providers/guides/function-calling.md
@@ -4,11 +4,8 @@ Function calling enables language models to interact with external tools and API

When you provide a language model that has been fine-tuned to use tools with function descriptions, it can decide when to call these functions based on user requests, execute them, and incorporate the results into natural language responses. For example, you can build an assistant that fetches real-time weather data to provide accurate responses.

<Tip>

This guide assumes you have a Hugging Face account and access token. You can create a free account at [huggingface.co](https://huggingface.co) and get your token from your [settings page](https://huggingface.co/settings/tokens).

</Tip>
> [!TIP]
> This guide assumes you have a Hugging Face account and access token. You can create a free account at [huggingface.co](https://huggingface.co) and get your token from your [settings page](https://huggingface.co/settings/tokens).

## Defining Functions
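
As a rough sketch, a weather-lookup function can be described to the model with a JSON schema like the following (the name and parameters here are illustrative):

```python
# A tool definition the model can choose to call. The schema follows the
# OpenAI-style function-calling format used throughout this guide.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country, e.g. 'Paris, France'",
                    }
                },
                "required": ["location"],
            },
        },
    }
]
```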

@@ -126,11 +123,8 @@ response = client.chat.completions.create(
response_message = response.choices[0].message
```

<Tip>

The `tool_choice` parameter is used to control when the model calls functions. In this case, we're using `auto`, which means the model will decide when to call functions (0 or more times). Below we'll expand on `tool_choice` and other parameters.

</Tip>
> [!TIP]
> The `tool_choice` parameter is used to control when the model calls functions. In this case, we're using `auto`, which means the model will decide when to call functions (0 or more times). Below we'll expand on `tool_choice` and other parameters.
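
As a sketch, a request with tools enabled looks roughly like this (the model name and messages are placeholders):

```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(api_key=os.environ["HF_TOKEN"])

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "What's the weather like in Paris?"}],
    tools=tools,          # the function definitions shown earlier
    tool_choice="auto",   # let the model decide whether to call a function
)
response_message = response.choices[0].message
```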

Next, we need to check the model response to see whether the model decided to call any functions. If it did, we need to execute the function and add the result to the conversation before sending the final response to the user.

@@ -172,11 +166,8 @@ else:

The workflow is straightforward: make an initial API call with your tools, check if the model wants to call functions, execute them if needed, add the results to the conversation, and get the final response for the user.
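
A condensed sketch of that loop, assuming a local `get_current_weather` Python function and the `tools` list defined earlier:

```python
import json

# Map tool names to the local Python implementations
available_functions = {"get_current_weather": get_current_weather}

if response_message.tool_calls:
    # Keep the assistant's tool-call message in the conversation history
    messages.append(response_message)
    for tool_call in response_message.tool_calls:
        function = available_functions[tool_call.function.name]
        args = json.loads(tool_call.function.arguments)
        result = function(**args)
        # Feed the function result back to the model
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    # Second request: the model turns the tool results into a final answer
    final_response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct",
        messages=messages,
    )
    print(final_response.choices[0].message.content)
else:
    print(response_message.content)
```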

<Tip warning={true}>

We have handled the case where the model wants to call a function and the function actually exists. However, models might try to call functions that don’t exist, so we need to account for that as well. We can also deal with this using `strict` mode, which we'll cover later.

</Tip>
> [!WARNING]
> We have handled the case where the model wants to call a function and the function actually exists. However, models might try to call functions that don’t exist, so we need to account for that as well. We can also deal with this using `strict` mode, which we'll cover later.

## Multiple Functions

@@ -341,11 +332,8 @@ client = InferenceClient(

By switching provider, you can see the model's response change because each provider uses a different configuration of the model.
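
For example, switching is a one-argument change (the provider names below are just two of the available options):

```python
import os
from huggingface_hub import InferenceClient

# Same request code, different provider: only the `provider` argument changes
client = InferenceClient(provider="groq", api_key=os.environ["HF_TOKEN"])
# client = InferenceClient(provider="together", api_key=os.environ["HF_TOKEN"])
```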

<Tip warning={true}>

Each inference provider has different capabilities and performance characteristics. You can find more information about each provider in the [Inference Providers](/inference-providers/index#partners) section.

</Tip>
> [!WARNING]
> Each inference provider has different capabilities and performance characteristics. You can find more information about each provider in the [Inference Providers](/inference-providers/index#partners) section.

### Tool Choice Options

@@ -402,11 +390,8 @@ Here, we're forcing the model to call the `get_current_weather` function, and no

<hfoption id="huggingface_hub">

<Tip warning={true}>

Currently, `huggingface_hub.InferenceClient` does not support `tool_choice` values that specify which function to call.

</Tip>
> [!WARNING]
> Currently, `huggingface_hub.InferenceClient` does not support `tool_choice` values that specify which function to call.

</hfoption>

@@ -440,11 +425,8 @@ tools = [

Strict mode ensures that function arguments match your schema exactly: no additional properties are allowed, all required parameters must be provided, and data types are strictly enforced.
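
A sketch of a strict tool definition (following the OpenAI-style schema; the field values are illustrative):

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a given location",
            "strict": True,  # enforce the schema exactly
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                },
                "required": ["location"],
                "additionalProperties": False,  # no extra arguments allowed
            },
        },
    }
]
```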

<Tip warning={true}>

Strict mode is not supported by all providers. You can check the provider's documentation to see if it supports strict mode.

</Tip>
> [!WARNING]
> Strict mode is not supported by all providers. You can check the provider's documentation to see if it supports strict mode.

### Streaming Responses

@@ -473,11 +455,8 @@ for chunk in stream:

Streaming allows you to process responses as they arrive, show real-time progress to users, and handle long-running function calls more efficiently.
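
A minimal streaming sketch (the model name is a placeholder; accumulating streamed tool-call arguments would need extra logic):

```python
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=messages,
    tools=tools,
    stream=True,  # receive the response incrementally
)

for chunk in stream:
    # Print text deltas as they arrive
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```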

<Tip warning={true}>

Streaming is not supported by all providers. You can check the provider's documentation to see if it supports streaming, or you can refer to this [dynamic model compatibility table](https://huggingface.co/inference-providers/models).

</Tip>
> [!WARNING]
> Streaming is not supported by all providers. You can check the provider's documentation to see if it supports streaming, or you can refer to this [dynamic model compatibility table](https://huggingface.co/inference-providers/models).

## Next Steps

14 changes: 4 additions & 10 deletions docs/inference-providers/guides/gpt-oss.md
@@ -16,11 +16,8 @@ Both models are supported on Inference Providers and can be accessed through eit
export HF_TOKEN="your_token_here"
```

<Tip>

💡 Pro tip: The free tier gives you monthly inference credits to start building and experimenting. Upgrade to [Hugging Face PRO](https://huggingface.co/pro) for even more flexibility: $2 in monthly credits plus pay‑as‑you‑go access to all providers!

</Tip>
> [!TIP]
> 💡 Pro tip: The free tier gives you monthly inference credits to start building and experimenting. Upgrade to [Hugging Face PRO](https://huggingface.co/pro) for even more flexibility: $2 in monthly credits plus pay‑as‑you‑go access to all providers!

2. Install the official OpenAI SDK.

@@ -293,11 +290,8 @@ Key Advantages:
- Stateful, Event-Driven Architecture: Instead of resending the entire text on every update, it streams semantic events that describe only the precise change (the "delta"). This eliminates the need for manual state tracking (see the sketch after this list).
- Simplified Development for Complex Logic: The event-driven model makes it easier to build reliable applications with multi-step logic. Your code simply listens for specific events, leading to cleaner and more robust integrations.
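
As an illustrative sketch of that event model, assuming the OpenAI SDK is pointed at the Hugging Face router as set up earlier in this guide (the prompt is a placeholder):

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

stream = client.responses.create(
    model="openai/gpt-oss-120b",
    input="Explain why event-driven streaming simplifies client code.",
    stream=True,
)

for event in stream:
    # Each event describes one precise change; text deltas carry only new tokens
    if event.type == "response.output_text.delta":
        print(event.delta, end="")
```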

<Tip>

The implementation is based on the open-source [huggingface/responses.js](https://github.com/huggingface/responses.js) project.

</Tip>
> [!TIP]
> The implementation is based on the open-source [huggingface/responses.js](https://github.com/huggingface/responses.js) project.

### Stream responses

49 changes: 17 additions & 32 deletions docs/inference-providers/guides/image-editor.md
@@ -9,11 +9,8 @@ Our app will:
3. **Transform images** using Qwen Image Edit or FLUX.1 Kontext
4. **Display results** in a Gradio interface

<Tip>

TL;DR - this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit).

</Tip>
> [!TIP]
> TL;DR - this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit).

## Step 1: Set Up Authentication

@@ -24,11 +21,8 @@ Before we start coding, authenticate with Hugging Face using your token:
export HF_TOKEN="your_token_here"
```

<Tip>

This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

</Tip>
> [!TIP]
> This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

When you set this environment variable, it handles authentication automatically for all your inference calls. You can generate a token from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).

@@ -50,11 +44,8 @@ uv add huggingface-hub>=0.34.4 gradio>=5.0.0 pillow>=11.3.0

The dependencies are now installed and ready to use! Also, `uv` will maintain the `pyproject.toml` file for you as you add dependencies.

<Tip>

We're using `uv` because it's a fast Python package manager that handles dependency resolution and virtual environment management automatically. It's much faster than pip and provides better dependency resolution. If you're not familiar with `uv`, check it out [here](https://docs.astral.sh/uv/).

</Tip>
> [!TIP]
> We're using `uv` because it's a fast Python package manager that handles dependency resolution and virtual environment management automatically. It's much faster than pip and provides better dependency resolution. If you're not familiar with `uv`, check it out [here](https://docs.astral.sh/uv/).

## Step 3: Build the Core Image Editing Function

@@ -115,18 +106,15 @@ def edit_image(input_image, prompt):
return input_image
```
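
A rough sketch of the core editing call with `InferenceClient` (a simplified stand-in for the full function above; error handling omitted):

```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai", api_key=os.environ["HF_TOKEN"])

def edit_image_sketch(input_image, prompt):
    # Send the uploaded image plus the edit instruction to the model
    edited = client.image_to_image(
        input_image,
        prompt=prompt,
        model="Qwen/Qwen-Image-Edit",
    )
    return edited  # a PIL image, ready for Gradio to display
```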

<Tip>

We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications.

However, you can experiment with different providers for various performance characteristics:

```python
client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"])
client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"]) # Automatic selection
```

</Tip>
> [!TIP]
> We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications.
>
> However, you can experiment with different providers for various performance characteristics:
>
> ```python
> client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"])
> client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"]) # Automatic selection
> ```

## Step 4: Create the Gradio Interface

@@ -322,11 +310,8 @@ uv export --format requirements-txt --output-file requirements.txt

This creates a `requirements.txt` file with all your project dependencies and their exact versions from the lockfile.

<Tip>

The `uv export` command ensures that your Space will use the exact same dependency versions that you tested locally, preventing deployment issues caused by version mismatches.

</Tip>
> [!TIP]
> The `uv export` command ensures that your Space will use the exact same dependency versions that you tested locally, preventing deployment issues caused by version mismatches.

Now you can deploy to Spaces:

14 changes: 4 additions & 10 deletions docs/inference-providers/guides/structured-output.md
@@ -4,11 +4,8 @@ In this guide, we'll show you how to use Inference Providers to generate structu

Structured outputs guarantee a model returns a response that matches your exact schema every time. This eliminates the need for complex parsing logic and makes your applications more robust.

<Tip>

This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

</Tip>
> [!TIP]
> This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).
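
As a quick preview of what this looks like in practice, a chat completion with a JSON schema response format might look roughly like this (the schema and model are illustrative; the rest of the guide builds a complete example):

```python
import os
from openai import OpenAI

# OpenAI-compatible client pointed at the Inference Providers router
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

# Describe the exact shape the response must have
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "paper_metadata",
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "authors": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["title", "authors"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Extract the title and authors from this abstract: ..."}],
    response_format=response_format,
)
print(completion.choices[0].message.content)  # valid JSON matching the schema
```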

## What Are Structured Outputs?

@@ -113,11 +110,8 @@ client = OpenAI(

</hfoptions>

<Tip>

Structured outputs are a good use case for selecting a specific provider and model because you want to avoid incompatibility issues between the model, provider and the schema.

</Tip>
> [!TIP]
> Structured outputs are a good use case for selecting a specific provider and model because you want to avoid incompatibility issues between the model, provider and the schema.

## Step 3: Generate structured output

7 changes: 2 additions & 5 deletions docs/inference-providers/guides/vscode.md
@@ -13,11 +13,8 @@ Use frontier open LLMs like Kimi K2, DeepSeek V3.1, GLM 4.5 and more in VS Code
5. Enter your Hugging Face Token. You can get one from your [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
6. Choose the models you want to add to the model picker. 🥳

<Tip>

VS Code 1.104.0+ is required to install the HF Copilot Chat extension. If "Hugging Face" doesn't appear in the Copilot provider list, update VS Code, then reload.

</Tip>
> [!TIP]
> VS Code 1.104.0+ is required to install the HF Copilot Chat extension. If "Hugging Face" doesn't appear in the Copilot provider list, update VS Code, then reload.

## ✨ Why use the Hugging Face provider in Copilot

21 changes: 6 additions & 15 deletions docs/inference-providers/pricing.md
@@ -12,11 +12,8 @@ Every Hugging Face user receives monthly credits to experiment with Inference Pr
| PRO Users | $2.00 | yes |
| Team or Enterprise Organizations | $2.00 per seat | yes |

<Tip>

Your monthly credits automatically apply when you route requests through Hugging Face. For Team or Enterprise organizations, credits are shared among all members.

</Tip>
> [!TIP]
> Your monthly credits automatically apply when you route requests through Hugging Face. For Team or Enterprise organizations, credits are shared among all members.

## How Billing Works: Choose Your Approach

@@ -44,11 +41,8 @@ See the [Organization Billing section](#organization-billing) below for more det
**PRO users and Enterprise Hub organizations** can continue using the API after exhausting their monthly credits. This ensures uninterrupted access to models for production workloads.


<Tip>

Hugging Face charges you the same rates as the provider, with no additional fees. We just pass through the provider costs directly.

</Tip>
> [!TIP]
> Hugging Face charges you the same rates as the provider, with no additional fees. We just pass through the provider costs directly.

You can track your spending anytime on your [billing page](https://huggingface.co/settings/billing).

@@ -67,11 +61,8 @@ Here is a table that sums up what we've seen so far:
| **Routed Requests** | Yes | Hugging Face | Yes | Only for PRO users and for integrated providers | SDKs, Playground, widgets, Data AI Studio |
| **Custom Provider Key** | Yes | Provider | No | Yes | SDKs, Playground, widgets, Data AI Studio |

<Tip>

You can set your custom provider key in the [settings page](https://huggingface.co/settings/inference-providers) on the Hub, or in the `InferenceClient` when using the JavaScript or Python SDKs. When making a routed request with a custom key, your code remains unchanged—you can still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.

</Tip>
> [!TIP]
> You can set your custom provider key in the [settings page](https://huggingface.co/settings/inference-providers) on the Hub, or in the `InferenceClient` when using the JavaScript or Python SDKs. When making a routed request with a custom key, your code remains unchanged—you can still pass your Hugging Face User Access Token. Hugging Face will automatically swap the authentication when routing the request.
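
For example, with the Python SDK (the provider and environment variable names are illustrative):

```python
import os
from huggingface_hub import InferenceClient

# Routed request: pass your HF token, billed through Hugging Face
client = InferenceClient(provider="together", api_key=os.environ["HF_TOKEN"])

# Custom provider key: pass the provider's own key, billed by the provider
client = InferenceClient(provider="together", api_key=os.environ["TOGETHER_API_KEY"])
```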

## HF-Inference cost

7 changes: 2 additions & 5 deletions docs/inference-providers/providers/cerebras.md
@@ -19,11 +19,8 @@ For more details, check out the `generate.ts` script: https://github.com/hugging

# Cerebras

<Tip>

All supported Cerebras models can be found [here](https://huggingface.co/models?inference_provider=cerebras&sort=trending)

</Tip>
> [!TIP]
> All supported Cerebras models can be found [here](https://huggingface.co/models?inference_provider=cerebras&sort=trending)

<div class="flex justify-center">
<a href="https://www.cerebras.ai/" target="_blank">