Commit 397ad04

Merge branch 'update-inference-providers-docs-automated-pr' of github.com:huggingface/hub-docs into update-inference-providers-docs-automated-pr
2 parents: c1085cf + d7b8cd4

33 files changed: +445 -29 lines

docs/hub/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -398,6 +398,8 @@
      title: "Protect AI"
    - local: security-jfrog
      title: "JFrog"
+   - local: agents
+     title: Agents on Hub
    - local: moderation
      title: Moderation
    - local: paper-pages

docs/hub/agents.md

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
# Agents on the Hub

This page compiles all the libraries and tools Hugging Face offers for agentic workflows: huggingface.js mcp-client, Gradio MCP Server and smolagents.

## smolagents

[smolagents](https://github.com/huggingface/smolagents) is a lightweight library that covers all agentic use cases, from code-writing agents to computer use, in a few lines of code. It is model-agnostic, supporting local models served with Hugging Face Transformers as well as models available through [Inference Providers](../inference-providers/index.md) and proprietary model providers.

It offers a unique kind of agent: `CodeAgent`, an agent that writes its actions in Python code.
It also supports `ToolCallingAgent`, the standard kind of agent that writes its actions as JSON blobs, as most other agentic frameworks do.
To learn more about writing actions in code vs. JSON, check out our [new short course on DeepLearning.AI](https://www.deeplearning.ai/short-courses/building-code-agents-with-hugging-face-smolagents/).
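
For example, a minimal `CodeAgent` defined directly in Python might look like this (a sketch; the model id and the `add_base_tools` flag are illustrative choices, not requirements):

```python
from smolagents import CodeAgent, InferenceClientModel

# InferenceClientModel serves the model through Inference Providers; any smolagents-compatible model works.
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

# add_base_tools=True equips the agent with the default toolbox (web search, etc.).
agent = CodeAgent(tools=[], model=model, add_base_tools=True)

agent.run("Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7.")
```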

If you want to avoid defining agents yourself, the easiest way to start an agent is through the CLI, using the `smolagent` command.

```bash
smolagent "Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7." \
--model-type "InferenceClientModel" \
--model-id "Qwen/Qwen2.5-Coder-32B-Instruct" \
--imports "pandas numpy" \
--tools "web_search"
```

Agents can be pushed to the Hugging Face Hub as Spaces. Check out all the cool agents people have built [here](https://huggingface.co/spaces?filter=smolagents&sort=likes).
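
For instance, sharing an agent you have defined is a short call (a sketch; the repo id is a placeholder, and `agent.push_to_hub` assumes the agent object from the sketch above and a recent smolagents version):

```python
# Push the agent's configuration and tools to a repo on the Hub (placeholder repo id).
agent.push_to_hub("username/my-agent")
```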

smolagents also supports MCP servers as tools, as follows:

```python
# pip install --upgrade smolagents mcp
from smolagents import MCPClient, CodeAgent, InferenceClientModel
from mcp import StdioServerParameters
import os

# Any smolagents model works here; InferenceClientModel uses Inference Providers by default.
model = InferenceClientModel()

server_parameters = StdioServerParameters(
    command="uvx",  # Using uvx ensures dependencies are available
    args=["--quiet", "pubmedmcp@0.1.3"],
    env={"UV_PYTHON": "3.12", **os.environ},
)

with MCPClient(server_parameters) as tools:
    agent = CodeAgent(tools=tools, model=model, add_base_tools=True)
    agent.run("Please find the latest research on COVID-19 treatment.")
```

Learn more [in the documentation](https://huggingface.co/docs/smolagents/tutorials/tools#use-mcp-tools-with-mcpclient-directly).

## huggingface.js mcp-client

Huggingface.js offers an MCP client that can be served with [Inference Providers](https://huggingface.co/docs/inference-providers/en/index) or local LLMs. Getting started with it is as simple as running `pnpm agent`. You can plug and play different models and providers by setting the `PROVIDER` and `MODEL_ID` environment variables.

```bash
export HF_TOKEN="hf_..."
export MODEL_ID="Qwen/Qwen2.5-72B-Instruct"
export PROVIDER="nebius"
npx @huggingface/mcp-client
```

Alternatively, you can use any local LLM (for example, one served via LM Studio):

```bash
ENDPOINT_URL=http://localhost:1234/v1 \
MODEL_ID=lmstudio-community/Qwen3-14B-GGUF \
npx @huggingface/mcp-client
```

You can get more information about the mcp-client [here](https://huggingface.co/docs/huggingface.js/en/mcp-client/README).

## Gradio MCP Server / Tools

You can build an MCP server in just a few lines of Python with Gradio. If you have an existing Gradio app or Space you'd like to use as an MCP server / tool, it's just a single-line change.

To make a Gradio application an MCP server, simply pass in `mcp_server=True` when launching your demo, as follows.

```python
# pip install gradio

import gradio as gr

def generate_image(prompt: str):
    """
    Generate an image based on a text prompt

    Args:
        prompt: a text string describing the image to generate
    """
    pass

demo = gr.Interface(
    fn=generate_image,
    inputs="text",
    outputs="image",
    title="Image Generator"
)

demo.launch(mcp_server=True)
```

The MCP server will be available at `http://your-space-id.hf.space/gradio_api/mcp/sse`, where your application is served. It will have a tool corresponding to each function in your Gradio app, with the tool description automatically generated from the docstrings of your functions.

Lastly, add this to the settings of the MCP client of your choice (e.g. Cursor).

```json
{
  "mcpServers": {
    "gradio": {
      "url": "http://your-server:port/gradio_api/mcp/sse"
    }
  }
}
```

This is very powerful because it lets the LLM use any Gradio application as a tool. You can find thousands of them on [Spaces](https://huggingface.co/spaces). Learn more [here](https://www.gradio.app/guides/building-mcp-server-with-gradio).

docs/hub/datasets-adding.md

Lines changed: 1 addition & 0 deletions
@@ -85,6 +85,7 @@ The Hub natively supports multiple file formats:
- Text (.txt)
- Images (.png, .jpg, etc.)
- Audio (.wav, .mp3, etc.)
+ - PDF (.pdf)
- [WebDataset](https://github.com/webdataset/webdataset) (.tar)

It supports files compressed using ZIP (.zip), GZIP (.gz), ZSTD (.zst), BZ2 (.bz2), LZ4 (.lz4) and LZMA (.xz).

docs/hub/datasets-downloading.md

Lines changed: 8 additions & 1 deletion
@@ -16,8 +16,15 @@ If a dataset on the Hub is tied to a [supported library](./datasets-libraries),

## Using the Hugging Face Client Library

- You can use the [`huggingface_hub`](/docs/huggingface_hub) library to create, delete, update and retrieve information from repos. You can also download files from repos or integrate them into your library! For example, you can quickly load a CSV dataset with a few lines using Pandas.
+ You can use the [`huggingface_hub`](/docs/huggingface_hub) library to create, delete, update and retrieve information from repos. For example, to download the `HuggingFaceH4/ultrachat_200k` dataset from the command line, run
+
+ ```bash
+ huggingface-cli download HuggingFaceH4/ultrachat_200k --repo-type dataset
+ ```
+
+ See the [huggingface-cli download documentation](https://huggingface.co/docs/huggingface_hub/en/guides/cli#download-a-dataset-or-a-space) for more information.
+
+ You can also integrate this into your own library! For example, you can quickly load a CSV dataset with a few lines using Pandas.

```py
from huggingface_hub import hf_hub_download
import pandas as pd
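# Illustrative continuation, not shown in this truncated hunk; repo_id and filename are placeholders:
df = pd.read_csv(
    hf_hub_download(repo_id="username/my-dataset", filename="data.csv", repo_type="dataset")
)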

docs/hub/model-release-checklist.md

Lines changed: 6 additions & 1 deletion
@@ -63,13 +63,16 @@ We wrote an extensive guide on uploading best practices [here](https://huggingfa

Bonus: a recognised library also allows you to track downloads of your model over time.

- 2. **Pipeline Tag Selection**: Choose the correct [pipeline tag](https://huggingface.co/docs/hub/model-cards#specifying-a-task--pipelinetag-) that accurately reflects your model's primary task. This tag determines how your model appears in search results and which widgets are displayed on your model page.
+ 2. **Correct Metadata**:
+ - **Pipeline Tag:** Choose the correct [pipeline tag](https://huggingface.co/docs/hub/model-cards#specifying-a-task--pipelinetag-) that accurately reflects your model's primary task. This tag determines how your model appears in search results and which widgets are displayed on your model page.

Examples of common pipeline tags:
- `text-generation` - For language models that generate text
- `text-to-image` - For text-to-image generation models
- `image-text-to-text` - For vision-language models (VLMs) that generate text
- `text-to-speech` - For models that generate audio from text
+
+ - **License:** License information is crucial for users to understand how they can use the model.

3. **Research Papers**: If your model has associated research papers, you can cite them in your model card and they will be [linked automatically](https://huggingface.co/docs/hub/model-cards#linking-a-paper). This provides academic context, allows users to dive deeper into the theoretical foundations of your work, and increases citations.

@@ -88,6 +91,8 @@ Bonus: a recognised library also allows you to track downloads of your model ove

Try this model directly in your browser: [Space Demo](https://huggingface.co/spaces/username/model-demo)
```
+
+ When you create a demo, please download the model from its repository on the Hub (instead of using external sources like Google Drive); this cross-links the model artefacts and the demo, and creates more paths to visibility.

6. **Quantized Versions**: Consider uploading quantized versions of your model (e.g., in GGUF or DDUF formats) to improve accessibility for users with limited computational resources. Link these versions using the [`base_model` metadata field](https://huggingface.co/docs/hub/model-cards#specifying-a-base-model) on the quantized model cards. You can also clearly document performance differences between the original and quantized versions.

docs/hub/spaces-zerogpu.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

<img src="https://cdn-uploads.huggingface.co/production/uploads/5f17f0a0925b9863e28ad517/naVZI-v41zNxmGlhEhGDJ.gif" style="max-width: 440px; width: 100%" alt="ZeroGPU schema" />

- ZeroGPU is a shared infrastructure that optimizes GPU usage for AI models and demos on Hugging Face Spaces. It dynamically allocates and releases NVIDIA A100 GPUs as needed, offering:
+ ZeroGPU is a shared infrastructure that optimizes GPU usage for AI models and demos on Hugging Face Spaces. It dynamically allocates and releases NVIDIA H200 GPUs as needed, offering:

1. **Free GPU Access**: Enables cost-effective GPU usage for Spaces.
2. **Multi-GPU Support**: Allows Spaces to leverage multiple GPUs concurrently on a single application.
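
For context, a rough sketch (not part of this diff) of how a Space typically requests ZeroGPU hardware with the `spaces` package; the model id is only an example:

```python
import gradio as gr
import spaces  # available in ZeroGPU Spaces
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
pipe.to("cuda")

@spaces.GPU  # a GPU is attached only while this function runs
def generate(prompt: str):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs="text", outputs="image").launch()
```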

docs/inference-providers/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,8 @@
    title: Nebius
  - local: providers/novita
    title: Novita
+ - local: providers/nscale
+   title: Nscale
  - local: providers/replicate
    title: Replicate
  - local: providers/sambanova

docs/inference-providers/index.md

Lines changed: 5 additions & 0 deletions
@@ -23,6 +23,7 @@ Here is the complete list of partners integrated with Inference Providers, and t
| [Hyperbolic](./providers/hyperbolic) ||| | | |
| [Nebius](./providers/nebius) ||| || |
| [Novita](./providers/novita) ||| | ||
+ | [Nscale](./providers/nscale) ||| || |
| [Replicate](./providers/replicate) | | | |||
| [SambaNova](./providers/sambanova) || || | |
| [Together](./providers/together) ||| || |
@@ -59,6 +60,10 @@ You can use Inference Providers with your preferred tools, such as Python, JavaS

In this section, we will demonstrate a simple example using [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324), a conversational Large Language Model. For the example, we will use [Novita AI](https://novita.ai/) as Inference Provider.

+ > [!TIP]
+ > You can also automatically select a provider for a model using `provider="auto"` — it will pick the first available provider for your model based on your preferred order set in https://hf.co/settings/inference-providers.
+ > This is the default if you don't specify a provider in our Python or JavaScript SDK.

### Authentication

Inference Providers requires passing a user token in the request headers. You can generate a token by signing up on the Hugging Face website and going to the [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). We recommend creating a `fine-grained` token with the scope to `Make calls to Inference Providers`.
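
As a rough illustration of that tip (a sketch, assuming the `huggingface_hub` Python SDK and an `HF_TOKEN` available in your environment):

```python
from huggingface_hub import InferenceClient

# provider="auto" picks the first available provider according to your order at
# https://hf.co/settings/inference-providers; it is also the default when provider is omitted.
client = InferenceClient(provider="auto")

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```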

docs/inference-providers/providers/cohere.md

Lines changed: 1 addition & 1 deletion
@@ -56,6 +56,6 @@ Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
    pipeline=image-text-to-text
-   providersMapping={ {"cohere":{"modelId":"CohereLabs/aya-vision-8b","providerModelId":"c4ai-aya-vision-8b"} } }
+   providersMapping={ {"cohere":{"modelId":"CohereLabs/aya-vision-32b","providerModelId":"c4ai-aya-vision-32b"} } }
    conversational />

docs/inference-providers/providers/fal-ai.md

Lines changed: 1 addition & 1 deletion
@@ -64,6 +64,6 @@ Find out more about Text To Video [here](../tasks/text_to_video).

<InferenceSnippet
    pipeline=text-to-video
-   providersMapping={ {"fal-ai":{"modelId":"Wan-AI/Wan2.1-T2V-14B","providerModelId":"fal-ai/wan-t2v"} } }
+   providersMapping={ {"fal-ai":{"modelId":"Lightricks/LTX-Video","providerModelId":"fal-ai/ltx-video"} } }
    />
