
Commit bf55fb1

Hub API page
1 parent fa8efb6 commit bf55fb1

1 file changed: 102 additions, 6 deletions

docs/api-inference/hub-api.md

@@ -1,9 +1,105 @@
# Hub API

-**TODO**
-- search for models (API): https://huggingface.co/api/models?inference_provider=fireworks-ai
-- get warm models (any provider): https://huggingface.co/api/models?inference=warm
-- get warm status for a model: https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inference
-- get providers for a model: https://huggingface.co/api/models/Qwen/QwQ-32B?expand[]=inferenceProviderMapping
-Document raw HTTP requests + JS/Python equivalent in SDKs

The Hub provides a few APIs to interact with Inference Providers. Here is a list of them.

## List models

To list models powered by a provider, use the `inference_provider` query parameter:

```sh
# List all models served by Fireworks AI
~ curl -s "https://huggingface.co/api/models?inference_provider=fireworks-ai" | jq ".[].id"
"deepseek-ai/DeepSeek-V3-0324"
"deepseek-ai/DeepSeek-R1"
"Qwen/QwQ-32B"
"deepseek-ai/DeepSeek-V3"
...
```
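
The same request can be made from Python over raw HTTP. Below is a minimal sketch, assuming the `requests` package is installed; any HTTP client would work:

```python
# Minimal sketch: list models served by a given provider via the Hub API.
import requests

response = requests.get(
    "https://huggingface.co/api/models",
    params={"inference_provider": "fireworks-ai"},  # same filter as the curl call above
)
response.raise_for_status()

# The endpoint returns a JSON array of model objects; print their ids.
for model in response.json():
    print(model["id"])
```
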
This can be combined with other filters, e.g. to select only text-to-image models:

```sh
# List text-to-image models served by Fal AI
~ curl -s "https://huggingface.co/api/models?inference_provider=fal-ai&pipeline_tag=text-to-image" | jq ".[].id"
"black-forest-labs/FLUX.1-dev"
"stabilityai/stable-diffusion-3.5-large"
"black-forest-labs/FLUX.1-schnell"
"stabilityai/stable-diffusion-3.5-large-turbo"
...
```

Pass a comma-separated list to select models from multiple providers:

```sh
# List image-text-to-text models served by Novita or Sambanova
~ curl -s "https://huggingface.co/api/models?inference_provider=sambanova,novita&pipeline_tag=image-text-to-text" | jq ".[].id"
"meta-llama/Llama-3.2-11B-Vision-Instruct"
"meta-llama/Llama-3.2-90B-Vision-Instruct"
"Qwen/Qwen2-VL-72B-Instruct"
```

Finally, use `inference_provider=all` to select models served by at least one inference provider:

```sh
# List text-to-video models served by any provider
~ curl -s "https://huggingface.co/api/models?inference_provider=all&pipeline_tag=text-to-video" | jq ".[].id"
"Wan-AI/Wan2.1-T2V-14B"
"Lightricks/LTX-Video"
"tencent/HunyuanVideo"
"Wan-AI/Wan2.1-T2V-1.3B"
"THUDM/CogVideoX-5b"
"genmo/mochi-1-preview"
"BagOu22/Lora_HKLPAZ"
```
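
All of the variants above boil down to the same two query parameters, so they can be wrapped in a tiny helper. Here is a hedged Python sketch over raw HTTP; the `list_provider_models` function is made up for illustration and is not part of any official client:

```python
# Illustrative helper (not an official API): list model ids for one or more
# inference providers ("provider", "providerA,providerB", or "all"),
# optionally restricted to a pipeline tag.
import requests

def list_provider_models(providers: str, pipeline_tag: str | None = None) -> list[str]:
    params = {"inference_provider": providers}
    if pipeline_tag is not None:
        params["pipeline_tag"] = pipeline_tag
    response = requests.get("https://huggingface.co/api/models", params=params)
    response.raise_for_status()
    return [model["id"] for model in response.json()]

# Same queries as the curl examples above:
print(list_provider_models("fal-ai", "text-to-image"))
print(list_provider_models("sambanova,novita", "image-text-to-text"))
print(list_provider_models("all", "text-to-video"))
```
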
## Get model status

If you are interested in a specific model and want to check whether at least one provider serves it, you can request the `inference` attribute from the model info endpoint:

```sh
# Get google/gemma-3-27b-it inference status (warm)
~ curl -s "https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inference"
{
  "_id": "67c35b9bb236f0d365bf29d3",
  "id": "google/gemma-3-27b-it",
  "inference": "warm"
}
```

Inference status is either "warm" or undefined:

```sh
# Get inference status (not warm)
~ curl -s "https://huggingface.co/api/models/manycore-research/SpatialLM-Llama-1B?expand[]=inference"
{
  "_id": "67d3b141d8b6e20c6d009c8b",
  "id": "manycore-research/SpatialLM-Llama-1B"
}
```
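
The same check can be scripted in Python. A minimal sketch over raw HTTP, assuming `requests` is installed; the `is_warm` helper is only illustrative:

```python
# Illustrative helper: return True if at least one provider serves the model.
import requests

def is_warm(model_id: str) -> bool:
    url = f"https://huggingface.co/api/models/{model_id}?expand[]=inference"
    response = requests.get(url)
    response.raise_for_status()
    # "inference" is "warm" when the model is served, and absent otherwise.
    return response.json().get("inference") == "warm"

print(is_warm("google/gemma-3-27b-it"))                 # True at the time of writing
print(is_warm("manycore-research/SpatialLM-Llama-1B"))  # False at the time of writing
```
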
## Get model providers

If you are interested in a specific model and want to check the list of providers serving it, you can request the `inferenceProviderMapping` attribute from the model info endpoint:

```sh
# List google/gemma-3-27b-it providers
~ curl -s "https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inferenceProviderMapping"
{
  "_id": "67c35b9bb236f0d365bf29d3",
  "id": "google/gemma-3-27b-it",
  "inferenceProviderMapping": {
    "hf-inference": {
      "status": "live",
      "providerId": "google/gemma-3-27b-it",
      "task": "conversational"
    },
    "nebius": {
      "status": "live",
      "providerId": "google/gemma-3-27b-it-fast",
      "task": "conversational"
    }
  }
}
```

For each provider, you get the status (`staging` or `live`), the related task (here, `conversational`) and the `providerId`. In practice, this information is mostly relevant for the JS and Python clients. The main thing to know is that the listed providers are the ones currently serving the model.
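
For completeness, here is a rough Python equivalent of the request above, again over raw HTTP rather than through an official client; it simply fetches the mapping and iterates over the listed providers:

```python
# Sketch: fetch the provider mapping for a model and print each provider's entry.
import requests

model_id = "google/gemma-3-27b-it"
url = f"https://huggingface.co/api/models/{model_id}?expand[]=inferenceProviderMapping"
response = requests.get(url)
response.raise_for_status()

mapping = response.json().get("inferenceProviderMapping", {})
for provider, details in mapping.items():
    # Each entry carries the status ("staging" or "live"), the provider-side
    # model id and the task, as described above.
    print(provider, details["status"], details["providerId"], details["task"])
```
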
