|
# Hub API

The Hub provides a few APIs to interact with Inference Providers. Here is a list of them.

## List models

To list models powered by a provider, use the `inference_provider` query parameter:

```sh
# List all models served by Fireworks AI
~ curl -s "https://huggingface.co/api/models?inference_provider=fireworks-ai" | jq ".[].id"
"deepseek-ai/DeepSeek-V3-0324"
"deepseek-ai/DeepSeek-R1"
"Qwen/QwQ-32B"
"deepseek-ai/DeepSeek-V3"
...
```

It can be combined with other filters, e.g. to select only text-to-image models:

```sh
# List text-to-image models served by Fal AI
~ curl -s "https://huggingface.co/api/models?inference_provider=fal-ai&pipeline_tag=text-to-image" | jq ".[].id"
"black-forest-labs/FLUX.1-dev"
"stabilityai/stable-diffusion-3.5-large"
"black-forest-labs/FLUX.1-schnell"
"stabilityai/stable-diffusion-3.5-large-turbo"
...
```

Pass a comma-separated list to select models from multiple providers:

```sh
# List image-text-to-text models served by Novita or Sambanova
~ curl -s "https://huggingface.co/api/models?inference_provider=sambanova,novita&pipeline_tag=image-text-to-text" | jq ".[].id"
"meta-llama/Llama-3.2-11B-Vision-Instruct"
"meta-llama/Llama-3.2-90B-Vision-Instruct"
"Qwen/Qwen2-VL-72B-Instruct"
```

Finally, you can select all models served by at least one inference provider:

```sh
# List text-to-video models served by any provider
~ curl -s "https://huggingface.co/api/models?inference_provider=all&pipeline_tag=text-to-video" | jq ".[].id"
"Wan-AI/Wan2.1-T2V-14B"
"Lightricks/LTX-Video"
"tencent/HunyuanVideo"
"Wan-AI/Wan2.1-T2V-1.3B"
"THUDM/CogVideoX-5b"
"genmo/mochi-1-preview"
"BagOu22/Lora_HKLPAZ"
```
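
The same queries can be made from any HTTP client. As a minimal Python sketch using the `requests` library (the `list_models_by_provider` helper is our own naming, not an official SDK function):

```python
import requests

def list_models_by_provider(inference_provider: str, pipeline_tag: str | None = None) -> list[str]:
    """List model IDs served by one or more providers.

    `inference_provider` accepts a single provider ("fireworks-ai"),
    a comma-separated list ("sambanova,novita"), or "all".
    """
    params = {"inference_provider": inference_provider}
    if pipeline_tag is not None:
        params["pipeline_tag"] = pipeline_tag
    response = requests.get("https://huggingface.co/api/models", params=params)
    response.raise_for_status()
    return [model["id"] for model in response.json()]

# Same queries as the curl examples above
print(list_models_by_provider("fireworks-ai"))
print(list_models_by_provider("fal-ai", pipeline_tag="text-to-image"))
print(list_models_by_provider("sambanova,novita", pipeline_tag="image-text-to-text"))
print(list_models_by_provider("all", pipeline_tag="text-to-video"))
```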

## Get model status

If you are interested in a specific model and want to check whether at least one provider serves it, you can request the `inference` attribute in the model info endpoint:

```sh
# Get google/gemma-3-27b-it inference status (warm)
~ curl -s "https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inference"
{
  "_id": "67c35b9bb236f0d365bf29d3",
  "id": "google/gemma-3-27b-it",
  "inference": "warm"
}
```

The inference status is either `"warm"` or undefined (the attribute is simply absent from the response):

```sh
# Get inference status (not warm)
~ curl -s "https://huggingface.co/api/models/manycore-research/SpatialLM-Llama-1B?expand[]=inference"
{
  "_id": "67d3b141d8b6e20c6d009c8b",
  "id": "manycore-research/SpatialLM-Llama-1B"
}
```
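
For scripted checks, here is a minimal Python sketch of the same call with `requests` (the `is_warm` helper name is our own):

```python
import requests

def is_warm(model_id: str) -> bool:
    """Return True if at least one provider currently serves the model."""
    response = requests.get(
        f"https://huggingface.co/api/models/{model_id}",
        # Mirrors the "?expand[]=inference" query string from the curl example
        params={"expand[]": "inference"},
    )
    response.raise_for_status()
    # The "inference" attribute is absent when no provider serves the model
    return response.json().get("inference") == "warm"

print(is_warm("google/gemma-3-27b-it"))                 # True at the time of writing
print(is_warm("manycore-research/SpatialLM-Llama-1B"))  # False at the time of writing
```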

## Get model providers

If you are interested in a specific model and want to see the list of providers serving it, you can request the `inferenceProviderMapping` attribute in the model info endpoint:

```sh
# List google/gemma-3-27b-it providers
~ curl -s "https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inferenceProviderMapping"
{
  "_id": "67c35b9bb236f0d365bf29d3",
  "id": "google/gemma-3-27b-it",
  "inferenceProviderMapping": {
    "hf-inference": {
      "status": "live",
      "providerId": "google/gemma-3-27b-it",
      "task": "conversational"
    },
    "nebius": {
      "status": "live",
      "providerId": "google/gemma-3-27b-it-fast",
      "task": "conversational"
    }
  }
}
```

For each provider, you get the status (`staging` or `live`), the related task (here, `conversational`), and the provider-specific model ID (`providerId`). In practice, this information is mostly relevant for the JS and Python clients; the main takeaway is that the listed providers are the ones serving the model.
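
As a final sketch, the mapping can be fetched and iterated in Python with `requests` (again, the helper name is our own):

```python
import requests

def get_provider_mapping(model_id: str) -> dict:
    """Return the inferenceProviderMapping for a model (empty dict if none)."""
    response = requests.get(
        f"https://huggingface.co/api/models/{model_id}",
        # Mirrors "?expand[]=inferenceProviderMapping" from the curl example
        params={"expand[]": "inferenceProviderMapping"},
    )
    response.raise_for_status()
    return response.json().get("inferenceProviderMapping", {})

for provider, mapping in get_provider_mapping("google/gemma-3-27b-it").items():
    # e.g. "nebius: status=live, task=conversational, providerId=google/gemma-3-27b-it-fast"
    print(f"{provider}: status={mapping['status']}, "
          f"task={mapping['task']}, providerId={mapping['providerId']}")
```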