docs/api-inference/hub-api.md (68 additions, 0 deletions)
Finally, you can select all models served by at least one inference provider:
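
For example, a minimal sketch using `huggingface_hub` (assuming your installed version supports the `inference_provider` filter on `list_models`):

```py
from huggingface_hub import list_models

# List a few models that are served by at least one inference provider.
# The `inference_provider="all"` filter is assumed here; check your
# huggingface_hub version if the argument is not recognized.
for model in list_models(inference_provider="all", limit=10):
    print(model.id)
```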
If you are interested in a specific model and want to check whether at least one provider serves it, you can request the `inference` attribute in the model info endpoint:
<inferencesnippet>
<curl>
```sh
# Get google/gemma-3-27b-it inference status (warm)
curl -s "https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inference"
```

Inference status is either "warm" or undefined:

```json
{
    "id": "google/gemma-3-27b-it",
    "inference": "warm"
}
```
</curl>
<python>
In `huggingface_hub`, use `model_info` with the `expand` parameter:
```py
>>> from huggingface_hub import model_info

>>> info = model_info("manycore-research/SpatialLM-Llama-1B", expand="inference")
>>> info.inference
None
```
</python>
</inferencesnippet>
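
As a small follow-up, here is a sketch of a helper built on the snippet above (the function name is purely illustrative; it assumes `info.inference` is either `"warm"` or `None`, as described earlier):

```py
from huggingface_hub import model_info

def is_served_by_a_provider(model_id: str) -> bool:
    # "warm" means at least one provider serves the model;
    # otherwise the attribute is undefined (None).
    info = model_info(model_id, expand="inference")
    return info.inference == "warm"

print(is_served_by_a_provider("google/gemma-3-27b-it"))                 # expected: True
print(is_served_by_a_provider("manycore-research/SpatialLM-Llama-1B"))  # expected: False
```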
## Get model providers
If you are interested in a specific model and want to check the list of providers serving it, you can request the `inferenceProviderMapping` attribute in the model info endpoint:
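
For instance, a sketch with `huggingface_hub`, mirroring the pattern above (the exact shape of the returned mapping may vary between library versions):

```py
from huggingface_hub import model_info

# Expand the model info with the provider mapping; each entry carries the
# provider status, the providerId, and the task.
info = model_info("google/gemma-3-27b-it", expand="inferenceProviderMapping")
print(info.inference_provider_mapping)
```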
For each provider, you get the status (`staging` or `live`), the related task (here, `conversational`), and the `providerId`. In practice, this information is mostly relevant for the JS and Python clients; the key point is that the listed providers are the ones currently serving the model.