Commit a616323

committed
inference providers documentation updates
1 parent e169c04 commit a616323

1 file changed: +56 -8 lines changed

docs/inference-providers/register-as-a-provider.md

Lines changed: 56 additions & 8 deletions
@@ -154,14 +154,40 @@ Create a new mapping item, with the following body (JSON-encoded):
154154
- `hfModel` is the model id on the Hub's side.
155155
- `providerModel` is the model id on your side (can be the same or different).
156156

157-
In the future, we will add support for a new parameter (ping us if it's important to you now):
157+
This route returns a mapping ID that you can use later to update the mapping's status or to delete it.
158+
159+
### Using a tag-filter to map several HF models to a single inference endpoint
160+
161+
We also support mapping HF models based on their `tags`.
162+
163+
This is useful, for example, to automatically map LoRA adapters to a single Inference Endpoint on your side.
164+
165+
The API is as follows:
166+
167+
```http
168+
POST /api/partners/{provider}/models
169+
```
170+
Create a new mapping item, with the following body (JSON-encoded):
171+
158172
```json
159173
{
160-
"hfFilter": ["string"]
161-
// ^Power user move: register a "tag" slice of HF in one go.
162-
// Example: tag == "base_model:adapter:black-forest-labs/FLUX.1-dev" for all Flux-dev LoRAs
174+
"type": "tag-filter", // required
175+
"task": "WidgetType", // required
176+
"tags": ["string"], // required: any HF model with all of those tags will be mapped to providerModel
177+
"providerModel": "string", // required: the partner's "model id" i.e. id on your side
178+
"adapterType": "lora", // required: only "lora" is supported at the moment
179+
"status": "live" | "staging" // Optional: defaults to "staging". "staging" models are only available to members of the partner's org, then you switch them to "live" when they're ready to go live
163180
}
164181
```
182+
183+
- `task`, also known as `pipeline_tag` in the HF ecosystem, is the type of model / type of API
184+
(examples: "text-to-image", "text-generation", but you should use "conversational" for chat models)
185+
- `tags` is the set of model tags to match. For example, to match all LoRAs of Flux, you can use: `["lora", "base_model:adapter:black-forest-labs/FLUX.1-dev"]`
186+
- `providerModel` is the model id on your side (can be the same or different).
187+
- `adapterType` is a literal value designed to help client libraries interpret how to request your API. The only supported value at the moment is `"lora"`.
188+
189+
This route returns a mapping ID that you can use later to update the mapping's status or to delete it.
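
As an illustration of the tag-filter body described above, here is a minimal Python sketch that assembles the JSON payload and shows the "all tags must match" rule as a subset check. The helper names (`build_tag_filter_body`, `model_matches`) and the `flux-dev-lora` provider model id are hypothetical, and the matching function only mirrors the documented rule; it is not the Hub's actual implementation.

```python
import json

ALLOWED_STATUSES = {"live", "staging"}


def build_tag_filter_body(task, tags, provider_model, status="staging"):
    """Assemble a tag-filter mapping body with the fields listed above."""
    if not tags:
        raise ValueError("tags must contain at least one tag")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    return {
        "type": "tag-filter",
        "task": task,
        "tags": list(tags),
        "providerModel": provider_model,
        "adapterType": "lora",  # only "lora" is documented as supported
        "status": status,
    }


def model_matches(model_tags, filter_tags):
    """Return True when the model carries every tag in the filter."""
    return set(filter_tags).issubset(model_tags)


body = build_tag_filter_body(
    task="text-to-image",
    tags=["lora", "base_model:adapter:black-forest-labs/FLUX.1-dev"],
    provider_model="flux-dev-lora",  # hypothetical id on the provider's side
)
print(json.dumps(body, indent=2))
```

A model tagged only `lora` would not match this filter, because the `base_model:adapter:...` tag is also required.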
190+
165191
#### Authentication
166192

167193
You need to be in the _provider_ Hub organization (e.g. https://huggingface.co/togethercomputer
@@ -178,26 +204,31 @@ huggingface.js/inference call of the corresponding task i.e. the API specs are v
178204
### Delete a mapping item
179205

180206
```http
181-
DELETE /api/partners/{provider}/models?hfModel=namespace/model-name
207+
DELETE /api/partners/{provider}/models/{mapping ID}
182208
```
183209

210+
Where `mapping ID` is the mapping's id obtained upon creation.
211+
You can also retrieve it from the [list API endpoint](#list-the-whole-mapping).
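
To make the lookup concrete, the following Python sketch finds a mapping ID in a list response and builds the corresponding DELETE path. It assumes the list response has the shape shown in the example in this document (task, then model key, then an entry with `_id`); the helper names, the `my-provider` slug, and the `abc123` ID are hypothetical.

```python
from urllib.parse import quote


def find_mapping_id(mapping, hf_model):
    """Search every task section of a list response for the given model key."""
    for entries in mapping.values():
        entry = entries.get(hf_model)
        if entry is not None:
            return entry["_id"]
    return None


def delete_path(provider, mapping_id):
    """Build the path of the DELETE route for one mapping item."""
    return (
        f"/api/partners/{quote(provider, safe='')}"
        f"/models/{quote(mapping_id, safe='')}"
    )


# Hypothetical list response, shaped like the documented example.
example = {
    "conversational": {
        "deepseek-ai/DeepSeek-R1": {
            "_id": "abc123",
            "providerId": "deepseek-ai/DeepSeek-R1",
            "status": "live",
        }
    }
}

mapping_id = find_mapping_id(example, "deepseek-ai/DeepSeek-R1")
print(delete_path("my-provider", mapping_id))
```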
212+
184213
### Update a mapping item's status
185214

186215
Call this HTTP PUT endpoint:
187216

188217
```http
189-
PUT /api/partners/{provider}/models/status
218+
PUT /api/partners/{provider}/models/{mapping ID}/status
190219
```
191220

192221
With the following body (JSON-encoded):
193222

194223
```json
195224
{
196-
"hfModel": "namespace/model-name", // The name of the model on HF
197225
"status": "live" | "staging" // The new status, one of "staging" or "live"
198226
}
199227
```
200228

229+
Where `mapping ID` is the mapping's id obtained upon creation.
230+
You can also retrieve it from the [list API endpoint](#list-the-whole-mapping).
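
A status update could be prepared along these lines with Python's standard library. This is a sketch under assumptions: the base URL `https://huggingface.co`, the bearer-token `Authorization` header, the `my-provider` slug, and the `abc123` mapping ID are all illustrative and not confirmed by this section. The request is built but not sent.

```python
import json
import urllib.request


def build_status_request(base_url, provider, mapping_id, status, token):
    """Prepare (without sending) the PUT request that changes a mapping's status."""
    if status not in ("live", "staging"):
        raise ValueError('status must be "live" or "staging"')
    url = f"{base_url}/api/partners/{provider}/models/{mapping_id}/status"
    data = json.dumps({"status": status}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {token}",
        },
    )


req = build_status_request(
    "https://huggingface.co",  # assumed base URL
    "my-provider",             # hypothetical provider slug
    "abc123",                  # hypothetical mapping ID
    "live",
    "hf_xxx",                  # placeholder token
)
# Sending would be: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```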
231+
201232
### List the whole mapping
202233

203234
```http
@@ -217,26 +248,41 @@ Here is an example of response:
217248
{
218249
"text-to-image": {
219250
"black-forest-labs/FLUX.1-Canny-dev": {
251+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
220252
"providerId": "black-forest-labs/FLUX.1-canny",
221253
"status": "live"
222254
},
223255
"black-forest-labs/FLUX.1-Depth-dev": {
256+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
224257
"providerId": "black-forest-labs/FLUX.1-depth",
225258
"status": "live"
259+
},
260+
"tag-filter=base_model:adapter:stabilityai/stable-diffusion-xl-base-1.0,lora": {
261+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
262+
"status": "live",
263+
"providerId": "sdxl-lora-mutualized",
264+
"adapterType": "lora",
265+
"tags": [
266+
"base_model:adapter:stabilityai/stable-diffusion-xl-base-1.0",
267+
"lora"
268+
]
226269
}
227270
},
228271
"conversational": {
229272
"deepseek-ai/DeepSeek-R1": {
273+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
230274
"providerId": "deepseek-ai/DeepSeek-R1",
231275
"status": "live"
232276
}
233277
},
234278
"text-generation": {
235279
"meta-llama/Llama-2-70b-hf": {
280+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
236281
"providerId": "meta-llama/Llama-2-70b-hf",
237282
"status": "live"
238283
},
239284
"mistralai/Mixtral-8x7B-v0.1": {
285+
"_id": "xxxxxxxxxxxxxxxxxxxxxxxx",
240286
"providerId": "mistralai/Mixtral-8x7B-v0.1",
241287
"status": "live"
242288
}
@@ -264,9 +310,11 @@ provide the cost for each request via an HTTP API you host on your end.
264310
We ask that you expose an API that supports an HTTP POST request.
265311
The body of the request is a JSON-encoded object containing a list of request IDs for which we
266312
request the cost.
313+
The authentication system should be the same as your Inference service; for example, a bearer token.
267314

268315
```http
269316
POST {your URL here}
317+
Authorization: {authentication info, e.g. "Bearer token"}
270318
Content-Type: application/json
271319
272320
{
@@ -297,7 +345,7 @@ Content-Type: application/json
297345

298346
### Price Unit
299347

300-
We require the price to be an **integer** number of **nano-USDs** (10^-9 USD).
348+
We require the price to be a **non-negative integer** number of **nano-USDs** (10^-9 USD).
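
For example, a price in USD can be converted to this unit as follows. This is a minimal sketch; the function name `usd_to_nano_usd` is hypothetical, and `Decimal` is used so that binary floating-point rounding cannot produce a non-integer result.

```python
from decimal import Decimal

NANO_PER_USD = Decimal(10) ** 9  # 1 USD = 10^9 nano-USD


def usd_to_nano_usd(amount_usd):
    """Convert a USD amount (string or number) to a non-negative integer of nano-USDs."""
    amount = Decimal(str(amount_usd))
    if amount < 0:
        raise ValueError("price must be non-negative")
    nano = amount * NANO_PER_USD
    if nano != nano.to_integral_value():
        raise ValueError("price is finer than 1 nano-USD")
    return int(nano)


print(usd_to_nano_usd("0.000123"))  # 123000
print(usd_to_nano_usd("1.5"))       # 1500000000
```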
301349

302350
### How to define the request ID
303351
