# How to be registered as an inference provider on the Hub?

<Tip>

Want to be an Inference Provider on the Hub? Please reach out to us!

</Tip>

This guide details each of the following steps and provides implementation guidance.

1. **Implement standard task APIs** - Follow our task API schemas for compatibility (see [Prerequisites](#1-prerequisites))
2. **Submit a PR for JS client integration** - Add your provider to [huggingface.js](https://github.com/huggingface/huggingface.js/tree/main/packages/inference) (see [JS Client Integration](#2-js-client-integration))
3. **Register model mappings** - Use our Model Mapping API to connect your models to Hub models (see [Model Mapping API](#3-model-mapping-api))
4. **Implement a billing endpoint** - Provide an API for billing (see [Billing](#4-billing))
5. **Submit a PR for Python client integration** - Add your provider to [huggingface_hub](https://github.com/huggingface/huggingface_hub) (see [Python client integration](#5-python-client-integration))
6. **Provide an icon** - Submit an SVG icon for your provider.
7. **Create documentation** - Add documentation and do some communication on your side.
8. **Add a documentation page** - Add a provider-specific page in the Hub documentation.

## 1. Prerequisites

<Tip>

If your implementation strictly follows the OpenAI API for LLMs and VLMs, you may be able to skip most of this section. In that case, simply open a PR on [huggingface.js](https://github.com/huggingface/huggingface.js/tree/main/packages/inference) to register.

</Tip>

This is the client that powers our Inference widgets on model pages, and is the blueprint implementation downstream (for Python SDK, to generate code snippets, etc.).

### What is a Task

[…]

<Tip>

[…] which are tagged as "conversational".

</Tip>

### Task API schema

For each task type, we enforce an API schema to make it easier for end users to use different […]

For example, you can find the expected schema for Text to Speech here: [https://…]
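
As a purely illustrative sketch of the common shape (not the normative spec for any task — the parameter name below is invented), most task schemas revolve around an `inputs` field plus an optional `parameters` object. A text-to-speech request might look like:

```json
{
  "inputs": "The text to synthesize.",
  "parameters": {
    "speaking_rate": 1.0
  }
}
```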

## 2. JS Client Integration

Before proceeding with the next steps, ensure you've implemented the necessary code to integrate with the JS client and thoroughly tested your implementation. Here are the steps to follow:

### Implement the provider helper

Create a new file under `packages/inference/src/providers/{provider_name}.ts` and copy-paste the following snippet.

[…]

Implement the methods that require custom handling. Check out the base implementation to check the default behavior. If you don't need to override a method, just remove it.

If the provider supports multiple tasks that require different implementations, create dedicated subclasses for each task, following the pattern used in existing provider implementations, e.g. the [Together AI provider implementation](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/together.ts).
99
115
100
-
For text-generation and conversational tasks, one can just inherit from BaseTextGenerationTask and BaseConversationalTask respectively (defined in [providerHelper.ts]((https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/providerHelper.ts))) and override the methods if needed. Examples can be found in [Cerebras](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/cerebras.ts) or [Fireworks](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/fireworks.ts) provider implementations.
116
+
For text-generation and conversational tasks, you can just inherit from BaseTextGenerationTask and BaseConversationalTask respectively (defined in [providerHelper.ts]((https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/providerHelper.ts))) and override the methods if needed. Examples can be found in [Cerebras](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/cerebras.ts) or [Fireworks](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/fireworks.ts) provider implementations.
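
As an illustration, here is a minimal sketch of what such a subclass can look like. The base-class names come from the paragraph above; the provider id, base URL, and exact constructor signature are assumptions to be checked against `providerHelper.ts`:

```ts
// packages/inference/src/providers/my-provider.ts — illustrative sketch only
import { BaseConversationalTask, BaseTextGenerationTask } from "./providerHelper";

export class MyProviderConversationalTask extends BaseConversationalTask {
	constructor() {
		// Hypothetical provider id and base URL
		super("my-provider", "https://api.my-provider.com");
	}
	// Override route/payload/response helpers only if your API deviates
	// from the OpenAI-compatible defaults provided by the base class.
}

export class MyProviderTextGenerationTask extends BaseTextGenerationTask {
	constructor() {
		super("my-provider", "https://api.my-provider.com");
	}
}
```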

### Register the provider
103
119
104
120
Go to [packages/inference/src/lib/getProviderHelper.ts](https://github.com/huggingface/huggingface.js//blob/main/packages/inference/src/lib/getProviderHelper.ts) and add your provider to `PROVIDERS`. Please try to respect alphabetical order.
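
As a rough sketch of what that registration can look like (the exact type of the `PROVIDERS` map lives in that file; the provider and class names below are the hypothetical ones from the previous sketch):

```ts
// packages/inference/src/lib/getProviderHelper.ts — illustrative sketch only
import * as MyProvider from "../providers/my-provider";

export const PROVIDERS = {
	// ... existing providers, kept in alphabetical order ...
	"my-provider": {
		conversational: new MyProvider.MyProviderConversationalTask(),
		"text-generation": new MyProvider.MyProviderTextGenerationTask(),
	},
};
```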

## 3. Model Mapping API

Congratulations! You now have a JS implementation that can successfully make inference calls on your infra! Time to integrate with the Hub!

The first step is to use the Model Mapping API to register which HF models are supported.

<Tip>

To proceed with this step, we have to enable your account server-side. Make sure you have an organization on the Hub for your enterprise.

</Tip>

### Register a mapping item

[…]

## 4. Billing

For routed requests (see figure below), i.e. when users authenticate via HF, our intent is that our users only pay the standard provider API rates. There's no additional markup from us; we just pass through the provider costs directly. More details about the pricing structure can be found on the [pricing page](./pricing.md).

For LLM providers, a workaround some people use is to extract the numbers of input and output tokens from the responses and multiply them by a hardcoded pricing table – this is quite brittle, so we are reluctant to do this.

We propose an easier way to figure out this cost and charge it to our users, by asking you to provide the cost for each request via an HTTP API you host on your end.
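
The authoritative contract is defined in the HTTP API specs below. Purely to illustrate the idea — every route and field name in this sketch is hypothetical — such an endpoint receives identifiers for routed requests and returns the cost you incurred for each one:

```ts
// Hypothetical billing endpoint — routes and field names are invented for illustration
import { createServer } from "node:http";

// In production, this would be backed by your own usage/metering database
const costsNanoUsd = new Map<string, number>([["req-123", 4_200_000]]);

createServer((req, res) => {
	if (req.method === "POST" && req.url === "/usage") {
		let body = "";
		req.on("data", (chunk) => (body += chunk));
		req.on("end", () => {
			const { requestIds } = JSON.parse(body) as { requestIds: string[] };
			// Return one cost entry per request id we were asked about
			const requests = requestIds.map((id) => ({
				requestId: id,
				costNanoUsd: costsNanoUsd.get(id) ?? 0,
			}));
			res.setHeader("content-type", "application/json");
			res.end(JSON.stringify({ requests }));
		});
	} else {
		res.statusCode = 404;
		res.end();
	}
}).listen(8080);
```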
### HTTP API Specs

[…]

## 5. Python client integration

<Tip>

Before adding a new provider to the `huggingface_hub` Python library, make sure […]

</Tip>

### Implement the provider helper

Create a new file under `src/huggingface_hub/inference/_providers/{provider_name}.py` and copy-paste the following snippet.

Implement the methods that require custom handling. Check out the base implementation to check the default behavior. If you don't need to override a method, just remove it. At least one of `_prepare_payload_as_dict` or `_prepare_payload_as_bytes` must be overwritten.
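
The snippet itself is not reproduced in this excerpt. As a rough, hypothetical sketch of its shape — the class name and overridable methods are named on this page, while the import path, constructor arguments, and payload logic are assumptions:

```python
# src/huggingface_hub/inference/_providers/{provider_name}.py — illustrative sketch only
from typing import Any, Dict, Optional

from ._common import TaskProviderHelper  # assumed location of the base helper


class MyNewProviderTaskProviderHelper(TaskProviderHelper):
    def __init__(self):
        # Hypothetical provider name, base URL, and task
        super().__init__(provider="my-provider", base_url="https://api.my-provider.com", task="text-to-image")

    def _prepare_payload_as_dict(self, inputs: Any, parameters: Dict, *args: Any, **kwargs: Any) -> Optional[Dict]:
        # Build the JSON payload your API expects from the user's inputs and parameters
        return {"prompt": inputs, **parameters}
```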

### Register the provider

- Go to [src/huggingface_hub/inference/_providers/__init__.py](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_providers/__init__.py) and add your provider to `PROVIDER_T` and `PROVIDERS`. Please try to respect alphabetical order.
- Go to [src/huggingface_hub/inference/_client.py](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_client.py) and update the docstring in `InferenceClient.__init__` to document your provider.

### Add tests

- Go to [tests/test_inference_providers.py](https://github.com/huggingface/huggingface_hub/blob/main/tests/test_inference_providers.py) and add static tests for overridden methods.
## FAQ

**Question:** […]

**Answer:** The default sort is by total number of requests routed by HF over the last 7 days. This order defines which provider will be used in priority by the widget on the model page (but the user's order takes precedence).