Commit f9d4359 ("review suggestions")
1 parent 6e38360 commit f9d4359
File tree: 1 file changed (+42, -60 lines)

docs/inference-providers/register-as-a-provider.md

Lines changed: 42 additions & 60 deletions
@@ -1,14 +1,28 @@
 # How to be registered as an inference provider on the Hub?
 
-If you'd like to be an inference provider on the Hub, you must follow the steps outlined in this guide.
+<Tip>
+
+Want to be an Inference Provider on the Hub? Please reach out to us!
+
+</Tip>
+
+This guide details each of the following steps and provides implementation guidance.
+
+1. **Implement standard task APIs** - Follow our task API schemas for compatibility (see [Prerequisites](#1-prerequisites))
+2. **Submit a PR for JS client integration** - Add your provider to [huggingface.js](https://github.com/huggingface/huggingface.js/tree/main/packages/inference) (see [JS Client Integration](#2-js-client-integration))
+3. **Register model mappings** - Use our Model Mapping API to connect your models to Hub models (see [Model Mapping API](#3-model-mapping-api))
+4. **Implement a billing endpoint** - Provide an API for billing (see [Billing](#4-billing))
+5. **Submit a PR for Python client integration** - Add your provider to [huggingface_hub](https://github.com/huggingface/huggingface_hub) (see [Python client integration](#5-python-client-integration))
+6. **Provide an icon** - Submit an SVG icon for your provider.
+7. **Create documentation** - Add documentation and do some communication on your side.
+8. **Add a documentation page** - Add a provider-specific page in the Hub documentation.
 
-## 1. Requirements
+
+## 1. Prerequisites
 
 <Tip>
 
-If you provide inference only for LLMs and VLMs following the OpenAI API, you
-can probably skip most of this section and just open a PR on
-https://github.com/huggingface/huggingface.js/tree/main/packages/inference to add you as a provider.
+If your implementation strictly follows the OpenAI API for LLMs and VLMs, you may be able to skip most of this section. In that case, simply open a PR on [huggingface.js](https://github.com/huggingface/huggingface.js/tree/main/packages/inference) to register.
 
 </Tip>
 

@@ -17,7 +31,8 @@ inside the huggingface.js repo:
 https://github.com/huggingface/huggingface.js/tree/main/packages/inference
 
 This is the client that powers our Inference widgets on model pages, and is the blueprint
-implementation for other downstream SDKs like Python's `huggingface-hub` and other tools.
+implementation downstream (for the Python SDK, to generate code snippets, etc.).
+
 
 ### What is a Task
 
@@ -43,6 +58,7 @@ which are tagged as "conversational".
 
 </Tip>
 
+
 ### Task API schema
 
 For each task type, we enforce an API schema to make it easier for end users to use different
@@ -58,9 +74,9 @@ For example, you can find the expected schema for Text to Speech here: [https://
 
 Before proceeding with the next steps, ensure you've implemented the necessary code to integrate with the JS client and thoroughly tested your implementation. Here are the steps to follow:
 
-### 1. Implement the provider helper
+### Implement the provider helper
 
-Create a new file under packages/inference/src/providers/{provider_name}.ts and copy-paste the following snippet.
+Create a new file under `packages/inference/src/providers/{provider_name}.ts` and copy-paste the following snippet.
 
 ```ts
 import { TaskProviderHelper } from "./providerHelper";
@@ -86,7 +102,7 @@ export class MyNewProviderTask extends TaskProviderHelper {
     throw new Error("Needs to be implemented");
   }
 
-  getResponse(response: TogetherBase64ImageGeneration, outputType?: "url" | "blob"): string | Promise<Blob> {
+  getResponse(response: unknown, outputType?: "url" | "blob"): string | Promise<Blob> {
     // Return the response in the expected format.
     throw new Error("Needs to be implemented");
   }
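To make the task API schema discussed above concrete, here is a sketch of a request body shaped like the Hub's text-to-speech task schema: a required `inputs` string plus an optional `parameters` object. The optional fields are illustrative assumptions; the published schema specification is authoritative.

```python
import json

# Sketch of a text-to-speech task request body: required "inputs" plus an
# optional "parameters" object. Optional fields here are illustrative.
payload = {
    "inputs": "The answer to the universe is 42.",
    "parameters": {},  # task-specific generation options would go here
}

body = json.dumps(payload)
```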
@@ -97,27 +113,24 @@ Implement the methods that require custom handling. Check out the base implement
 
 If the provider supports multiple tasks that require different implementations, create dedicated subclasses for each task, following the pattern used in the existing providers implementation, e.g. [Together AI provider implementation](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/together.ts).
 
-For text-generation and conversational tasks, one can just inherit from BaseTextGenerationTask and BaseConversationalTask respectively (defined in [providerHelper.ts]((https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/providerHelper.ts))) and override the methods if needed. Examples can be found in [Cerebras](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/cerebras.ts) or [Fireworks](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/fireworks.ts) provider implementations.
+For text-generation and conversational tasks, you can just inherit from `BaseTextGenerationTask` and `BaseConversationalTask` respectively (defined in [providerHelper.ts](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/providerHelper.ts)) and override the methods if needed. Examples can be found in the [Cerebras](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/cerebras.ts) or [Fireworks](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/providers/fireworks.ts) provider implementations.
 
-### 2. Register the provider
+### Register the provider
 
 Go to [packages/inference/src/lib/getProviderHelper.ts](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/src/lib/getProviderHelper.ts) and add your provider to `PROVIDERS`. Please try to respect alphabetical order.
 
-### 3. Add tests
 
-Go to [packages/inference/test/InferenceClient.spec.ts](https://github.com/huggingface/huggingface.js/blob/main/packages/inference/test/InferenceClient.spec.ts) and add new tests for each task supported by your provider.
+## 3. Model Mapping API
 
+Congratulations! You now have a JS implementation that can successfully make inference calls on your infra! Time to integrate with the Hub!
 
-## 3. Model Mapping API
+The first step is to use the Model Mapping API to register which HF models are supported.
 
-Once you've verified the huggingface.js/inference client can call your models successfully, you
-can use our Model Mapping API.
+<Tip>
 
-This API lets a Partner "register" that they support model X, Y or Z on HF.
+To proceed with this step, we have to enable your account server-side. Make sure you have an organization on the Hub for your enterprise.
 
-This enables:
-- The inference widget on corresponding model pages
-- Inference compatibility throughout the HF ecosystem (Python and JS client SDKs for instance), and any downstream tool or library
+</Tip>
 
 ### Register a mapping item
 
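As a rough sketch of what registering a mapping item could look like from a partner's side: the endpoint path and payload field names below are assumptions for illustration only; follow the actual request spec given in this section.

```python
import json
import urllib.request

# Hypothetical sketch of registering a mapping item. The endpoint path and
# the payload field names are illustrative assumptions, not the official spec.
def build_mapping_payload(task: str, hf_model: str, provider_model: str) -> dict:
    """Payload linking a Hub model to the provider's own model id for a task."""
    return {"task": task, "hfModel": hf_model, "providerModel": provider_model}

def register_mapping(provider: str, payload: dict, token: str):
    req = urllib.request.Request(
        f"https://huggingface.co/api/partners/{provider}/models",  # assumed path
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example mapping (model ids are illustrative):
payload = build_mapping_payload("text-to-image", "black-forest-labs/FLUX.1-dev", "flux-dev")
```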

@@ -230,17 +243,16 @@ Here is an example of response:
 
 For routed requests (see figure below), i.e. when users authenticate via HF, our intent is that
 our users only pay the standard provider API rates. There's no additional markup from us, we
-just pass through the provider costs directly. 
+just pass through the provider costs directly.
+More details about the pricing structure can be found on the [pricing page](./pricing.md).
 
 <div class="flex justify-center">
 	<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/types_of_billing.png"/>
 </div>
 
-For LLM providers, a workaround some people use is to extract numbers of input and output
-tokens in the responses and multiply by a hardcoded pricing table – this is quite brittle, so we
-are reluctant to do this.
+
 We propose an easier way to figure out this cost and charge it to our users, by asking you to
-provide the cost for each request via an HTTP API you host on your end. 
+provide the cost for each request via an HTTP API you host on your end.
 
 ### HTTP API Specs
 
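The per-request cost reporting described above can be sketched as a simple lookup on the provider's side. Everything here (the pricing table, the token-based formula, and the choice of nano-USD as a unit) is an illustrative assumption, not the official spec that this section defines.

```python
# Illustrative sketch of the per-request cost computation a provider's billing
# API could perform. Pricing table, token counts, and the nano-USD unit
# (1 USD = 1e9 nano-USD) are assumptions for illustration.

PRICE_PER_MILLION_TOKENS_USD = {  # hypothetical price list
    "my-model": {"input": 0.10, "output": 0.40},
}

def request_cost_nano_usd(model: str, input_tokens: int, output_tokens: int) -> int:
    """Return the cost of a single request, in nano-USD."""
    prices = PRICE_PER_MILLION_TOKENS_USD[model]
    usd = (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
    return round(usd * 1_000_000_000)

cost = request_cost_nano_usd("my-model", 1_000, 1_000)
```

Returning an integer in a fixed sub-cent unit avoids floating-point rounding issues when many small requests are summed.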
@@ -313,8 +325,8 @@ Before adding a new provider to the `huggingface_hub` Python library, make sure
 
 </Tip>
 
-### 1. Implement the provider helper
-Create a new file under src/huggingface_hub/inference/_providers/{provider_name}.py and copy-paste the following snippet.
+### Implement the provider helper
+Create a new file under `src/huggingface_hub/inference/_providers/{provider_name}.py` and copy-paste the following snippet.
 
 Implement the methods that require custom handling. Check out the base implementation to check default behavior. If you don't need to override a method, just remove it. At least one of `_prepare_payload_as_dict` or `_prepare_payload_as_bytes` must be overwritten.
 
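As a rough illustration of the kind of transformation `_prepare_payload_as_dict` performs, here is a standalone sketch without the `huggingface_hub` base classes; the field names ("model", "prompt") are illustrative assumptions, not the library's actual payload format.

```python
# Standalone sketch of what a _prepare_payload_as_dict override typically does:
# turn the task inputs and parameters into the provider's JSON request body.
# Field names ("model", "prompt") are illustrative assumptions.
def prepare_payload_as_dict(inputs: str, parameters: dict, mapped_model: str) -> dict:
    payload = {"model": mapped_model, "prompt": inputs}
    # Only forward parameters the caller explicitly set.
    payload.update({k: v for k, v in parameters.items() if v is not None})
    return payload

payload = prepare_payload_as_dict("Hello!", {"temperature": 0.2, "top_p": None}, "provider/model-1")
```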
@@ -377,42 +389,12 @@ class MyNewProviderTaskProviderHelper(TaskProviderHelper):
         return super()._prepare_payload_as_bytes(inputs, parameters, mapped_model, extra_payload)
 ```
 
-### 2. Register the Provider
+### Register the Provider
 - Go to [src/huggingface_hub/inference/_providers/__init__.py](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_providers/__init__.py) and add your provider to `PROVIDER_T` and `PROVIDERS`. Please try to respect alphabetical order.
 - Go to [src/huggingface_hub/inference/_client.py](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_client.py) and update docstring in `InferenceClient.__init__` to document your provider.
 
-### 3. Add tests
+### Add tests
 - Go to [tests/test_inference_providers.py](https://github.com/huggingface/huggingface_hub/blob/main/tests/test_inference_providers.py) and add static tests for overridden methods.
-- Go to [tests/test_inference_client.py](https://github.com/huggingface/huggingface_hub/blob/main/tests/test_inference_client.py) and add VCR tests:
-
-  a. Add an entry to `_RECOMMENDED_MODELS_FOR_VCR` at the top of the test module. This contains a mapping task <> test model. model-id must be the HF model id.
-
-  ```python
-  _RECOMMENDED_MODELS_FOR_VCR = {
-      "your-provider": {
-          "task": "model-id",
-          ...
-      },
-      ...
-  }
-  ```
-
-  b. Set up authentication: to record VCR cassettes, you'll need authentication. If you are a member of the provider organization (e.g., the Replicate organization: https://huggingface.co/replicate), you can set the HF_INFERENCE_TEST_TOKEN environment variable with your HF token:
-
-  ```bash
-  export HF_INFERENCE_TEST_TOKEN="your-hf-token"
-  ```
-
-  If you're not a member but the provider is officially released on the Hub, you can set the HF_INFERENCE_TEST_TOKEN environment variable as above. If you don't have enough inference credits, we can help you record the VCR cassettes.
-
-  c. Record and commit tests. Run the tests for your provider:
-
-  ```bash
-  pytest tests/test_inference_client.py -k <provider>
-  ```
-
-  d. Commit the generated VCR cassettes with your PR.
 
 
 ## FAQ
@@ -421,4 +403,4 @@ class MyNewProviderTaskProviderHelper(TaskProviderHelper):
 
 **Answer:** The default sort is by total number of requests routed by HF over the last 7 days. This order defines which provider will be used in priority by the widget on the model page (but the user's order takes precedence).
 
-## Get in touch!
+
0 commit comments