Commit b98a6be

Merge branch 'main' into list-providers

2 parents: ad77758 + 308baa8

43 files changed: 248 additions, 227 deletions

docs/inference-providers/_toctree.yml
Lines changed: 2 additions & 0 deletions

@@ -29,6 +29,8 @@
     title: Feature Extraction
   - local: tasks/text-to-image
     title: Text to Image
+  - local: tasks/text-to-video
+    title: Text to Video
   - title: Other Tasks
     sections:
     - local: tasks/audio-classification

docs/inference-providers/index.md
Lines changed: 1 addition & 1 deletion

@@ -179,6 +179,6 @@ console.log(chatCompletion.choices[0].message);
 In this introduction, we've covered the basics of Inference Providers. To learn more about this service, check out our guides and API Reference:
 - [Pricing and Billing](./pricing): everything you need to know about billing.
 - [Hub integration](./hub-integration): how is Inference Providers integrated with the Hub?
-- [Register as an Inference Provider](./register-as-a-provider.md): everything about how to become an official partner.
+- [Register as an Inference Provider](./register-as-a-provider): everything about how to become an official partner.
 - [Hub API](./hub-api): high-level API for Inference Providers.
 - [API Reference](./tasks/index): learn more about the parameters and task-specific settings.

docs/inference-providers/tasks/audio-classification.md
Lines changed: 5 additions & 10 deletions

@@ -46,6 +46,11 @@ No snippet available for this task.
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The input audio data as a base64-encoded string. If no `parameters` are provided, you can also provide the audio data as a raw bytes payload. |
@@ -54,16 +59,6 @@ No snippet available for this task.
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;top_k** | _integer_ | When specified, limits the output to the top K most probable classes. |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |
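
The added header table is the same across every task page in this commit. As a concrete illustration (not part of the diff), a request carrying that header might look like the sketch below; the router endpoint and model ID are assumptions, while the header shape and the raw-bytes payload option come from the documented spec.

```ts
// Sketch of an audio-classification request; endpoint and model ID are hypothetical.
import { readFile } from "node:fs/promises";

const audio = await readFile("sample.flac"); // raw bytes payload, allowed when no `parameters` are sent

const response = await fetch(
  "https://router.huggingface.co/hf-inference/models/<model-id>", // assumed route
  {
    method: "POST",
    // Authorization header per the new Headers table above
    headers: { Authorization: `Bearer ${process.env.HF_TOKEN}` },
    body: audio,
  }
);
console.log(await response.json());
```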

docs/inference-providers/tasks/automatic-speech-recognition.md
Lines changed: 5 additions & 10 deletions

@@ -48,6 +48,11 @@ Explore all available models and find the one that suits you best [here](https:/
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The input audio data as a base64-encoded string. If no `parameters` are provided, you can also provide the audio data as a raw bytes payload. |
@@ -72,16 +77,6 @@ Explore all available models and find the one that suits you best [here](https:/
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;use_cache** | _boolean_ | Whether the model should use the past last key/values attentions to speed up decoding |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |

docs/inference-providers/tasks/chat-completion.md
Lines changed: 5 additions & 10 deletions

@@ -79,6 +79,11 @@ conversational />
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **frequency_penalty** | _number_ | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
@@ -140,16 +145,6 @@ conversational />
 | **top_p** | _number_ | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 Output type depends on the `stream` input parameter.
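
Tying the documented payload fields together, a chat-completion call might look like the following sketch; the OpenAI-compatible route and model ID are assumptions, while `frequency_penalty`, `top_p`, and `stream` come straight from the Payload table.

```ts
// Sketch of a chat-completion request using fields from the Payload table.
const response = await fetch(
  "https://router.huggingface.co/v1/chat/completions", // assumed route
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`, // per the new Headers table
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "<model-id>", // hypothetical
      messages: [{ role: "user", content: "What is the capital of France?" }],
      frequency_penalty: 0.5, // number between -2.0 and 2.0
      top_p: 0.9,             // nucleus sampling
      stream: false,          // the response type depends on this flag
    }),
  }
);
const chatCompletion = await response.json();
console.log(chatCompletion.choices[0].message);
```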

docs/inference-providers/tasks/feature-extraction.md
Lines changed: 5 additions & 10 deletions

@@ -47,6 +47,11 @@ Explore all available models and find the one that suits you best [here](https:/
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _unknown_ | One of the following: |
@@ -58,16 +63,6 @@ Explore all available models and find the one that suits you best [here](https:/
 | **truncation_direction** | _enum_ | Possible values: Left, Right. |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |
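
For feature extraction, `inputs` is typed `_unknown_` ("One of the following"), typically a single string or a list of strings. A sketch under the same endpoint assumptions as above; only `truncation_direction` is taken from the visible spec.

```ts
// Sketch of a feature-extraction request; endpoint and model ID are hypothetical.
const response = await fetch(
  "https://router.huggingface.co/hf-inference/models/<model-id>", // assumed route
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: "Today is a sunny day.",
      truncation_direction: "Right", // enum: Left or Right, per the spec
    }),
  }
);
console.log(await response.json()); // embedding values; shape depends on the model
```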

docs/inference-providers/tasks/fill-mask.md
Lines changed: 9 additions & 11 deletions

@@ -31,14 +31,22 @@ Explore all available models and find the one that suits you best [here](https:/
 ### Using the API
 
 
-No snippet available for this task.
+<InferenceSnippet
+    pipeline=fill-mask
+    providersMapping={ {"hf-inference":{"modelId":"Rostlab/prot_bert","providerModelId":"Rostlab/prot_bert"}} }
+/>
 
 
 
 ### API specification
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The text with masked tokens |
@@ -47,16 +55,6 @@ No snippet available for this task.
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;targets** | _string[]_ | When passed, the model will limit the scores to the passed targets instead of looking up in the whole vocabulary. If the provided targets are not in the model vocab, they will be tokenized and the first resulting token will be used (with a warning, and that might be slower). |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |
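
The new snippet maps fill-mask to Rostlab/prot_bert on hf-inference. A raw-HTTP equivalent might look like the sketch below; the mask token and amino-acid input format are assumptions about that model, while the `targets` parameter comes from the Payload table.

```ts
// Sketch of a fill-mask request against the snippet's model (Rostlab/prot_bert).
const response = await fetch(
  "https://router.huggingface.co/hf-inference/models/Rostlab/prot_bert", // assumed route
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: "D L I P T S S K L V V [MASK] D T S L Q V K K A", // assumed input/mask format
      parameters: { targets: ["L", "V"] }, // optional: limit scores to these tokens
    }),
  }
);
console.log(await response.json());
```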

docs/inference-providers/tasks/image-classification.md
Lines changed: 5 additions & 10 deletions

@@ -44,6 +44,11 @@ Explore all available models and find the one that suits you best [here](https:/
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. |
@@ -52,16 +57,6 @@ Explore all available models and find the one that suits you best [here](https:/
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;top_k** | _integer_ | When specified, limits the output to the top K most probable classes. |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |
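
Because the spec allows raw bytes only when no `parameters` are sent, passing `top_k` forces the base64 form. A sketch under the same endpoint assumptions as the earlier examples:

```ts
// Sketch of an image-classification request using base64 inputs plus `top_k`.
import { readFile } from "node:fs/promises";

const image = await readFile("cat.jpg");

const response = await fetch(
  "https://router.huggingface.co/hf-inference/models/<model-id>", // assumed route
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: image.toString("base64"), // base64-encoded image, per the Payload table
      parameters: { top_k: 3 },         // keep only the 3 most probable classes
    }),
  }
);
console.log(await response.json());
```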

docs/inference-providers/tasks/image-segmentation.md
Lines changed: 5 additions & 10 deletions

@@ -43,6 +43,11 @@ Explore all available models and find the one that suits you best [here](https:/
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. |
@@ -53,16 +58,6 @@ Explore all available models and find the one that suits you best [here](https:/
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;threshold** | _number_ | Probability threshold to filter out predicted masks. |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |

docs/inference-providers/tasks/image-to-image.md
Lines changed: 5 additions & 10 deletions

@@ -46,6 +46,11 @@ Explore all available models and find the one that suits you best [here](https:/
 
 #### Request
 
+| Headers | | |
+| :--- | :--- | :--- |
+| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). |
+
+
 | Payload | | |
 | :--- | :--- | :--- |
 | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. |
@@ -59,16 +64,6 @@ Explore all available models and find the one that suits you best [here](https:/
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;height*** | _integer_ | |
 
 
-Some options can be configured by passing headers to the Inference API. Here are the available headers:
-
-| Headers | | |
-| :--- | :--- | :--- |
-| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). |
-| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). |
-| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). |
-
-For more information about Inference API headers, check out the parameters [guide](../parameters).
-
 #### Response
 
 | Body | |
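
The deeply indented `height*` row indicates a nested size object inside `parameters`; its parent key is not visible in this diff, so `target_size` below is a guess, not a documented name. A sketch:

```ts
// Sketch of an image-to-image request; `target_size` is an assumed name for the
// nested object that holds the documented `height*` field.
import { readFile } from "node:fs/promises";

const image = await readFile("input.png");

const response = await fetch(
  "https://router.huggingface.co/hf-inference/models/<model-id>", // assumed route
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: image.toString("base64"), // base64 form, since parameters are provided
      parameters: {
        target_size: { width: 512, height: 512 }, // assumed nesting
      },
    }),
  }
);
console.log(await response.json());
```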
