18 changes: 10 additions & 8 deletions docs/api-inference/tasks/chat-completion.md
@@ -23,13 +23,15 @@ This is a subtask of [`text-generation`](https://huggingface.co/docs/api-inferen

- [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it): A text-generation model trained to follow instructions.
- [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions.
- [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct): Small yet powerful text generation model.
- [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft.
- [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview): A very powerful model with reasoning capabilities.
- [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct): Strong text generation model to follow instructions.
- [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code.

#### Conversational Vision-Language Models (VLMs)

- [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct): Powerful vision language model with great visual understanding and reasoning capabilities.
- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct): Strong image-text-to-text model.
- [Qwen/QVQ-72B-Preview](https://huggingface.co/Qwen/QVQ-72B-Preview): Image-text-to-text model with reasoning capabilities.

### API Playground

@@ -208,11 +210,11 @@ To use the JavaScript client, see `huggingface.js`'s [package reference](https:/

<curl>
```bash
curl 'https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-11B-Vision-Instruct/v1/chat/completions' \
curl 'https://api-inference.huggingface.co/models/Qwen/Qwen2-VL-7B-Instruct/v1/chat/completions' \
-H 'Authorization: Bearer hf_***' \
-H 'Content-Type: application/json' \
--data '{
"model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
"model": "Qwen/Qwen2-VL-7B-Instruct",
"messages": [
{
"role": "user",
@@ -262,7 +264,7 @@ messages = [
]

stream = client.chat.completions.create(
model="meta-llama/Llama-3.2-11B-Vision-Instruct",
model="Qwen/Qwen2-VL-7B-Instruct",
messages=messages,
max_tokens=500,
stream=True
@@ -300,7 +302,7 @@ messages = [
]

stream = client.chat.completions.create(
model="meta-llama/Llama-3.2-11B-Vision-Instruct",
model="Qwen/Qwen2-VL-7B-Instruct",
messages=messages,
max_tokens=500,
stream=True
@@ -323,7 +325,7 @@ const client = new HfInference("hf_***");
let out = "";

const stream = client.chatCompletionStream({
model: "meta-llama/Llama-3.2-11B-Vision-Instruct",
model: "Qwen/Qwen2-VL-7B-Instruct",
messages: [
{
role: "user",
@@ -365,7 +367,7 @@ const client = new OpenAI({
let out = "";

const stream = await client.chat.completions.create({
model: "meta-llama/Llama-3.2-11B-Vision-Instruct",
model: "Qwen/Qwen2-VL-7B-Instruct",
messages: [
{
role: "user",
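The hunks above only swap the example model ID, so the request body itself is never shown in full. A minimal sketch of what such a body might look like for a vision-language model, assuming the OpenAI-style message format with `text` and `image_url` content parts; the helper name `build_vlm_chat_payload`, the prompt, and the image URL are hypothetical and not taken from the docs:

```python
import json


def build_vlm_chat_payload(model, prompt, image_url, max_tokens=500):
    """Assemble a chat-completion request body with one text part and one image part.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "stream": True,
    }


payload = build_vlm_chat_payload(
    model="Qwen/Qwen2-VL-7B-Instruct",
    prompt="Describe this image in one sentence.",
    image_url="https://example.com/cat.png",  # hypothetical placeholder URL
)
print(json.dumps(payload, indent=2))
```

Keeping the model ID as a parameter means a model swap like the one in this diff touches a single argument rather than every snippet.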
7 changes: 3 additions & 4 deletions docs/api-inference/tasks/fill-mask.md
@@ -24,7 +24,6 @@ For more details about the `fill-mask` task, check out its [dedicated page](http

### Recommended models

- [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased): The famous BERT model.
- [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base): A multilingual model trained on 100 languages.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=fill-mask&sort=trending).
@@ -36,7 +35,7 @@ Explore all available models and find the one that suits you best [here](https:/

<curl>
```bash
curl https://api-inference.huggingface.co/models/google-bert/bert-base-uncased \
curl https://api-inference.huggingface.co/models/FacebookAI/xlm-roberta-base \
-X POST \
-d '{"inputs": "The answer to the universe is [MASK]."}' \
-H 'Content-Type: application/json' \
@@ -48,7 +47,7 @@ curl https://api-inference.huggingface.co/models/google-bert/bert-base-uncased \
```py
import requests

API_URL = "https://api-inference.huggingface.co/models/google-bert/bert-base-uncased"
API_URL = "https://api-inference.huggingface.co/models/FacebookAI/xlm-roberta-base"
headers = {"Authorization": "Bearer hf_***"}

def query(payload):
@@ -67,7 +66,7 @@ To use the Python client, see `huggingface_hub`'s [package reference](https://hu
```js
async function query(data) {
const response = await fetch(
"https://api-inference.huggingface.co/models/google-bert/bert-base-uncased",
"https://api-inference.huggingface.co/models/FacebookAI/xlm-roberta-base",
{
headers: {
Authorization: "Bearer hf_***",
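One thing to watch when the example model changes: the mask placeholder is tokenizer-specific. BERT checkpoints use `[MASK]`, while XLM-RoBERTa checkpoints use `<mask>`, so the input string may need to change along with the URL. A minimal sketch, assuming a small hand-written lookup table (`MASK_TOKENS` and `build_fill_mask_payload` are hypothetical names; in practice the token could be read from the model's tokenizer config instead):

```python
# Hypothetical lookup table; only the two models named in this diff are listed.
MASK_TOKENS = {
    "google-bert/bert-base-uncased": "[MASK]",
    "FacebookAI/xlm-roberta-base": "<mask>",
}


def build_fill_mask_payload(model_id, template):
    """Fill the `{mask}` slot of `template` with the mask token for `model_id`."""
    mask = MASK_TOKENS[model_id]  # raises KeyError for models not in the table
    return {"inputs": template.format(mask=mask)}


template = "The answer to the universe is {mask}."
bert_payload = build_fill_mask_payload("google-bert/bert-base-uncased", template)
xlmr_payload = build_fill_mask_payload("FacebookAI/xlm-roberta-base", template)
print(bert_payload)
print(xlmr_payload)
```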
10 changes: 5 additions & 5 deletions docs/api-inference/tasks/image-text-to-text.md
@@ -24,8 +24,8 @@ For more details about the `image-text-to-text` task, check out its [dedicated p

### Recommended models

- [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct): Powerful vision language model with great visual understanding and reasoning capabilities.
- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct): Strong image-text-to-text model.
- [Qwen/QVQ-72B-Preview](https://huggingface.co/Qwen/QVQ-72B-Preview): Image-text-to-text model with reasoning capabilities.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=image-text-to-text&sort=trending).

@@ -36,7 +36,7 @@ Explore all available models and find the one that suits you best [here](https:/

<curl>
```bash
curl https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-11B-Vision-Instruct \
curl https://api-inference.huggingface.co/models/Qwen/Qwen2-VL-7B-Instruct \
-X POST \
-d '{"inputs": "Can you please let us know more details about your "}' \
-H 'Content-Type: application/json' \
@@ -54,7 +54,7 @@ client = InferenceClient(api_key="hf_***")
messages = "\"Can you please let us know more details about your \""

stream = client.chat.completions.create(
model="meta-llama/Llama-3.2-11B-Vision-Instruct",
model="Qwen/Qwen2-VL-7B-Instruct",
messages=messages,
max_tokens=500,
stream=True
@@ -76,7 +76,7 @@ client = OpenAI(
messages = "\"Can you please let us know more details about your \""

stream = client.chat.completions.create(
model="meta-llama/Llama-3.2-11B-Vision-Instruct",
model="Qwen/Qwen2-VL-7B-Instruct",
messages=messages,
max_tokens=500,
stream=True
@@ -93,7 +93,7 @@ To use the Python client, see `huggingface_hub`'s [package reference](https://hu
```js
async function query(data) {
const response = await fetch(
"https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-11B-Vision-Instruct",
"https://api-inference.huggingface.co/models/Qwen/Qwen2-VL-7B-Instruct",
{
headers: {
Authorization: "Bearer hf_***",
3 changes: 1 addition & 2 deletions docs/api-inference/tasks/image-to-image.md
@@ -29,7 +29,6 @@ For more details about the `image-to-image` task, check out its [dedicated page]

### Recommended models

- [timbrooks/instruct-pix2pix](https://huggingface.co/timbrooks/instruct-pix2pix): A model that takes an image and an instruction to edit the image.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=image-to-image&sort=trending).

@@ -49,7 +48,7 @@ No snippet available for this task.
| **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. |
| **parameters** | _object_ | |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;guidance_scale** | _number_ | For diffusion models. A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;negative_prompt** | _string[]_ | One or several prompt to guide what NOT to include in image generation. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;negative_prompt** | _string_ | One prompt to guide what NOT to include in image generation. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;num_inference_steps** | _integer_ | For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;target_size** | _object_ | The size in pixel of the output image. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;width*** | _integer_ | |
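Since this page has no snippet for the task, a minimal sketch of a request body built from the table rows shown above may help. It assumes the base64 form of `inputs` and treats `negative_prompt` as a single string, matching the type change in this hunk; the helper name and the sample values are hypothetical, and `target_size` is left out because its full schema is not visible here:

```python
import base64


def build_image_to_image_payload(image_bytes, negative_prompt=None,
                                 guidance_scale=None, num_inference_steps=None):
    """Build a JSON-serializable body: base64 `inputs` plus optional `parameters`."""
    if negative_prompt is not None and not isinstance(negative_prompt, str):
        # The spec in this diff changes the type from string[] to string.
        raise TypeError("negative_prompt must be a single string, not a list")

    parameters = {}
    if negative_prompt is not None:
        parameters["negative_prompt"] = negative_prompt
    if guidance_scale is not None:
        parameters["guidance_scale"] = guidance_scale
    if num_inference_steps is not None:
        parameters["num_inference_steps"] = num_inference_steps

    payload = {"inputs": base64.b64encode(image_bytes).decode("ascii")}
    if parameters:
        payload["parameters"] = parameters
    return payload


# Stand-in bytes; a real call would read an image file in binary mode.
payload = build_image_to_image_payload(b"abc", negative_prompt="blurry",
                                       num_inference_steps=25)
print(payload)
```

Callers that previously passed a list of negative prompts would need to pick or join them into one string before building the body.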
1 change: 0 additions & 1 deletion docs/api-inference/tasks/question-answering.md
@@ -26,7 +26,6 @@ For more details about the `question-answering` task, check out its [dedicated p

- [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2): A robust baseline model for most question answering domains.
- [distilbert/distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad): Small yet robust model that can answer questions.
- [google/tapas-base-finetuned-wtq](https://huggingface.co/google/tapas-base-finetuned-wtq): A special model that can answer questions from tables.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=question-answering&sort=trending).

7 changes: 3 additions & 4 deletions docs/api-inference/tasks/table-question-answering.md
@@ -24,7 +24,6 @@ For more details about the `table-question-answering` task, check out its [dedic

### Recommended models

- [google/tapas-base-finetuned-wtq](https://huggingface.co/google/tapas-base-finetuned-wtq): A robust table question answering model.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=table-question-answering&sort=trending).

@@ -35,7 +34,7 @@ Explore all available models and find the one that suits you best [here](https:/

<curl>
```bash
curl https://api-inference.huggingface.co/models/google/tapas-base-finetuned-wtq \
curl https://api-inference.huggingface.co/models/<REPO_ID> \
-X POST \
-d '{"inputs": { "query": "How many stars does the transformers repository have?", "table": { "Repository": ["Transformers", "Datasets", "Tokenizers"], "Stars": ["36542", "4512", "3934"], "Contributors": ["651", "77", "34"], "Programming language": [ "Python", "Python", "Rust, Python and NodeJS" ] } }}' \
-H 'Content-Type: application/json' \
@@ -47,7 +46,7 @@ curl https://api-inference.huggingface.co/models/google/tapas-base-finetuned-wtq
```py
import requests

API_URL = "https://api-inference.huggingface.co/models/google/tapas-base-finetuned-wtq"
API_URL = "https://api-inference.huggingface.co/models/<REPO_ID>"
headers = {"Authorization": "Bearer hf_***"}

def query(payload):
@@ -78,7 +77,7 @@ To use the Python client, see `huggingface_hub`'s [package reference](https://hu
```js
async function query(data) {
const response = await fetch(
"https://api-inference.huggingface.co/models/google/tapas-base-finetuned-wtq",
"https://api-inference.huggingface.co/models/<REPO_ID>",
{
headers: {
Authorization: "Bearer hf_***",
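With the recommended model removed, the snippets now carry a literal `<REPO_ID>` placeholder that must be replaced before use. A minimal sketch of filling it in, assuming the same URL layout and `inputs` shape as the snippets above; `build_table_qa_request` and the repository name are hypothetical:

```python
API_BASE = "https://api-inference.huggingface.co/models"


def build_table_qa_request(repo_id, query, table):
    """Return (url, json_body) for a table question answering call."""
    if not repo_id or "<" in repo_id or ">" in repo_id:
        # Guard against sending the unreplaced `<REPO_ID>` placeholder.
        raise ValueError("replace <REPO_ID> with a real model repository ID")
    return f"{API_BASE}/{repo_id}", {"inputs": {"query": query, "table": table}}


table = {
    "Repository": ["Transformers", "Datasets", "Tokenizers"],
    "Stars": ["36542", "4512", "3934"],
}
url, body = build_table_qa_request(
    "your-org/your-table-qa-model",  # hypothetical repository ID
    "How many stars does the transformers repository have?",
    table,
)
print(url)
print(body)
```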
4 changes: 3 additions & 1 deletion docs/api-inference/tasks/text-generation.md
@@ -28,8 +28,10 @@ For more details about the `text-generation` task, check out its [dedicated page

- [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it): A text-generation model trained to follow instructions.
- [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions.
- [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct): Small yet powerful text generation model.
- [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft.
- [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview): A very powerful model with reasoning capabilities.
- [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct): Strong text generation model to follow instructions.
- [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=text-generation&sort=trending).

2 changes: 1 addition & 1 deletion docs/api-inference/tasks/text-to-image.md
@@ -115,7 +115,7 @@ To use the JavaScript client, see `huggingface.js`'s [package reference](https:/
| **inputs*** | _string_ | The input text data (sometimes called "prompt") |
| **parameters** | _object_ | |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;guidance_scale** | _number_ | A higher guidance scale value encourages the model to generate images closely linked to the text prompt, but values too high may cause saturation and other artifacts. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;negative_prompt** | _string[]_ | One or several prompt to guide what NOT to include in image generation. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;negative_prompt** | _string_ | One prompt to guide what NOT to include in image generation. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;num_inference_steps** | _integer_ | The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;target_size** | _object_ | The size in pixel of the output image |
| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;width*** | _integer_ | |
1 change: 0 additions & 1 deletion docs/api-inference/tasks/zero-shot-classification.md
@@ -25,7 +25,6 @@ For more details about the `zero-shot-classification` task, check out its [dedic
### Recommended models

- [facebook/bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli): Powerful zero-shot text classification model.
- [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7): Powerful zero-shot multilingual text classification model that can accomplish multiple tasks.

Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=zero-shot-classification&sort=trending).

2 changes: 1 addition & 1 deletion scripts/api-inference/package.json
@@ -14,7 +14,7 @@
"author": "",
"license": "ISC",
"dependencies": {
"@huggingface/tasks": "^0.13.14",
"@huggingface/tasks": "^0.14.0",
"@types/node": "^22.5.0",
"handlebars": "^4.7.8",
"node": "^20.17.0",
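The `@huggingface/tasks` bump needs an explicit edit because, under npm's caret semantics, a range on a `0.x` version pins the minor component: `^0.13.14` means `>=0.13.14 <0.14.0`, so it cannot resolve to `0.14.0`. A simplified sketch of that rule, assuming plain `MAJOR.MINOR.PATCH` versions with no prerelease or build tags (the function names are hypothetical, and this is not npm's actual implementation):

```python
def parse(version):
    """Split 'X.Y.Z' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def caret_upper_bound(base):
    """Exclusive upper bound: bump the left-most non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def caret_allows(range_spec, version):
    """True if `version` satisfies a caret range such as '^0.13.14'."""
    base = parse(range_spec.lstrip("^"))
    return base <= parse(version) < caret_upper_bound(base)


print(caret_allows("^0.13.14", "0.14.0"))  # old range vs. new release
print(caret_allows("^0.14.0", "0.14.0"))   # new range vs. new release
print(caret_allows("^22.5.0", "22.9.1"))   # major >= 1 allows minor bumps
```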
10 changes: 5 additions & 5 deletions scripts/api-inference/pnpm-lock.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
