Commit b1fad49
Update API inference documentation (automated) (#1639)
Co-authored-by: hanouticelina <[email protected]>
1 parent 76161b6
6 files changed (+23, -24 lines)

docs/api-inference/tasks/chat-completion.md (9 additions, 10 deletions)

@@ -25,13 +25,12 @@ This is a subtask of [`text-generation`](https://huggingface.co/docs/api-inferen
 - [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B): Smaller variant of one of the most powerful models.
 - [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions.
 - [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft.
-- [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview): A very powerful model with reasoning capabilities.
 - [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code.
 - [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1): Powerful reasoning based open large language model.
 
 #### Conversational Vision-Language Models (VLMs)
 
-- [Qwen/QVQ-72B-Preview](https://huggingface.co/Qwen/QVQ-72B-Preview): Image-text-to-text model with reasoning capabilities.
+- [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct): Strong image-text-to-text model.
 
 ### API Playground
 
@@ -214,11 +213,11 @@ To use the JavaScript client, see `huggingface.js`'s [package reference](https:/
 
 <curl>
 ```bash
-curl 'https://router.huggingface.co/hf-inference/models/Qwen/QVQ-72B-Preview/v1/chat/completions' \
+curl 'https://router.huggingface.co/hf-inference/models/Qwen/Qwen2.5-VL-7B-Instruct/v1/chat/completions' \
     -H 'Authorization: Bearer hf_***' \
     -H 'Content-Type: application/json' \
     --data '{
-    "model": "Qwen/QVQ-72B-Preview",
+    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
     "messages": [
         {
             "role": "user",
@@ -271,7 +270,7 @@ messages = [
 ]
 
 stream = client.chat.completions.create(
-    model="Qwen/QVQ-72B-Preview",
+    model="Qwen/Qwen2.5-VL-7B-Instruct",
     messages=messages,
     max_tokens=500,
     stream=True
@@ -309,7 +308,7 @@ messages = [
 ]
 
 stream = client.chat.completions.create(
-    model="Qwen/QVQ-72B-Preview",
+    model="Qwen/Qwen2.5-VL-7B-Instruct",
     messages=messages,
     max_tokens=500,
     stream=True
@@ -332,7 +331,7 @@ const client = new HfInference("hf_***");
 let out = "";
 
 const stream = client.chatCompletionStream({
-    model: "Qwen/QVQ-72B-Preview",
+    model: "Qwen/Qwen2.5-VL-7B-Instruct",
     messages: [
         {
             role: "user",
@@ -375,7 +374,7 @@ const client = new OpenAI({
 let out = "";
 
 const stream = await client.chat.completions.create({
-    model: "Qwen/QVQ-72B-Preview",
+    model: "Qwen/Qwen2.5-VL-7B-Instruct",
     messages: [
         {
             role: "user",
@@ -458,7 +457,7 @@ To use the JavaScript client, see `huggingface.js`'s [package reference](https:/
 | **stop** | _string[]_ | Up to 4 sequences where the API will stop generating further tokens. |
 | **stream** | _boolean_ | |
 | **stream_options** | _object_ | |
-| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;include_usage*** | _boolean_ | If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value. |
+| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;include_usage** | _boolean_ | If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value. |
 | **temperature** | _number_ | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both. |
 | **tool_choice** | _unknown_ | One of the following: |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(#1)** | _enum_ | Possible values: auto. |
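The `include_usage` contract described in the table row above can be sketched with hypothetical chunk payloads (no live API call is made; the chunk dicts below are illustrative stand-ins for the streamed JSON, not real API output):

```python
# Sketch of stream_options.include_usage: every regular chunk carries
# "usage": None, and one extra final chunk (sent before data: [DONE])
# has an empty "choices" array plus the aggregate token counts.
chunks = [
    {"choices": [{"delta": {"content": "Hello"}}], "usage": None},
    {"choices": [{"delta": {"content": " world"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}},
]

text = ""
usage = None
for chunk in chunks:
    for choice in chunk["choices"]:
        text += choice["delta"]["content"]
    # The usage-bearing chunk is the one with no choices.
    if chunk["usage"] is not None and not chunk["choices"]:
        usage = chunk["usage"]

print(text)                   # -> Hello world
print(usage["total_tokens"])  # -> 7
```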
@@ -542,7 +541,7 @@ For more information about streaming, check out [this guide](https://huggingface
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tool_call_id** | _string_ | |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(#2)** | _object_ | |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;role** | _string_ | |
-| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tool_calls** | _object_ | |
+| **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tool_calls** | _object[]_ | |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;function** | _object_ | |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;arguments** | _string_ | |
 | **&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name** | _string_ | |
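The request body from the chat-completion curl example in this file can be built in plain Python, which makes the payload shape easy to check. This is a sketch matching the diff above; the user message content is a hypothetical placeholder, and no request is sent:

```python
import json

# Build the same JSON body the curl example posts to
# /v1/chat/completions, with the model ID this commit introduces.
payload = {
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [
        # Hypothetical example message; the real docs pass image + text content.
        {"role": "user", "content": "Describe this image in one sentence."}
    ],
    "max_tokens": 500,
    "stream": True,
}

body = json.dumps(payload)
print(json.loads(body)["model"])  # -> Qwen/Qwen2.5-VL-7B-Instruct
```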

docs/api-inference/tasks/image-classification.md (1 addition, 0 deletions)

@@ -26,6 +26,7 @@ For more details about the `image-classification` task, check out its [dedicated
 
 - [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224): A strong image classification model.
 - [facebook/deit-base-distilled-patch16-224](https://huggingface.co/facebook/deit-base-distilled-patch16-224): A robust image classification model.
+- [facebook/convnext-large-224](https://huggingface.co/facebook/convnext-large-224): A strong image classification model.
 
 Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=image-classification&sort=trending).
 

docs/api-inference/tasks/image-text-to-text.md (5 additions, 5 deletions)

@@ -24,7 +24,7 @@ For more details about the `image-text-to-text` task, check out its [dedicated p
 
 ### Recommended models
 
-- [Qwen/QVQ-72B-Preview](https://huggingface.co/Qwen/QVQ-72B-Preview): Image-text-to-text model with reasoning capabilities.
+- [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct): Strong image-text-to-text model.
 
 Explore all available models and find the one that suits you best [here](https://huggingface.co/models?inference=warm&pipeline_tag=image-text-to-text&sort=trending).
 
@@ -35,7 +35,7 @@ Explore all available models and find the one that suits you best [here](https:/
 
 <curl>
 ```bash
-curl https://router.huggingface.co/hf-inference/models/Qwen/QVQ-72B-Preview \
+curl https://router.huggingface.co/hf-inference/models/Qwen/Qwen2.5-VL-7B-Instruct \
     -X POST \
     -d '{"inputs": "Can you please let us know more details about your "}' \
     -H 'Content-Type: application/json' \
@@ -56,7 +56,7 @@ client = InferenceClient(
 messages = "\"Can you please let us know more details about your \""
 
 stream = client.chat.completions.create(
-    model="Qwen/QVQ-72B-Preview",
+    model="Qwen/Qwen2.5-VL-7B-Instruct",
     messages=messages,
     max_tokens=500,
     stream=True
@@ -78,7 +78,7 @@ client = OpenAI(
 messages = "\"Can you please let us know more details about your \""
 
 stream = client.chat.completions.create(
-    model="Qwen/QVQ-72B-Preview",
+    model="Qwen/Qwen2.5-VL-7B-Instruct",
     messages=messages,
     max_tokens=500,
     stream=True
@@ -95,7 +95,7 @@ To use the Python client, see `huggingface_hub`'s [package reference](https://hu
 ```js
 async function query(data) {
     const response = await fetch(
-        "https://router.huggingface.co/hf-inference/models/Qwen/QVQ-72B-Preview",
+        "https://router.huggingface.co/hf-inference/models/Qwen/Qwen2.5-VL-7B-Instruct",
         {
             headers: {
                 Authorization: "Bearer hf_***",
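The raw HTTP call shown in the JavaScript hunk above can be sketched with Python's standard library. This only constructs the request object to show its shape; nothing is sent over the network, and `hf_***` is the same placeholder token the docs use:

```python
from urllib.request import Request

# Build (but do not send) the POST request from the fetch example above.
url = "https://router.huggingface.co/hf-inference/models/Qwen/Qwen2.5-VL-7B-Instruct"
req = Request(
    url,
    data=b'{"inputs": "Can you please let us know more details about your "}',
    headers={
        "Authorization": "Bearer hf_***",       # placeholder token
        "Content-Type": "application/json",
    },
    method="POST",
)

print(req.full_url)      # the router endpoint for the model
print(req.get_method())  # -> POST
```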

docs/api-inference/tasks/text-generation.md (1 addition, 2 deletions)

@@ -16,7 +16,7 @@ For more details, check out:
 
 Generate text based on a prompt.
 
-If you are interested in a Chat Completion task, which generates a response based on a list of messages, check out the [`chat-completion`](./chat-completion) task.
+If you are interested in a Chat Completion task, which generates a response based on a list of messages, check out the [`chat-completion`](./chat_completion) task.
 
 <Tip>
 
@@ -30,7 +30,6 @@ For more details about the `text-generation` task, check out its [dedicated page
 - [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B): Smaller variant of one of the most powerful models.
 - [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions.
 - [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft.
-- [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview): A very powerful model with reasoning capabilities.
 - [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code.
 - [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1): Powerful reasoning based open large language model.
 

scripts/api-inference/package.json (1 addition, 1 deletion)

@@ -15,7 +15,7 @@
   "license": "ISC",
   "dependencies": {
     "@huggingface/inference": "^3.5.0",
-    "@huggingface/tasks": "^0.17.0",
+    "@huggingface/tasks": "^0.17.4",
     "@types/node": "^22.5.0",
     "handlebars": "^4.7.8",
     "node": "^20.17.0",

scripts/api-inference/pnpm-lock.yaml (6 additions, 6 deletions)

(Generated lockfile; diff not rendered.)
