From 192799915b53197983ef0e8d06d2b6bcb9d4025c Mon Sep 17 00:00:00 2001 From: SBrandeis Date: Mon, 16 Jun 2025 21:41:26 +0200 Subject: [PATCH] update docs --- .../providers/featherless-ai.md | 2 +- .../providers/hf-inference.md | 81 ++++--- .../inference-providers/providers/together.md | 4 +- .../tasks/chat-completion.md | 2 +- .../tasks/text-classification.md | 36 ++-- .../tasks/text-generation.md | 204 +++++++++--------- 6 files changed, 180 insertions(+), 149 deletions(-) diff --git a/docs/inference-providers/providers/featherless-ai.md b/docs/inference-providers/providers/featherless-ai.md index 7deb14344..18d1f377e 100644 --- a/docs/inference-providers/providers/featherless-ai.md +++ b/docs/inference-providers/providers/featherless-ai.md @@ -56,7 +56,7 @@ Find out more about Chat Completion (VLM) [here](../tasks/chat-completion). diff --git a/docs/inference-providers/providers/hf-inference.md b/docs/inference-providers/providers/hf-inference.md index ee240d0b6..4ec33aae0 100644 --- a/docs/inference-providers/providers/hf-inference.md +++ b/docs/inference-providers/providers/hf-inference.md @@ -38,146 +38,163 @@ If you are interested in deploying models to a dedicated and autoscaling infrast ## Supported tasks + ### Automatic Speech Recognition Find out more about Automatic Speech Recognition [here](../tasks/automatic_speech_recognition). + ### Chat Completion (LLM) Find out more about Chat Completion (LLM) [here](../tasks/chat-completion). + ### Chat Completion (VLM) Find out more about Chat Completion (VLM) [here](../tasks/chat-completion). + ### Feature Extraction Find out more about Feature Extraction [here](../tasks/feature_extraction). + ### Fill Mask Find out more about Fill Mask [here](../tasks/fill_mask). + ### Image Classification Find out more about Image Classification [here](../tasks/image_classification). + ### Image Segmentation Find out more about Image Segmentation [here](../tasks/image_segmentation). 
+ ### Object Detection Find out more about Object Detection [here](../tasks/object_detection). + ### Question Answering Find out more about Question Answering [here](../tasks/question_answering). + ### Summarization Find out more about Summarization [here](../tasks/summarization). + ### Table Question Answering Find out more about Table Question Answering [here](../tasks/table_question_answering). + ### Text Classification Find out more about Text Classification [here](../tasks/text_classification). + ### Text Generation Find out more about Text Generation [here](../tasks/text_generation). + ### Text To Image Find out more about Text To Image [here](../tasks/text_to_image). + ### Token Classification Find out more about Token Classification [here](../tasks/token_classification). + ### Translation Find out more about Translation [here](../tasks/translation). + diff --git a/docs/inference-providers/providers/together.md b/docs/inference-providers/providers/together.md index 426de4744..8ccce4cfa 100644 --- a/docs/inference-providers/providers/together.md +++ b/docs/inference-providers/providers/together.md @@ -44,7 +44,7 @@ Find out more about Chat Completion (LLM) [here](../tasks/chat-completion). @@ -64,7 +64,7 @@ Find out more about Text Generation [here](../tasks/text_generation). 
diff --git a/docs/inference-providers/tasks/chat-completion.md b/docs/inference-providers/tasks/chat-completion.md index 4f20bfefc..bc4ed7f70 100644 --- a/docs/inference-providers/tasks/chat-completion.md +++ b/docs/inference-providers/tasks/chat-completion.md @@ -61,7 +61,7 @@ The API supports: diff --git a/docs/inference-providers/tasks/text-classification.md b/docs/inference-providers/tasks/text-classification.md index 63f6551d7..fd2827abb 100644 --- a/docs/inference-providers/tasks/text-classification.md +++ b/docs/inference-providers/tasks/text-classification.md @@ -31,30 +31,36 @@ Explore all available models and find the one that suits you best [here](https:/ ### Using the API + + + ### API specification #### Request -| Headers | | | -| :---------------- | :------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Headers | | | +| :--- | :--- | :--- | | **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | -| Payload | | | -| :-------------------------------------------------------------------- | :-------- | :-------------------------------------------------------------------- | -| **inputs\*** | _string_ | The text to classify | -| **parameters** | _object_ | | -| **        function_to_apply** | _enum_ | Possible values: sigmoid, softmax, none. | -| **        top_k** | _integer_ | When specified, limits the output to the top K most probable classes. 
| + +| Payload | | | +| :--- | :--- | :--- | +| **inputs*** | _string_ | The text to classify. | +| **parameters** | _object_ | | +| **        function_to_apply** | _enum_ | Possible values: sigmoid, softmax, none. | +| **        top_k** | _integer_ | When specified, limits the output to the top K most probable classes. | + #### Response -| Body | | -| :-------------------------------------------------------- | :--------- | :----------------------------- | -| **(array)** | _object[]_ | Output is an array of objects. | -| **        label** | _string_ | The predicted class label. | -| **        score** | _number_ | The corresponding probability. | +| Body | | | +| :--- | :--- | :--- | +| **(array)** | _object[]_ | Output is an array of objects. | +| **        label** | _string_ | The predicted class label. | +| **        score** | _number_ | The corresponding probability. | + diff --git a/docs/inference-providers/tasks/text-generation.md b/docs/inference-providers/tasks/text-generation.md index 224f27bb9..145eb357d 100644 --- a/docs/inference-providers/tasks/text-generation.md +++ b/docs/inference-providers/tasks/text-generation.md @@ -30,6 +30,7 @@ For more details about the `text-generation` task, check out its [dedicated page - [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B): Smaller variant of one of the most powerful models. - [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions. - [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft. +- [Qwen/Qwen2.5-7B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-1M): Strong conversational model that supports very long instructions. - [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code. 
- [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1): Powerful reasoning based open large language model. @@ -37,120 +38,127 @@ Explore all available models and find the one that suits you best [here](https:/ ### Using the API + + + ### API specification #### Request -| Headers | | | -| :---------------- | :------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Headers | | | +| :--- | :--- | :--- | | **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | -| Payload | | | -| :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| **inputs\*** | _string_ | | -| **parameters** | _object_ | | -| **        adapter_id** | _string_ | Lora adapter id | -| **        best_of** | _integer_ | Generate best_of sequences and return the one if the highest token logprobs. | -| **        decoder_input_details** | _boolean_ | Whether to return decoder input token logprobs and ids. | -| **        details** | _boolean_ | Whether to return generation details. | -| **        do_sample** | _boolean_ | Activate logits sampling. 
| -| **        frequency_penalty** | _number_ | The parameter for frequency penalty. 1.0 means no penalty Penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | -| **        grammar** | _unknown_ | One of the following: | -| **                 (#1)** | _object_ | | -| **                        type\*** | _enum_ | Possible values: json. | -| **                        value\*** | _unknown_ | A string that represents a [JSON Schema](https://json-schema.org/). JSON Schema is a declarative language that allows to annotate JSON documents with types and descriptions. | -| **                 (#2)** | _object_ | | -| **                        type\*** | _enum_ | Possible values: regex. | -| **                        value\*** | _string_ | | -| **                 (#3)** | _object_ | | -| **                        type\*** | _enum_ | Possible values: json_schema. | -| **                        value\*** | _object_ | | -| **                                name** | _string_ | Optional name identifier for the schema | -| **                                schema\*** | _unknown_ | The actual JSON schema definition | -| **        max_new_tokens** | _integer_ | Maximum number of tokens to generate. | -| **        repetition_penalty** | _number_ | The parameter for repetition penalty. 1.0 means no penalty. See [this paper](https://arxiv.org/pdf/1909.05858.pdf) for more details. | -| **        return_full_text** | _boolean_ | Whether to prepend the prompt to the generated text | -| **        seed** | _integer_ | Random sampling seed. | -| **        stop** | _string[]_ | Stop generating tokens if a member of `stop` is generated. | -| **        temperature** | _number_ | The value used to module the logits distribution. | -| **        top_k** | _integer_ | The number of highest probability vocabulary tokens to keep for top-k-filtering. 
| -| **        top_n_tokens** | _integer_ | The number of highest probability vocabulary tokens to keep for top-n-filtering. | -| **        top_p** | _number_ | Top-p value for nucleus sampling. | -| **        truncate** | _integer_ | Truncate inputs tokens to the given size. | -| **        typical_p** | _number_ | Typical Decoding mass See [Typical Decoding for Natural Language Generation](https://arxiv.org/abs/2202.00666) for more information. | -| **        watermark** | _boolean_ | Watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226). | -| **stream** | _boolean_ | | + +| Payload | | | +| :--- | :--- | :--- | +| **inputs*** | _string_ | | +| **parameters** | _object_ | | +| **        adapter_id** | _string_ | LoRA adapter ID. | +| **        best_of** | _integer_ | Generate best_of sequences and return the one with the highest token logprobs. | +| **        decoder_input_details** | _boolean_ | Whether to return decoder input token logprobs and ids. | +| **        details** | _boolean_ | Whether to return generation details. | +| **        do_sample** | _boolean_ | Activate logits sampling. | +| **        frequency_penalty** | _number_ | The parameter for frequency penalty. 1.0 means no penalty. Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | +| **        grammar** | _unknown_ | One of the following: | +| **                 (#1)** | _object_ | | +| **                        type*** | _enum_ | Possible values: json. | +| **                        value*** | _unknown_ | A string that represents a [JSON Schema](https://json-schema.org/). JSON Schema is a declarative language that allows you to annotate JSON documents with types and descriptions. | +| **                 (#2)** | _object_ | | +| **                        type*** | _enum_ | Possible values: regex. 
| +| **                        value*** | _string_ | | +| **                 (#3)** | _object_ | | +| **                        type*** | _enum_ | Possible values: json_schema. | +| **                        value*** | _object_ | | +| **                                name** | _string_ | Optional name identifier for the schema. | +| **                                schema*** | _unknown_ | The actual JSON schema definition. | +| **        max_new_tokens** | _integer_ | Maximum number of tokens to generate. | +| **        repetition_penalty** | _number_ | The parameter for repetition penalty. 1.0 means no penalty. See [this paper](https://arxiv.org/pdf/1909.05858.pdf) for more details. | +| **        return_full_text** | _boolean_ | Whether to prepend the prompt to the generated text. | +| **        seed** | _integer_ | Random sampling seed. | +| **        stop** | _string[]_ | Stop generating tokens if a member of `stop` is generated. | +| **        temperature** | _number_ | The value used to modulate the logits distribution. | +| **        top_k** | _integer_ | The number of highest probability vocabulary tokens to keep for top-k-filtering. | +| **        top_n_tokens** | _integer_ | The number of highest probability vocabulary tokens to keep for top-n-filtering. | +| **        top_p** | _number_ | Top-p value for nucleus sampling. | +| **        truncate** | _integer_ | Truncate input tokens to the given size. | +| **        typical_p** | _number_ | Typical decoding mass. See [Typical Decoding for Natural Language Generation](https://arxiv.org/abs/2202.00666) for more information. | +| **        watermark** | _boolean_ | Watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226). | +| **stream** | _boolean_ | | + #### Response Output type depends on the `stream` input parameter. 
If `stream` is `false` (default), the response will be a JSON object with the following fields: -| Body | | -| :---------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------- | :------------------------------------------------- | -| **details** | _object_ | | -| **        best_of_sequences** | _object[]_ | | -| **                finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. | -| **                generated_text** | _string_ | | -| **                generated_tokens** | _integer_ | | -| **                prefill** | _object[]_ | | -| **                        id** | _integer_ | | -| **                        logprob** | _number_ | | -| **                        text** | _string_ | | -| **                seed** | _integer_ | | -| **                tokens** | _object[]_ | | -| **                        id** | _integer_ | | -| **                        logprob** | _number_ | | -| **                        special** | _boolean_ | | -| **                        text** | _string_ | | -| **                top_tokens** | _array[]_ | | -| **                        id** | _integer_ | | -| **                        logprob** | _number_ | | -| **                        special** | _boolean_ | | -| **                        text** | _string_ | | -| **        finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. 
| -| **        generated_tokens** | _integer_ | | -| **        prefill** | _object[]_ | | -| **                id** | _integer_ | | -| **                logprob** | _number_ | | -| **                text** | _string_ | | -| **        seed** | _integer_ | | -| **        tokens** | _object[]_ | | -| **                id** | _integer_ | | -| **                logprob** | _number_ | | -| **                special** | _boolean_ | | -| **                text** | _string_ | | -| **        top_tokens** | _array[]_ | | -| **                id** | _integer_ | | -| **                logprob** | _number_ | | -| **                special** | _boolean_ | | -| **                text** | _string_ | | -| **generated_text** | _string_ | | +| Body | | | +| :--- | :--- | :--- | +| **details** | _object_ | | +| **        best_of_sequences** | _object[]_ | | +| **                finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. | +| **                generated_text** | _string_ | | +| **                generated_tokens** | _integer_ | | +| **                prefill** | _object[]_ | | +| **                        id** | _integer_ | | +| **                        logprob** | _number_ | | +| **                        text** | _string_ | | +| **                seed** | _integer_ | | +| **                tokens** | _object[]_ | | +| **                        id** | _integer_ | | +| **                        logprob** | _number_ | | +| **                        special** | _boolean_ | | +| **                        text** | _string_ | | +| **                top_tokens** | _array[]_ | | +| **                        id** | _integer_ | | +| **                        logprob** | _number_ | | +| **                        special** | _boolean_ | | +| **                        text** | _string_ | | +| **        finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. 
| +| **        generated_tokens** | _integer_ | | +| **        prefill** | _object[]_ | | +| **                id** | _integer_ | | +| **                logprob** | _number_ | | +| **                text** | _string_ | | +| **        seed** | _integer_ | | +| **        tokens** | _object[]_ | | +| **                id** | _integer_ | | +| **                logprob** | _number_ | | +| **                special** | _boolean_ | | +| **                text** | _string_ | | +| **        top_tokens** | _array[]_ | | +| **                id** | _integer_ | | +| **                logprob** | _number_ | | +| **                special** | _boolean_ | | +| **                text** | _string_ | | +| **generated_text** | _string_ | | + If `stream` is `true`, generated tokens are returned as a stream, using Server-Sent Events (SSE). For more information about streaming, check out [this guide](https://huggingface.co/docs/text-generation-inference/conceptual/streaming). -| Body | | -| :------------------------------------------------------------------- | :--------- | :------------------------------------------------- | -| **details** | _object_ | | -| **        finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. | -| **        generated_tokens** | _integer_ | | -| **        input_length** | _integer_ | | -| **        seed** | _integer_ | | -| **generated_text** | _string_ | | -| **index** | _integer_ | | -| **token** | _object_ | | -| **        id** | _integer_ | | -| **        logprob** | _number_ | | -| **        special** | _boolean_ | | -| **        text** | _string_ | | -| **top_tokens** | _object[]_ | | -| **        id** | _integer_ | | -| **        logprob** | _number_ | | -| **        special** | _boolean_ | | -| **        text** | _string_ | | +| Body | | | +| :--- | :--- | :--- | +| **details** | _object_ | | +| **        finish_reason** | _enum_ | Possible values: length, eos_token, stop_sequence. 
| +| **        generated_tokens** | _integer_ | | +| **        input_length** | _integer_ | | +| **        seed** | _integer_ | | +| **generated_text** | _string_ | | +| **index** | _integer_ | | +| **token** | _object_ | | +| **        id** | _integer_ | | +| **        logprob** | _number_ | | +| **        special** | _boolean_ | | +| **        text** | _string_ | | +| **top_tokens** | _object[]_ | | +| **        id** | _integer_ | | +| **        logprob** | _number_ | | +| **        special** | _boolean_ | | +| **        text** | _string_ | | +
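As a sanity check of the request shape in the text-classification spec patched above, here is a minimal client-side sketch that assembles the headers and JSON payload. The helper name and the `hf_xxx` token value are illustrative, not part of the API:

```python
# Sketch: assemble a text-classification request matching the payload spec
# in the diff above. The token default is a placeholder, not a real credential.

def build_classification_request(text, top_k=None, function_to_apply=None,
                                 token="hf_xxx"):
    """Return the headers and JSON body for a text-classification call."""
    if function_to_apply not in (None, "sigmoid", "softmax", "none"):
        raise ValueError("function_to_apply must be sigmoid, softmax, or none")
    payload = {"inputs": text}  # `inputs` is the only required field
    parameters = {}
    if top_k is not None:
        parameters["top_k"] = top_k  # keep only the K most probable classes
    if function_to_apply is not None:
        parameters["function_to_apply"] = function_to_apply
    if parameters:
        payload["parameters"] = parameters
    return {"headers": {"Authorization": f"Bearer {token}"}, "json": payload}

cls_request = build_classification_request("I love this movie!", top_k=2)
```

The returned dict can be passed to an HTTP client (e.g. `requests.post(url, **cls_request)`); the response body is then the array of `{label, score}` objects described in the response table.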
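The text-generation payload spec above admits the same treatment. This sketch validates parameter names against the table before building the body; the allow-list is transcribed from the spec, and the helper name is again hypothetical:

```python
# Sketch: build a text-generation request, checking parameter names against
# the payload spec in the diff above. Helper and token are illustrative only.

ALLOWED_PARAMETERS = {
    "adapter_id", "best_of", "decoder_input_details", "details", "do_sample",
    "frequency_penalty", "grammar", "max_new_tokens", "repetition_penalty",
    "return_full_text", "seed", "stop", "temperature", "top_k",
    "top_n_tokens", "top_p", "truncate", "typical_p", "watermark",
}

def build_generation_request(prompt, stream=False, token="hf_xxx", **parameters):
    """Return the headers and JSON body for a text-generation call."""
    unknown = set(parameters) - ALLOWED_PARAMETERS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    payload = {"inputs": prompt, "stream": stream}  # `inputs` is required
    if parameters:
        payload["parameters"] = parameters
    return {"headers": {"Authorization": f"Bearer {token}"}, "json": payload}

gen_request = build_generation_request(
    "Once upon a time", max_new_tokens=64, temperature=0.7, stop=["\n\n"]
)
```

With `stream` left at `false`, the response is the single JSON object described in the first response table; setting `stream=True` switches the endpoint to the SSE output described after it.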
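Finally, when `stream` is `true` the tokens arrive as Server-Sent Events, each `data:` line carrying one JSON object shaped like the streaming response table above. A minimal parser sketch (the sample event below is hand-written to match that table, not captured from a real response):

```python
import json

def parse_sse_chunk(raw_line):
    """Parse one Server-Sent Events line from a streaming response.

    Returns the decoded event payload, or None for non-data lines
    (comments and blank keep-alives)."""
    line = raw_line.strip()
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):].strip())

# Hand-written sample event shaped like the streaming spec above.
sample = ('data: {"index": 1, "token": {"id": 42, "text": "Hello", '
          '"logprob": -0.1, "special": false}, "generated_text": null}')
event = parse_sse_chunk(sample)
```

Per the spec, `generated_text` stays `null` on intermediate events; the final event carries the full text plus the `details` object, so a consumer can accumulate `event["token"]["text"]` until it sees a non-null `generated_text`.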