6 changes: 3 additions & 3 deletions .github/workflows/api_inference_build_documentation.yml
@@ -3,7 +3,7 @@ name: Build Inference API documentation
 on:
   push:
     paths:
-      - "docs/api-inference/**"
+      - "docs/inference-providers/**"
     branches:
       - main
 
@@ -13,8 +13,8 @@ jobs:
     with:
       commit_sha: ${{ github.sha }}
       package: hub-docs
-      package_name: api-inference
-      path_to_docs: hub-docs/docs/api-inference/
+      package_name: inference-providers
+      path_to_docs: hub-docs/docs/inference-providers/
       additional_args: --not_python_module
     secrets:
       hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
6 changes: 3 additions & 3 deletions .github/workflows/api_inference_build_pr_documentation.yml
@@ -3,7 +3,7 @@ name: Build Inference API PR Documentation
 on:
   pull_request:
     paths:
-      - "docs/api-inference/**"
+      - "docs/inference-providers/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -16,6 +16,6 @@ jobs:
       commit_sha: ${{ github.event.pull_request.head.sha }}
       pr_number: ${{ github.event.number }}
       package: hub-docs
-      package_name: api-inference
-      path_to_docs: hub-docs/docs/api-inference/
+      package_name: inference-providers
+      path_to_docs: hub-docs/docs/inference-providers/
       additional_args: --not_python_module
14 changes: 7 additions & 7 deletions .github/workflows/api_inference_generate_documentation.yml
@@ -23,23 +23,23 @@ jobs:
         with:
           run_install: |
             - recursive: true
-              cwd: ./scripts/api-inference
+              cwd: ./scripts/inference-providers
               args: [--frozen-lockfile]
-          package_json_file: ./scripts/api-inference/package.json
+          package_json_file: ./scripts/inference-providers/package.json
       - name: Update huggingface/tasks package
-        working-directory: ./scripts/api-inference
+        working-directory: ./scripts/inference-providers
         run: |
           pnpm update @huggingface/tasks@latest
       # Generate
       - name: Generate API inference documentation
         run: pnpm run generate
-        working-directory: ./scripts/api-inference
+        working-directory: ./scripts/inference-providers
 
       # Check changes
       - name: Check changes
         run: |
           git diff --name-only > changed_files.txt
-          if grep -v -E "^(scripts/api-inference/package.json|scripts/api-inference/pnpm-lock.yaml)$" changed_files.txt | grep -q '.'; then
+          if grep -v -E "^(scripts/inference-providers/package.json|scripts/inference-providers/pnpm-lock.yaml)$" changed_files.txt | grep -q '.'; then
             echo "changes_detected=true" >> $GITHUB_ENV
           else
             echo "changes_detected=false" >> $GITHUB_ENV
@@ -58,13 +58,13 @@ jobs:
         with:
           token: ${{ secrets.TOKEN_INFERENCE_SYNC_BOT }}
           commit-message: Update API inference documentation (automated)
-          branch: update-api-inference-docs-automated-pr
+          branch: update-inference-providers-docs-automated-pr
           delete-branch: true
           title: "[Bot] Update API inference documentation"
           body: |
             This PR automatically upgrades the `@huggingface/tasks` package and regenerates the API inference documentation by running:
             ```sh
-            cd scripts/api-inference
+            cd scripts/inference-providers
             pnpm update @huggingface/tasks@latest
             pnpm run generate
             ```
@@ -10,7 +10,7 @@ jobs:
   build:
     uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main
     with:
-      package_name: api-inference
+      package_name: inference-providers
     secrets:
       hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
       comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
7 changes: 4 additions & 3 deletions docs/hub/billing.md
@@ -25,9 +25,10 @@ Private repository storage above the [included storage](./storage-limits) will b
 
 The PRO subscription unlocks additional features for users, including:
 
-- Higher free tier for the Serverless Inference API and when consuming ZeroGPU Spaces
-- Higher [storage capacity](./storage-limits) for private repositories
+- Higher tier for ZeroGPU Spaces usage
 - Ability to create ZeroGPU Spaces and use Dev Mode
+- Included credits for [Inference Providers](/docs/inference-providers/)
+- Higher [storage capacity](./storage-limits) for private repositories
 - Ability to write Social Posts and Community Blogs
 - Leverage the Dataset Viewer on private datasets
 
@@ -48,7 +49,7 @@ It is billed with the renewal invoices of your PRO or Enterprise Hub subscriptio
 
 ## Compute Services on the Hub
 
-We also directly provide compute services with [Spaces](./spaces), [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) and the [Serverless Inference API](https://huggingface.co/docs/api-inference/index).
+We also directly provide compute services with [Spaces](./spaces), [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) and [Inference Providers](https://huggingface.co/docs/inference-providers/index).
 
 While most of our compute services have a comprehensive free tier, users and organizations can pay to access more powerful hardware accelerators.
14 changes: 7 additions & 7 deletions docs/hub/models-inference.md
@@ -1,9 +1,9 @@
-# Serverless Inference API
+# Inference Providers
 
-Please refer to [Serverless Inference API Documentation](https://huggingface.co/docs/api-inference) for detailed information.
+Please refer to the [Inference Providers Documentation](https://huggingface.co/docs/inference-providers) for detailed information.
 
 
-## What technology do you use to power the Serverless Inference API?
+## What technology do you use to power the HF-Inference API?
 
 For 🤗 Transformers models, [Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) power the API.
 
@@ -14,13 +14,13 @@ On top of `Pipelines` and depending on the model type, there are several product
 
 For models from [other libraries](./models-libraries), the API uses [Starlette](https://www.starlette.io) and runs in [Docker containers](https://github.com/huggingface/api-inference-community/tree/main/docker_images). Each library defines the implementation of [different pipelines](https://github.com/huggingface/api-inference-community/tree/main/docker_images/sentence_transformers/app/pipelines).
 
-## How can I turn off the Serverless Inference API for my model?
+## How can I turn off the HF-Inference API for my model?
 
 Specify `inference: false` in your model card's metadata.
 
 ## Why don't I see an inference widget, or why can't I use the API?
 
-For some tasks, there might not be support in the Serverless Inference API, and, hence, there is no widget.
+For some tasks, there might not be support in the HF-Inference API, and, hence, there is no widget.
 For all libraries (except 🤗 Transformers), there is a [library-to-tasks.ts file](https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/library-to-tasks.ts) of supported tasks in the API. When a model repository has a task that is not supported by the repository library, the repository has `inference: false` by default.
 
 ## Can I send large volumes of requests? Can I get accelerated APIs?
@@ -31,6 +31,6 @@ If you are interested in accelerated inference, higher volumes of requests, or a
 
 You can check your usage in the [Inference Dashboard](https://ui.endpoints.huggingface.co/endpoints). The dashboard shows both your serverless and dedicated endpoints usage.
 
-## Is there programmatic access to the Serverless Inference API?
+## Is there programmatic access to the HF-Inference API?
 
-Yes, the `huggingface_hub` library has a client wrapper documented [here](https://huggingface.co/docs/huggingface_hub/how-to-inference).
+Yes, the `huggingface_hub` library has a client wrapper documented [here](https://huggingface.co/docs/huggingface_hub/guides/inference).
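
Editor's note on the `inference: false` setting mentioned in this file: it lives in the YAML front matter of a model repo's README.md. A minimal sketch (the `license` and `pipeline_tag` values are placeholders, not part of this PR):

```yaml
---
license: apache-2.0                 # placeholder
pipeline_tag: text-classification  # placeholder
inference: false                   # opts this model out of the hosted inference widget/API
---
```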
2 changes: 1 addition & 1 deletion docs/hub/models-the-hub.md
@@ -2,7 +2,7 @@
 
 ## What is the Model Hub?
 
-The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage the [Serverless Inference API](./models-inference) or [Inference Endpoints](https://huggingface.co/docs/inference-endpoints). to use models in production settings.
+The Model Hub is where the members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. Download pre-trained models with the [`huggingface_hub` client library](https://huggingface.co/docs/huggingface_hub/index), with 🤗 [`Transformers`](https://huggingface.co/docs/transformers/index) for fine-tuning and other usages or with any of the over [15 integrated libraries](./models-libraries). You can even leverage [Inference Providers](/docs/inference-providers/) or [Inference Endpoints](https://huggingface.co/docs/inference-endpoints) to use models in production settings.
 
 You can refer to the following video for a guide on navigating the Model Hub:
 
8 changes: 5 additions & 3 deletions docs/hub/models-widgets.md
@@ -168,9 +168,9 @@ Here are some links to examples:
 - `table-question-answering`, for instance [`google/tapas-base-finetuned-wtq`](https://huggingface.co/google/tapas-base-finetuned-wtq)
 - `sentence-similarity`, for instance [`osanseviero/full-sentence-distillroberta2`](/osanseviero/full-sentence-distillroberta2)
 
-## How can I control my model's widget Inference API parameters?
+## How can I control my model's widget HF-Inference API parameters?
 
-Generally, the Inference API for a model uses the default pipeline settings associated with each task. But if you'd like to change the pipeline's default settings and specify additional inference parameters, you can configure the parameters directly through the model card metadata. Refer [here](https://huggingface.co/docs/api-inference/detailed_parameters) for some of the most commonly used parameters associated with each task.
+Generally, the HF-Inference API for a model uses the default pipeline settings associated with each task. But if you'd like to change the pipeline's default settings and specify additional inference parameters, you can configure the parameters directly through the model card metadata. Refer [here](https://huggingface.co/docs/inference-providers/detailed_parameters) for some of the most commonly used parameters associated with each task.
 
 For example, if you want to specify an aggregation strategy for a NER task in the widget:
 
@@ -188,4 +188,6 @@ inference:
     temperature: 0.7
 ```
 
-The Serverless Inference API allows you to send HTTP requests to models in the Hugging Face Hub programmatically. ⚡⚡ Learn more about it by reading the [Inference API documentation](./models-inference). Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
+Inference Providers allows you to send HTTP requests to models in the Hugging Face Hub programmatically. It is an abstraction layer on top of external providers. ⚡⚡ Learn more about it by reading the [
+Inference Providers documentation](/docs/inference-providers).
+Finally, you can also deploy all those models to dedicated [Inference Endpoints](https://huggingface.co/docs/inference-endpoints).
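
Editor's note: the NER aggregation-strategy example this hunk alludes to takes the same shape as the `temperature` snippet shown above. A minimal sketch of the model card front matter (the strategy value is illustrative):

```yaml
---
inference:
  parameters:
    aggregation_strategy: "none"  # how the NER pipeline groups sub-word tokens into entities
---
```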
2 changes: 1 addition & 1 deletion docs/hub/oauth.md
@@ -35,7 +35,7 @@ The currently supported scopes are:
 - `read-repos`: Get read access to the user's personal repos.
 - `write-repos`: Get write/read access to the user's personal repos.
 - `manage-repos`: Get full access to the user's personal repos. Also grants repo creation and deletion.
-- `inference-api`: Get access to the [Inference API](https://huggingface.co/docs/api-inference/index), you will be able to make inference requests on behalf of the user.
+- `inference-api`: Get access to the [Inference API](https://huggingface.co/docs/inference-providers/index), you will be able to make inference requests on behalf of the user.
 - `write-discussions`: Open discussions and Pull Requests on behalf of the user as well as interact with discussions (including reactions, posting/editing comments, closing discussions, ...). To open Pull Requests on private repos, you need to request the `read-repos` scope as well.
 
 All other information is available in the [OpenID metadata](https://huggingface.co/.well-known/openid-configuration).
2 changes: 1 addition & 1 deletion docs/hub/spaces-oauth.md
@@ -81,7 +81,7 @@ Those scopes are optional and can be added by setting `hf_oauth_scopes` in your
 - `read-repos`: Get read access to the user's personal repos.
 - `write-repos`: Get write/read access to the user's personal repos.
 - `manage-repos`: Get full access to the user's personal repos. Also grants repo creation and deletion.
-- `inference-api`: Get access to the [Inference API](https://huggingface.co/docs/api-inference/index), you will be able to make inference requests on behalf of the user.
+- `inference-api`: Get access to the [Inference API](https://huggingface.co/docs/inference-providers/index), you will be able to make inference requests on behalf of the user.
 - `write-discussions`: Open discussions and Pull Requests on behalf of the user as well as interact with discussions (including reactions, posting/editing comments, closing discussions, ...). To open Pull Requests on private repos, you need to request the `read-repos` scope as well.
 
 ## Accessing organization resources
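
Editor's note: a Space requests these optional scopes through `hf_oauth_scopes` in its README.md front matter, as this file's hunk header mentions. A minimal sketch (title and sdk are placeholders), assuming the standard `hf_oauth` configuration keys:

```yaml
---
title: My OAuth Demo       # placeholder
sdk: gradio                # placeholder
hf_oauth: true             # enable Sign-in-with-HF for this Space
hf_oauth_scopes:
  - read-repos
  - inference-api          # allows inference requests on behalf of the user
---
```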
2 changes: 1 addition & 1 deletion docs/hub/spaces-sdks-docker-langfuse.md
@@ -79,7 +79,7 @@ Langfuse maintains native integrations with many popular LLM frameworks, includi
 
 ### Example 1: Trace Calls to HF Serverless API
 
-As a simple example, here's how to trace LLM calls to the [HF Serverless API](https://huggingface.co/docs/api-inference/en/index) using the Langfuse Python SDK.
+As a simple example, here's how to trace LLM calls to the [HF Serverless API](https://huggingface.co/docs/inference-providers/en/index) using the Langfuse Python SDK.
 
 Be sure to first configure your `LANGFUSE_HOST`, `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` environment variables, and make sure you've [authenticated with your Hugging Face account](https://huggingface.co/docs/huggingface_hub/en/quick-start#authentication).
 
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
docs/inference-providers/tasks/audio-classification.md (renamed from docs/api-inference/tasks/audio-classification.md)
@@ -1,8 +1,8 @@
 <!---
 This markdown file has been generated from a script. Please do not edit it directly.
 For more details, check out:
-- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/api-inference/scripts/generate.ts
-- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/api-inference/templates/task/audio-classification.handlebars
+- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts
+- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/task/audio-classification.handlebars
 - the input jsonschema specifications used to generate the input markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/audio-classification/spec/input.json
 - the output jsonschema specifications used to generate the output markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/audio-classification/spec/output.json
 - the snippets used to generate the example:
docs/inference-providers/tasks/automatic-speech-recognition.md (renamed from docs/api-inference/tasks/automatic-speech-recognition.md)
@@ -1,8 +1,8 @@
 <!---
 This markdown file has been generated from a script. Please do not edit it directly.
 For more details, check out:
-- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/api-inference/scripts/generate.ts
-- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/api-inference/templates/task/automatic-speech-recognition.handlebars
+- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts
+- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/task/automatic-speech-recognition.handlebars
 - the input jsonschema specifications used to generate the input markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/automatic-speech-recognition/spec/input.json
 - the output jsonschema specifications used to generate the output markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/automatic-speech-recognition/spec/output.json
 - the snippets used to generate the example:
docs/inference-providers/tasks/chat-completion.md (renamed from docs/api-inference/tasks/chat-completion.md)
@@ -1,8 +1,8 @@
 <!---
 This markdown file has been generated from a script. Please do not edit it directly.
 For more details, check out:
-- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/api-inference/scripts/generate.ts
-- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/api-inference/templates/task/chat-completion.handlebars
+- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts
+- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/task/chat-completion.handlebars
 - the input jsonschema specifications used to generate the input markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/chat-completion/spec/input.json
 - the output jsonschema specifications used to generate the output markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/chat-completion/spec/output.json
 - the snippets used to generate the example:
@@ -15,7 +15,7 @@ For more details, check out:
 ## Chat Completion
 
 Generate a response given a list of messages in a conversational context, supporting both conversational Language Models (LLMs) and conversational Vision-Language Models (VLMs).
-This is a subtask of [`text-generation`](https://huggingface.co/docs/api-inference/tasks/text-generation) and [`image-text-to-text`](https://huggingface.co/docs/api-inference/tasks/image-text-to-text).
+This is a subtask of [`text-generation`](https://huggingface.co/docs/inference-providers/tasks/text-generation) and [`image-text-to-text`](https://huggingface.co/docs/inference-providers/tasks/image-text-to-text).
 
 ### Recommended models
 
docs/inference-providers/tasks/feature-extraction.md (renamed from docs/api-inference/tasks/feature-extraction.md)
@@ -1,8 +1,8 @@
 <!---
 This markdown file has been generated from a script. Please do not edit it directly.
 For more details, check out:
-- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/api-inference/scripts/generate.ts
-- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/api-inference/templates/task/feature-extraction.handlebars
+- the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts
+- the task template defining the sections in the page: https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/task/feature-extraction.handlebars
 - the input jsonschema specifications used to generate the input markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/feature-extraction/spec/input.json
 - the output jsonschema specifications used to generate the output markdown table: https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/tasks/feature-extraction/spec/output.json
 - the snippets used to generate the example: