27 changes: 25 additions & 2 deletions docs/inference-providers/_toctree.yml
@@ -6,10 +6,33 @@
title: Pricing and Billing
- local: hub-integration
title: Hub integration
- local: register-as-a-provider
title: Register as an Inference Provider
- local: security
title: Security

- title: Providers
sections:
- local: providers/cerebras
title: Cerebras
- local: providers/fal-ai
title: Fal AI
- local: providers/fireworks-ai
title: Fireworks
- local: providers/hyperbolic
title: Hyperbolic
- local: providers/hf-inference
title: HF Inference
- local: providers/nebius
title: Nebius
- local: providers/novita
title: Novita
- local: providers/replicate
title: Replicate
- local: providers/sambanova
title: SambaNova
- local: providers/together
title: Together
- title: API Reference
sections:
- local: tasks/index
18 changes: 17 additions & 1 deletion docs/inference-providers/index.md
@@ -9,6 +9,23 @@ Hugging Face Inference Providers simplify and unify how developers access and run

To learn more about the launch of Inference Providers, check out our [announcement blog post](https://huggingface.co/blog/inference-providers).

## Partners

Here is the complete list of partners integrated with Inference Providers, along with the tasks each one supports (a minimal client-side sketch follows the table):

| Provider | Chat Completion (LLM) | Chat Completion (VLM) | Feature Extraction | Text to Image | Text to Video |
| ---------------------------------------- | :-------------------: | :-------------------: | :----------------: | :-----------: | :-----------: |
| [Cerebras](./providers/cerebras) | ✅ | | | | |
| [Fal AI](./providers/fal-ai) | | | | ✅ | ✅ |
| [Fireworks](./providers/fireworks-ai) | ✅ | ✅ | | | |
| [HF Inference](./providers/hf-inference) | ✅ | ✅ | ✅ | ✅ | |
| [Hyperbolic](./providers/hyperbolic) | ✅ | ✅ | | | |
| [Nebius](./providers/nebius) | ✅ | ✅ | | ✅ | |
| [Novita](./providers/novita) | ✅ | ✅ | | | ✅ |
| [Replicate](./providers/replicate) | | | | ✅ | ✅ |
| [SambaNova](./providers/sambanova) | ✅ | | ✅ | | |
| [Together](./providers/together) | ✅ | ✅ | | ✅ | |
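
The same client code works across the providers in this table: pick a provider that supports your task, then pass its name to the client. As a rough sketch (not the exact generated snippet), here is how a chat completion request could be routed through Cerebras with the `huggingface_hub` Python client, assuming a recent release of the library and a Hugging Face token exported as `HF_TOKEN`:

```python
import os

from huggingface_hub import InferenceClient

# Route the request through Cerebras (any chat-capable provider from the
# table works the same way); authentication uses your Hugging Face token.
client = InferenceClient(
    provider="cerebras",
    api_key=os.environ["HF_TOKEN"],  # assumes HF_TOKEN is set in your environment
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

print(completion.choices[0].message.content)
```

Switching to another provider is a one-line change to the `provider` argument, as long as that provider supports the task and model you are calling.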

## Why use Inference Providers?

Inference Providers offers a fast and simple way to explore thousands of models for a variety of tasks. Whether you're experimenting with ML capabilities or building a new application, this API gives you instant access to high-performing models across multiple domains:
@@ -28,7 +45,6 @@ Inference Providers offers a fast and simple way to explore thousands of models
- **🔧 Developer-Friendly**: Simple requests, fast responses, and a consistent developer experience across Python and JavaScript clients.
- **💰 Cost-Effective**: No extra markup on provider rates.


## Inference Playground

To get started quickly with [Chat Completion models](http://huggingface.co/models?inference_provider=all&sort=trending&other=conversational), use the [Inference Playground](https://huggingface.co/playground) to easily test and compare models with your prompts.
30 changes: 30 additions & 0 deletions docs/inference-providers/providers/cerebras.md
@@ -0,0 +1,30 @@
<!---
WARNING

This markdown file has been generated from a script. Please do not edit it directly.

If you want to update the content related to cerebras's description, please edit the template file under `https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/providers/cerebras.handlebars`.

For more details, check out the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts.
--->

# Cerebras

[![Cerebras Logo](https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Cerebras_logo.svg/512px-Cerebras_logo.svg.png)](https://www.cerebras.ai/)

[![Follow us on Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg.svg)](https://huggingface.co/cerebras)

Cerebras stands alone as the world’s fastest AI inference and training platform. Organizations across fields like medical research, cryptography, energy, and agentic AI use our CS-2 and CS-3 systems to build on-premise supercomputers, while developers and enterprises everywhere can access the power of Cerebras through our pay-as-you-go cloud offerings.

## Supported tasks


### Chat Completion (LLM)

Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"cerebras":{"modelId":"meta-llama/Llama-3.3-70B-Instruct","providerModelId":"llama-3.3-70b"} } }
conversational />

50 changes: 50 additions & 0 deletions docs/inference-providers/providers/fal-ai.md
@@ -0,0 +1,50 @@
<!---
WARNING

This markdown file has been generated from a script. Please do not edit it directly.

If you want to update the content related to fal-ai's description, please edit the template file under `https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/providers/fal-ai.handlebars`.

For more details, check out the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts.
--->

# Fal

[![fal.ai logo](https://images.seeklogo.com/logo-png/61/1/fal-ai-logo-png_seeklogo-611592.png)](https://fal.ai/)

[![Follow us on Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg.svg)](https://huggingface.co/fal)

Founded in 2021 by [Burkay Gur](https://huggingface.co/burkaygur) and [Gorkem Yurtseven](https://huggingface.co/gorkemyurt), fal.ai was born out of a shared passion for AI and a desire to address the challenges in AI infrastructure observed during their tenures at Coinbase and Amazon.

## Supported tasks


### Automatic Speech Recognition

Find out more about Automatic Speech Recognition [here](../tasks/automatic_speech_recognition).

<InferenceSnippet
pipeline=automatic-speech-recognition
providersMapping={ {"fal-ai":{"modelId":"openai/whisper-large-v3","providerModelId":"fal-ai/whisper"} } }
/>
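
As an illustrative sketch of calling this task from Python (not the generated snippet itself), the request could look as follows; the audio file path is a placeholder and `HF_TOKEN` is assumed to be set in your environment:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai", api_key=os.environ["HF_TOKEN"])

# Transcribe a local audio file with Whisper served by Fal.
# "sample.flac" is a placeholder path for your own recording.
output = client.automatic_speech_recognition(
    "sample.flac",
    model="openai/whisper-large-v3",
)
print(output.text)
```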


### Text To Image

Find out more about Text To Image [here](../tasks/text_to_image).

<InferenceSnippet
pipeline=text-to-image
providersMapping={ {"fal-ai":{"modelId":"black-forest-labs/FLUX.1-dev","providerModelId":"fal-ai/flux/dev"} } }
/>
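
A comparable hand-written sketch for text-to-image, assuming `HF_TOKEN` is set and Pillow is installed (the client returns a `PIL.Image` object):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai", api_key=os.environ["HF_TOKEN"])

# Generate an image with FLUX.1-dev routed through Fal.
image = client.text_to_image(
    "An astronaut riding a horse on the moon",
    model="black-forest-labs/FLUX.1-dev",
)
image.save("astronaut.png")  # placeholder output filename
```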


### Text To Video

Find out more about Text To Video [here](../tasks/text_to_video).

<InferenceSnippet
pipeline=text-to-video
providersMapping={ {"fal-ai":{"modelId":"Wan-AI/Wan2.1-T2V-14B","providerModelId":"fal-ai/wan-t2v"} } }
/>

39 changes: 39 additions & 0 deletions docs/inference-providers/providers/fireworks-ai.md
@@ -0,0 +1,39 @@
<!---
WARNING

This markdown file has been generated from a script. Please do not edit it directly.

If you want to update the content related to fireworks-ai's description, please edit the template file under `https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/providers/fireworks-ai.handlebars`.

For more details, check out the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts.
--->

# Fireworks AI

[![fireworks.ai](https://d1.awsstatic.com/fireworks-ai-wordmark-color-dark.93b1f27fdf77899fa02afb949fb27317ee4081ad.png)](https://fireworks.ai/)

[![Follow us on Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg.svg)](https://huggingface.co/fireworks-ai)

Fireworks AI is a developer-centric platform that delivers high-performance generative AI solutions, enabling efficient deployment and fine-tuning of large language models (LLMs) and image models.

## Supported tasks


### Chat Completion (LLM)

Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"fireworks-ai":{"modelId":"deepseek-ai/DeepSeek-V3-0324","providerModelId":"accounts/fireworks/models/deepseek-v3-0324"} } }
conversational />


### Chat Completion (VLM)

Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"fireworks-ai":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"accounts/fireworks/models/llama4-scout-instruct-basic"} } }
conversational />
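
For vision-language chat, images can be passed by URL alongside the text prompt. A hedged Python sketch (the image URL below is a placeholder, and `HF_TOKEN` is assumed to be set):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="fireworks-ai", api_key=os.environ["HF_TOKEN"])

completion = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                # Placeholder URL; replace with a publicly reachable image.
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
)

print(completion.choices[0].message.content)
```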

201 changes: 201 additions & 0 deletions docs/inference-providers/providers/hf-inference.md
@@ -0,0 +1,201 @@
<!---
WARNING

This markdown file has been generated from a script. Please do not edit it directly.

If you want to update the content related to hf-inference's description, please edit the template file under `https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/providers/hf-inference.handlebars`.

For more details, check out the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts.
--->

# HF Inference

[![Hugging Face](https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo-with-title.png)](https://huggingface.co/)

[![Follow us on Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg.svg)](https://huggingface.co/hf-inference)

HF Inference is the serverless Inference API powered by Hugging Face. Before the launch of Inference Providers, this service was known as "Inference API (serverless)".
If you are interested in deploying models to a dedicated and autoscaling infrastructure managed by Hugging Face, check out [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) instead.

## Supported tasks


### Audio Classification

Find out more about Audio Classification [here](../tasks/audio_classification).

<InferenceSnippet
pipeline=audio-classification
providersMapping={ {"hf-inference":{"modelId":"ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition","providerModelId":"ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"} } }
/>


### Automatic Speech Recognition

Find out more about Automatic Speech Recognition [here](../tasks/automatic_speech_recognition).

<InferenceSnippet
pipeline=automatic-speech-recognition
providersMapping={ {"hf-inference":{"modelId":"openai/whisper-large-v3-turbo","providerModelId":"openai/whisper-large-v3-turbo"} } }
/>


### Chat Completion (LLM)

Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"Qwen/QwQ-32B","providerModelId":"Qwen/QwQ-32B"} } }
conversational />


### Chat Completion (VLM)

Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"hf-inference":{"modelId":"google/gemma-3-27b-it","providerModelId":"google/gemma-3-27b-it"} } }
conversational />


### Feature Extraction

Find out more about Feature Extraction [here](../tasks/feature_extraction).

<InferenceSnippet
pipeline=feature-extraction
providersMapping={ {"hf-inference":{"modelId":"intfloat/multilingual-e5-large-instruct","providerModelId":"intfloat/multilingual-e5-large-instruct"} } }
/>
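
For instance, a minimal Python sketch of extracting embeddings with this model (assuming `HF_TOKEN` is set; the call returns a numpy array):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Embed a sentence; the result is a numpy array of floats.
embeddings = client.feature_extraction(
    "Inference Providers route requests to many serving backends.",
    model="intfloat/multilingual-e5-large-instruct",
)
print(embeddings.shape)
```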


### Fill Mask

Find out more about Fill Mask [here](../tasks/fill_mask).

<InferenceSnippet
pipeline=fill-mask
providersMapping={ {"hf-inference":{"modelId":"google-bert/bert-base-uncased","providerModelId":"google-bert/bert-base-uncased"} } }
/>


### Image Classification

Find out more about Image Classification [here](../tasks/image_classification).

<InferenceSnippet
pipeline=image-classification
providersMapping={ {"hf-inference":{"modelId":"Falconsai/nsfw_image_detection","providerModelId":"Falconsai/nsfw_image_detection"} } }
/>


### Image To Image

Find out more about Image To Image [here](../tasks/image_to_image).

<InferenceSnippet
pipeline=image-to-image
providersMapping={ {"hf-inference":{"modelId":"enhanceaiteam/Flux-Uncensored-V2","providerModelId":"black-forest-labs/FLUX.1-dev"} } }
/>


### Object Detection

Find out more about Object Detection [here](../tasks/object_detection).

<InferenceSnippet
pipeline=object-detection
providersMapping={ {"hf-inference":{"modelId":"facebook/detr-resnet-50","providerModelId":"facebook/detr-resnet-50"} } }
/>
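
A hand-written sketch of an object-detection call (the image path is a placeholder; `HF_TOKEN` is assumed to be set):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Detect objects in a local image; "street.jpg" is a placeholder path.
detections = client.object_detection("street.jpg", model="facebook/detr-resnet-50")
for detection in detections:
    print(detection.label, round(detection.score, 3), detection.box)
```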


### Question Answering

Find out more about Question Answering [here](../tasks/question_answering).

<InferenceSnippet
pipeline=question-answering
providersMapping={ {"hf-inference":{"modelId":"deepset/gelectra-large-germanquad","providerModelId":"deepset/gelectra-large-germanquad"} } }
/>


### Summarization

Find out more about Summarization [here](../tasks/summarization).

<InferenceSnippet
pipeline=summarization
providersMapping={ {"hf-inference":{"modelId":"facebook/bart-large-cnn","providerModelId":"facebook/bart-large-cnn"} } }
/>
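
As a rough Python sketch (not the generated snippet), summarization returns an object exposing the generated summary text; `HF_TOKEN` is assumed to be set:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

article = (
    "Hugging Face Inference Providers give developers a single API to run "
    "models hosted by a growing list of serverless inference partners."
)

summary = client.summarization(article, model="facebook/bart-large-cnn")
print(summary.summary_text)
```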


### Text Classification

Find out more about Text Classification [here](../tasks/text_classification).

<InferenceSnippet
pipeline=text-classification
providersMapping={ {"hf-inference":{"modelId":"ProsusAI/finbert","providerModelId":"ProsusAI/finbert"} } }
/>


### Text Generation

Find out more about Text Generation [here](../tasks/text_generation).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"Qwen/QwQ-32B","providerModelId":"Qwen/QwQ-32B"} } }
/>


### Text To Image

Find out more about Text To Image [here](../tasks/text_to_image).

<InferenceSnippet
pipeline=text-to-image
providersMapping={ {"hf-inference":{"modelId":"black-forest-labs/FLUX.1-dev","providerModelId":"black-forest-labs/FLUX.1-dev"} } }
/>


### Text To Video

Find out more about Text To Video [here](../tasks/text_to_video).

<InferenceSnippet
pipeline=text-to-video
providersMapping={ {"hf-inference":{"modelId":"AdamLucek/Wan2.1-T2V-14B-OldBookIllustrations","providerModelId":"black-forest-labs/FLUX.1-dev"} } }
/>


### Token Classification

Find out more about Token Classification [here](../tasks/token_classification).

<InferenceSnippet
pipeline=token-classification
providersMapping={ {"hf-inference":{"modelId":"dbmdz/bert-large-cased-finetuned-conll03-english","providerModelId":"dbmdz/bert-large-cased-finetuned-conll03-english"} } }
/>


### Translation

Find out more about Translation [here](../tasks/translation).

<InferenceSnippet
pipeline=translation
providersMapping={ {"hf-inference":{"modelId":"facebook/nllb-200-distilled-600M","providerModelId":"facebook/nllb-200-distilled-600M"} } }
/>
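
A hedged sketch of a translation call: NLLB models expect FLORES-200 language codes such as `eng_Latn` and `fra_Latn`, and `HF_TOKEN` is assumed to be set (the `src_lang`/`tgt_lang` keyword arguments are assumed to be available in your `huggingface_hub` version):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Translate English to French with NLLB; language codes follow FLORES-200.
result = client.translation(
    "Inference Providers make it easy to switch between serving backends.",
    model="facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",
    tgt_lang="fra_Latn",
)
print(result.translation_text)
```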


### Zero Shot Classification

Find out more about Zero Shot Classification [here](../tasks/zero_shot_classification).

<InferenceSnippet
pipeline=zero-shot-classification
providersMapping={ {"hf-inference":{"modelId":"facebook/bart-large-mnli","providerModelId":"facebook/bart-large-mnli"} } }
/>
