2 changes: 2 additions & 0 deletions docs/inference-providers/_toctree.yml
@@ -23,6 +23,8 @@
title: Featherless AI
- local: providers/fireworks-ai
title: Fireworks
- local: providers/groq
title: Groq
- local: providers/hyperbolic
title: Hyperbolic
- local: providers/hf-inference
1 change: 1 addition & 0 deletions docs/inference-providers/index.md
@@ -20,6 +20,7 @@ Here is the complete list of partners integrated with Inference Providers, and t
| [Fal AI](./providers/fal-ai) | | | | ✅ | ✅ |
| [Featherless AI](./providers/featherless-ai) | ✅ | | | | |
| [Fireworks](./providers/fireworks-ai) | ✅ | ✅ | | | |
| [Groq](./providers/groq) | ✅ | | | | |
| [HF Inference](./providers/hf-inference) | ✅ | ✅ | ✅ | ✅ | |
| [Hyperbolic](./providers/hyperbolic) | ✅ | ✅ | | | |
| [Nebius](./providers/nebius) | ✅ | ✅ | ✅ | ✅ | |
2 changes: 1 addition & 1 deletion docs/inference-providers/providers/cohere.md
@@ -56,6 +56,6 @@ Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"cohere":{"modelId":"CohereLabs/aya-vision-32b","providerModelId":"c4ai-aya-vision-32b"} } }
providersMapping={ {"cohere":{"modelId":"CohereLabs/aya-vision-8b","providerModelId":"c4ai-aya-vision-8b"} } }
conversational />
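
A minimal Python sketch of what this snippet corresponds to, using `huggingface_hub`'s `InferenceClient`; it assumes a recent client version that accepts the `provider` argument, an `HF_TOKEN` environment variable with Inference Providers access, and a placeholder image URL:

```python
import os

from huggingface_hub import InferenceClient

# Route the request through Cohere; the Hub id maps to c4ai-aya-vision-8b on Cohere's side.
client = InferenceClient(provider="cohere", api_key=os.environ["HF_TOKEN"])

completion = client.chat.completions.create(
    model="CohereLabs/aya-vision-8b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
)
print(completion.choices[0].message.content)
```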

69 changes: 69 additions & 0 deletions docs/inference-providers/providers/groq.md
@@ -0,0 +1,69 @@
<!---
WARNING

This markdown file has been generated from a script. Please do not edit it directly.

### Template

If you want to update the content related to groq's description, please edit the template file under `https://github.com/huggingface/hub-docs/tree/main/scripts/inference-providers/templates/providers/groq.handlebars`.

### Logos

If you want to update groq's logo, upload a file by opening a PR on https://huggingface.co/datasets/huggingface/documentation-images/tree/main/inference-providers/logos. Ping @wauplin and @celinah on the PR to let them know you uploaded a new logo.
Logos must be in .png format and be named `groq-light.png` and `groq-dark.png`. Visit https://huggingface.co/settings/theme to switch between light and dark mode and check that the logos are displayed correctly.

### Generation script

For more details, check out the `generate.ts` script: https://github.com/huggingface/hub-docs/blob/main/scripts/inference-providers/scripts/generate.ts.
--->

# Groq

<div class="flex justify-center">
<a href="https://groq.com/" target="_blank">
Contributor comment: I've uploaded the logo you provided (groq-dark, groq-light).
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/logos/groq-light.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/logos/groq-dark.png"/>
</a>
</div>

<div class="flex">
<a href="https://huggingface.co/groq" target="_blank">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg.svg"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-us-on-hf-lg-dark.svg"/>
</a>
</div>

Groq provides fast AI inference. Their groundbreaking LPU technology delivers record-setting performance and efficiency for GenAI models. With custom chips specifically designed for AI inference workloads and a deterministic, software-first approach, Groq eliminates the bottlenecks of conventional hardware to enable real-time AI applications with predictable latency and exceptional throughput, so developers can build fast.

For the latest pricing, visit Groq's [pricing page](https://groq.com/pricing/).

## Resources
- **Website**: https://groq.com/
- **Documentation**: https://console.groq.com/docs
- **Community Forum**: https://community.groq.com/
- **X**: [@GroqInc](https://x.com/GroqInc)
- **LinkedIn**: [Groq](https://www.linkedin.com/company/groq/)
- **YouTube**: [Groq](https://www.youtube.com/@GroqInc)

## Supported tasks
Contributor comment: note that this section will be automatically populated once Groq is live on hf.co.



### Chat Completion (LLM)

Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"groq":{"modelId":"Qwen/Qwen3-32B","providerModelId":"qwen/qwen3-32b"} } }
conversational />
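
As a rough sketch of how this renders in Python with `huggingface_hub`'s `InferenceClient`: the `provider="groq"` id is what this PR registers, so it assumes a client release that recognizes it, plus an `HF_TOKEN` environment variable; the prompt is a placeholder.

```python
import os

from huggingface_hub import InferenceClient

# Select Groq as the provider; the Hub id Qwen/Qwen3-32B maps to qwen/qwen3-32b on Groq's side.
client = InferenceClient(provider="groq", api_key=os.environ["HF_TOKEN"])

completion = client.chat.completions.create(
    model="Qwen/Qwen3-32B",
    messages=[{"role": "user", "content": "Explain LPU inference in two sentences."}],
    max_tokens=256,
)
print(completion.choices[0].message.content)
```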


### Chat Completion (VLM)

Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"groq":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/llama-4-scout-17b-16e-instruct"} } }
conversational />
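
Analogously, a hedged sketch for the vision-language case, with a placeholder image URL and the same assumptions about client version and token:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="groq", api_key=os.environ["HF_TOKEN"])

completion = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(completion.choices[0].message.content)
```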

81 changes: 32 additions & 49 deletions docs/inference-providers/providers/hf-inference.md
@@ -38,163 +38,146 @@ If you are interested in deploying models to a dedicated and autoscaling infrast

## Supported tasks


### Automatic Speech Recognition

Find out more about Automatic Speech Recognition [here](../tasks/automatic_speech_recognition).

<InferenceSnippet
pipeline=automatic-speech-recognition
providersMapping={ {"hf-inference":{"modelId":"openai/whisper-large-v3","providerModelId":"openai/whisper-large-v3"} } }
pipeline=automatic-speech-recognition
providersMapping={ {"hf-inference":{"modelId":"openai/whisper-large-v3","providerModelId":"openai/whisper-large-v3"} } }
/>
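
A minimal sketch of the equivalent `InferenceClient` call, assuming an `HF_TOKEN` environment variable and a local audio file path that is purely a placeholder:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Transcribe a local audio file; the path is a placeholder.
result = client.automatic_speech_recognition("sample.flac", model="openai/whisper-large-v3")
print(result.text)
```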


### Chat Completion (LLM)

Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"} } }
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"} } }
conversational />


### Chat Completion (VLM)

Find out more about Chat Completion (VLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"hf-inference":{"modelId":"meta-llama/Llama-3.2-11B-Vision-Instruct","providerModelId":"meta-llama/Llama-3.2-11B-Vision-Instruct"} } }
pipeline=image-text-to-text
providersMapping={ {"hf-inference":{"modelId":"meta-llama/Llama-3.2-11B-Vision-Instruct","providerModelId":"meta-llama/Llama-3.2-11B-Vision-Instruct"} } }
conversational />


### Feature Extraction

Find out more about Feature Extraction [here](../tasks/feature_extraction).

<InferenceSnippet
pipeline=feature-extraction
providersMapping={ {"hf-inference":{"modelId":"intfloat/multilingual-e5-large-instruct","providerModelId":"intfloat/multilingual-e5-large-instruct"} } }
pipeline=feature-extraction
providersMapping={ {"hf-inference":{"modelId":"intfloat/multilingual-e5-large-instruct","providerModelId":"intfloat/multilingual-e5-large-instruct"} } }
/>
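
A minimal sketch for the embedding case, under the same assumptions (recent `huggingface_hub`, `HF_TOKEN` set); the input sentence is a placeholder:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Returns the embedding for the input text as a numpy array.
embeddings = client.feature_extraction(
    "Inference Providers route requests to the selected provider.",
    model="intfloat/multilingual-e5-large-instruct",
)
print(embeddings.shape)
```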


### Fill Mask

Find out more about Fill Mask [here](../tasks/fill_mask).

<InferenceSnippet
pipeline=fill-mask
providersMapping={ {"hf-inference":{"modelId":"google-bert/bert-base-uncased","providerModelId":"google-bert/bert-base-uncased"} } }
pipeline=fill-mask
providersMapping={ {"hf-inference":{"modelId":"google-bert/bert-base-uncased","providerModelId":"google-bert/bert-base-uncased"} } }
/>


### Image Classification

Find out more about Image Classification [here](../tasks/image_classification).

<InferenceSnippet
pipeline=image-classification
providersMapping={ {"hf-inference":{"modelId":"Falconsai/nsfw_image_detection","providerModelId":"Falconsai/nsfw_image_detection"} } }
pipeline=image-classification
providersMapping={ {"hf-inference":{"modelId":"Falconsai/nsfw_image_detection","providerModelId":"Falconsai/nsfw_image_detection"} } }
/>


### Image Segmentation

Find out more about Image Segmentation [here](../tasks/image_segmentation).

<InferenceSnippet
pipeline=image-segmentation
providersMapping={ {"hf-inference":{"modelId":"mattmdjaga/segformer_b2_clothes","providerModelId":"mattmdjaga/segformer_b2_clothes"} } }
pipeline=image-segmentation
providersMapping={ {"hf-inference":{"modelId":"mattmdjaga/segformer_b2_clothes","providerModelId":"mattmdjaga/segformer_b2_clothes"} } }
/>


### Object Detection

Find out more about Object Detection [here](../tasks/object_detection).

<InferenceSnippet
pipeline=object-detection
providersMapping={ {"hf-inference":{"modelId":"facebook/detr-resnet-50","providerModelId":"facebook/detr-resnet-50"} } }
pipeline=object-detection
providersMapping={ {"hf-inference":{"modelId":"facebook/detr-resnet-50","providerModelId":"facebook/detr-resnet-50"} } }
/>


### Question Answering

Find out more about Question Answering [here](../tasks/question_answering).

<InferenceSnippet
pipeline=question-answering
providersMapping={ {"hf-inference":{"modelId":"deepset/roberta-base-squad2","providerModelId":"deepset/roberta-base-squad2"} } }
pipeline=question-answering
providersMapping={ {"hf-inference":{"modelId":"deepset/roberta-base-squad2","providerModelId":"deepset/roberta-base-squad2"} } }
/>


### Summarization

Find out more about Summarization [here](../tasks/summarization).

<InferenceSnippet
pipeline=summarization
providersMapping={ {"hf-inference":{"modelId":"facebook/bart-large-cnn","providerModelId":"facebook/bart-large-cnn"} } }
pipeline=summarization
providersMapping={ {"hf-inference":{"modelId":"facebook/bart-large-cnn","providerModelId":"facebook/bart-large-cnn"} } }
/>


### Table Question Answering

Find out more about Table Question Answering [here](../tasks/table_question_answering).

<InferenceSnippet
pipeline=table-question-answering
providersMapping={ {"hf-inference":{"modelId":"google/tapas-base-finetuned-wtq","providerModelId":"google/tapas-base-finetuned-wtq"} } }
pipeline=table-question-answering
providersMapping={ {"hf-inference":{"modelId":"google/tapas-base-finetuned-wtq","providerModelId":"google/tapas-base-finetuned-wtq"} } }
/>


### Text Classification

Find out more about Text Classification [here](../tasks/text_classification).

<InferenceSnippet
pipeline=text-classification
providersMapping={ {"hf-inference":{"modelId":"tabularisai/multilingual-sentiment-analysis","providerModelId":"tabularisai/multilingual-sentiment-analysis"} } }
pipeline=text-classification
providersMapping={ {"hf-inference":{"modelId":"tabularisai/multilingual-sentiment-analysis","providerModelId":"tabularisai/multilingual-sentiment-analysis"} } }
/>


### Text Generation

Find out more about Text Generation [here](../tasks/text_generation).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"} } }
pipeline=text-generation
providersMapping={ {"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"} } }
/>


### Text To Image

Find out more about Text To Image [here](../tasks/text_to_image).

<InferenceSnippet
pipeline=text-to-image
providersMapping={ {"hf-inference":{"modelId":"black-forest-labs/FLUX.1-dev","providerModelId":"black-forest-labs/FLUX.1-dev"} } }
pipeline=text-to-image
providersMapping={ {"hf-inference":{"modelId":"black-forest-labs/FLUX.1-dev","providerModelId":"black-forest-labs/FLUX.1-dev"} } }
/>
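
A minimal sketch of the image-generation call, assuming `HF_TOKEN` is set and using a placeholder prompt and output filename:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="hf-inference", api_key=os.environ["HF_TOKEN"])

# Returns a PIL.Image.Image object.
image = client.text_to_image(
    "An astronaut riding a horse on the moon",
    model="black-forest-labs/FLUX.1-dev",
)
image.save("astronaut.png")
```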


### Token Classification

Find out more about Token Classification [here](../tasks/token_classification).

<InferenceSnippet
pipeline=token-classification
providersMapping={ {"hf-inference":{"modelId":"dslim/bert-base-NER","providerModelId":"dslim/bert-base-NER"} } }
pipeline=token-classification
providersMapping={ {"hf-inference":{"modelId":"dslim/bert-base-NER","providerModelId":"dslim/bert-base-NER"} } }
/>


### Translation

Find out more about Translation [here](../tasks/translation).

<InferenceSnippet
pipeline=translation
providersMapping={ {"hf-inference":{"modelId":"google-t5/t5-base","providerModelId":"google-t5/t5-base"} } }
pipeline=translation
providersMapping={ {"hf-inference":{"modelId":"google-t5/t5-base","providerModelId":"google-t5/t5-base"} } }
/>

2 changes: 1 addition & 1 deletion docs/inference-providers/providers/nscale.md
@@ -46,7 +46,7 @@ Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"nscale":{"modelId":"Qwen/Qwen3-235B-A22B","providerModelId":"Qwen/Qwen3-235B-A22B"} } }
providersMapping={ {"nscale":{"modelId":"meta-llama/Llama-3.1-8B-Instruct","providerModelId":"meta-llama/Llama-3.1-8B-Instruct"} } }
conversational />
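
A hedged Python sketch of the updated mapping, assuming the `nscale` provider is enabled for your account, a recent `huggingface_hub`, and an `HF_TOKEN` environment variable; the prompt is a placeholder:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="nscale", api_key=os.environ["HF_TOKEN"])

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Give me three facts about the Llama 3.1 family."}],
)
print(completion.choices[0].message.content)
```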


2 changes: 1 addition & 1 deletion docs/inference-providers/providers/replicate.md
@@ -54,6 +54,6 @@ Find out more about Text To Video [here](../tasks/text_to_video).

<InferenceSnippet
pipeline=text-to-video
providersMapping={ {"replicate":{"modelId":"Wan-AI/Wan2.1-T2V-14B","providerModelId":"wavespeedai/wan-2.1-t2v-480p"} } }
providersMapping={ {"replicate":{"modelId":"Lightricks/LTX-Video","providerModelId":"lightricks/ltx-video:8c47da666861d081eeb4d1261853087de23923a268a69b63febdf5dc1dee08e4"} } }
/>
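
A rough sketch of the equivalent call, assuming your installed `huggingface_hub` version exposes `text_to_video` (returning raw video bytes) and that `HF_TOKEN` is set; the prompt and output filename are placeholders:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"])

# Returns raw video bytes that can be written to disk.
video = client.text_to_video(
    "A young man walking on the street",
    model="Lightricks/LTX-Video",
)
with open("ltx_video.mp4", "wb") as f:
    f.write(video)
```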

4 changes: 2 additions & 2 deletions docs/inference-providers/providers/together.md
@@ -44,7 +44,7 @@ Find out more about Chat Completion (LLM) [here](../tasks/chat-completion).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"together":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1"} } }
providersMapping={ {"together":{"modelId":"deepseek-ai/DeepSeek-R1","providerModelId":"deepseek-ai/DeepSeek-R1"} } }
conversational />


@@ -64,7 +64,7 @@ Find out more about Text Generation [here](../tasks/text_generation).

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"together":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1"} } }
providersMapping={ {"together":{"modelId":"deepseek-ai/DeepSeek-R1","providerModelId":"deepseek-ai/DeepSeek-R1"} } }
/>
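
A minimal sketch of the raw text-generation route (no chat template applied), assuming the `together` provider is enabled for your account and `HF_TOKEN` is set; the prompt is a placeholder:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(provider="together", api_key=os.environ["HF_TOKEN"])

# Raw text-generation endpoint, as opposed to the chat-completion route above.
output = client.text_generation(
    "Explain the difference between supervised and unsupervised learning.",
    model="deepseek-ai/DeepSeek-R1",
    max_new_tokens=200,
)
print(output)
```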


5 changes: 3 additions & 2 deletions docs/inference-providers/tasks/chat-completion.md
@@ -25,6 +25,7 @@ This is a subtask of [`text-generation`](https://huggingface.co/docs/inference-p
- [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B): Smaller variant of one of the most powerful models.
- [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct): Very powerful text generation model trained to follow instructions.
- [microsoft/phi-4](https://huggingface.co/microsoft/phi-4): Powerful text generation model by Microsoft.
- [Qwen/Qwen2.5-7B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-1M): Strong conversational model that supports very long instructions.
- [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct): Text generation model used to write code.
- [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1): Powerful reasoning-based open large language model.

@@ -60,7 +61,7 @@ The API supports:

<InferenceSnippet
pipeline=text-generation
providersMapping={ {"cerebras":{"modelId":"Qwen/Qwen3-32B","providerModelId":"qwen-3-32b"},"cohere":{"modelId":"CohereLabs/c4ai-command-r-plus","providerModelId":"command-r-plus-04-2024"},"featherless-ai":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"fireworks-ai":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"accounts/fireworks/models/deepseek-r1-0528"},"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"},"hyperbolic":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"nebius":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"novita":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek/deepseek-r1-0528"},"nscale":{"modelId":"Qwen/Qwen3-235B-A22B","providerModelId":"Qwen/Qwen3-235B-A22B"},"sambanova":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"DeepSeek-R1-0528"},"together":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1"}} }
providersMapping={ {"cerebras":{"modelId":"Qwen/Qwen3-32B","providerModelId":"qwen-3-32b"},"cohere":{"modelId":"CohereLabs/c4ai-command-r-plus","providerModelId":"command-r-plus-04-2024"},"featherless-ai":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"fireworks-ai":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"accounts/fireworks/models/deepseek-r1-0528"},"groq":{"modelId":"Qwen/Qwen3-32B","providerModelId":"qwen/qwen3-32b"},"hf-inference":{"modelId":"sarvamai/sarvam-m","providerModelId":"sarvamai/sarvam-m"},"hyperbolic":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"nebius":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek-ai/DeepSeek-R1-0528"},"novita":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"deepseek/deepseek-r1-0528"},"nscale":{"modelId":"meta-llama/Llama-3.1-8B-Instruct","providerModelId":"meta-llama/Llama-3.1-8B-Instruct"},"sambanova":{"modelId":"deepseek-ai/DeepSeek-R1-0528","providerModelId":"DeepSeek-R1-0528"},"together":{"modelId":"deepseek-ai/DeepSeek-R1","providerModelId":"deepseek-ai/DeepSeek-R1"}} }
conversational />
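
As a rough sketch of how the same request can be routed to different entries in this mapping: only the `provider` argument changes, the request and response shapes stay identical. This assumes a recent `huggingface_hub`, an `HF_TOKEN` with Inference Providers access, that each listed provider is enabled for your account, and a placeholder prompt:

```python
import os

from huggingface_hub import InferenceClient

# Each provider serves the same Hub model id under its own backend name.
for provider in ("nebius", "novita", "hyperbolic"):
    client = InferenceClient(provider=provider, api_key=os.environ["HF_TOKEN"])
    completion = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1-0528",
        messages=[{"role": "user", "content": "Say hello in one word."}],
        max_tokens=16,
    )
    print(provider, "->", completion.choices[0].message.content)
```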


@@ -70,7 +71,7 @@ conversational />

<InferenceSnippet
pipeline=image-text-to-text
providersMapping={ {"cerebras":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"llama-4-scout-17b-16e-instruct"},"cohere":{"modelId":"CohereLabs/aya-vision-32b","providerModelId":"c4ai-aya-vision-32b"},"featherless-ai":{"modelId":"allura-org/Gemma-3-Glitter-27B","providerModelId":"allura-org/Gemma-3-Glitter-27B"},"fireworks-ai":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"accounts/fireworks/models/llama4-scout-instruct-basic"},"hf-inference":{"modelId":"meta-llama/Llama-3.2-11B-Vision-Instruct","providerModelId":"meta-llama/Llama-3.2-11B-Vision-Instruct"},"hyperbolic":{"modelId":"Qwen/Qwen2.5-VL-7B-Instruct","providerModelId":"Qwen/Qwen2.5-VL-7B-Instruct"},"nebius":{"modelId":"google/gemma-3-27b-it","providerModelId":"google/gemma-3-27b-it-fast"},"novita":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/llama-4-scout-17b-16e-instruct"},"nscale":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct"},"sambanova":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"Llama-4-Scout-17B-16E-Instruct"},"together":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct"}} }
providersMapping={ {"cerebras":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"llama-4-scout-17b-16e-instruct"},"cohere":{"modelId":"CohereLabs/aya-vision-8b","providerModelId":"c4ai-aya-vision-8b"},"featherless-ai":{"modelId":"google/gemma-3-27b-it","providerModelId":"google/gemma-3-27b-it"},"fireworks-ai":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"accounts/fireworks/models/llama4-scout-instruct-basic"},"groq":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/llama-4-scout-17b-16e-instruct"},"hf-inference":{"modelId":"meta-llama/Llama-3.2-11B-Vision-Instruct","providerModelId":"meta-llama/Llama-3.2-11B-Vision-Instruct"},"hyperbolic":{"modelId":"Qwen/Qwen2.5-VL-7B-Instruct","providerModelId":"Qwen/Qwen2.5-VL-7B-Instruct"},"nebius":{"modelId":"google/gemma-3-27b-it","providerModelId":"google/gemma-3-27b-it-fast"},"novita":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/llama-4-scout-17b-16e-instruct"},"nscale":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct"},"sambanova":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"Llama-4-Scout-17B-16E-Instruct"},"together":{"modelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct","providerModelId":"meta-llama/Llama-4-Scout-17B-16E-Instruct"}} }
conversational />

