Skip to content
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion docs/inference-providers/_toctree.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@
title: How to use OpenAI gpt-oss
- local: guides/image-editor
title: Build an Image Editor
- local: guides/vscode
title: VS Code with GitHub Copilot

- local: tasks/index
title: Inference Tasks
Expand Down Expand Up @@ -106,4 +108,4 @@
title: Hub API

- local: register-as-a-provider
title: Register as an Inference Provider
title: Register as an Inference Provider
27 changes: 27 additions & 0 deletions docs/inference-providers/guides/vscode.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# 🤗 Hugging Face Inference Providers for VS Code Copilot

![Demo](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers-guides/demo_vscode.gif)

Use frontier open LLMs like Kimi K2, DeepSeek V3.1, GLM 4.5 and more in VS Code with GitHub Copilot Chat powered by [Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers/index) 🔥

## ⚡ Quick start

1. Install the HF Copilot Chat extension [here](https://marketplace.visualstudio.com/items?itemName=HuggingFace.huggingface-vscode-chat).
2. Open VS Code's chat interface.
3. Click the model picker and click "Manage Models...".
4. Select "Hugging Face" provider.
5. Enter your Hugging Face Token. You can get one from your [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
5. Enter your Hugging Face Token. You can get one from your [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
5. Enter your Hugging Face Token. You can get one from your [settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). You only need to give it the inference.serverless permissions.

(maybe, feel free to disregard)

6. Choose the models you want to add to the model picker. 🥳

## ✨ Why use the Hugging Face provider in Copilot

- Access [SoTA open‑source LLMs](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=cerebras,together,fireworks-ai,nebius,novita,sambanova,groq,hyperbolic,nscale,fal-ai,cohere,replicate,scaleway,black-forest-labs,ovhcloud&sort=trending) with tool calling capabilities.
- Single API to switch between multiple providers like Groq, Cerebras, Together AI, SambaNova, and more.
- Built for high availability (across providers) and low latency.
- Transparent pricing: what the provider charges is what you pay.

💡 The free Hugging Face user tier gives you a small amount of monthly inference credits to experiment. Upgrade to [Hugging Face PRO](https://huggingface.co/pro) or [Team or Enterprise](https://huggingface.co/enterprise) for $2 in monthly credits plus pay‑as‑you‑go access across all providers!

Check out the whole workflow in action in the video below:

<iframe width="560" height="315" src="https://www.youtube.com/embed/rqawpJhPhvM" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>