
Commit 2045733

docs: add Hugging Face provider documentation (#281)
1 parent 569e101 commit 2045733

File tree

5 files changed

+105
-6
lines changed


docs/providers/huggingface.md

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
---
sidebar_label: Hugging Face
description: Connect Roo Code to Hugging Face's inference router for access to open-source LLMs. Choose from multiple inference providers and models like Llama, Mistral, and more.
keywords:
  - hugging face
  - huggingface
  - roo code
  - api provider
  - open source models
  - llama
  - mistral
  - inference router
  - ai models
  - inference providers
image: /img/social-share.jpg
---

# Using Hugging Face With Roo Code

Roo Code integrates with the Hugging Face router to provide access to a curated collection of open-source models optimized for code assistance. You can pin a specific inference provider or let the router automatically select the best available one.

**Website:** [https://huggingface.co/](https://huggingface.co/)

---

## Getting an API Key

1. **Sign Up/Sign In:** Go to [Hugging Face](https://huggingface.co/) and create an account or sign in.
2. **Navigate to Settings:** Click on your profile picture and select "Settings".
3. **Access Tokens:** Go to the "Access Tokens" section in your settings.
4. **Create Token:** Click "New token" and give it a descriptive name (e.g., "Roo Code").
5. **Set Permissions:** Select "Read" permissions (this is sufficient for Roo Code).
6. **Copy Token:** **Important:** Copy the token immediately and store it securely.
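
One way to follow step 6's advice outside of Roo Code is to keep the token in an environment variable instead of hardcoding it. This is a minimal sketch; the variable name `HF_TOKEN` is a common convention, not a requirement of any tool:

```python
import os

def load_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read a Hugging Face access token from the environment.

    Keeping the token out of source files and configs avoids
    accidentally committing it to version control.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token")
    return token
```

Any script that needs the token can then call `load_hf_token()` rather than embedding the secret.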

---

## Supported Models

Roo Code displays models from the 'roocode' collection on Hugging Face, which includes curated open-source models optimized for code assistance. If no model is selected, the default is `meta-llama/Llama-3.3-70B-Instruct`.

Available models are retrieved dynamically from the Hugging Face API, so the exact list may vary based on availability. Both the model and provider dropdowns are searchable, allowing you to quickly find specific options.
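
Under the hood, requests to these models go through Hugging Face's OpenAI-compatible router. The following sketch only builds the request (it does not send it); the endpoint URL reflects Hugging Face's documented router base, but verify it against the current Inference Providers documentation before relying on it:

```python
import json

# Assumed router endpoint (check current Hugging Face docs).
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"

def build_chat_request(api_key: str, prompt: str,
                       model: str = "meta-llama/Llama-3.3-70B-Instruct"):
    """Assemble an OpenAI-style chat-completions request for the router."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return ROUTER_URL, headers, body
```

You would POST `body` with `headers` using any HTTP client; Roo Code performs the equivalent call for you once the provider is configured.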

---

## Configuration in Roo Code

1. **Open Roo Code Settings:** Click the gear icon (<Codicon name="gear" />) in the Roo Code panel.
2. **Select Provider:** Choose "Hugging Face" from the "API Provider" dropdown.
3. **Enter API Key:** Paste your Hugging Face API token into the "Hugging Face API Key" field.
4. **Select Model:** Choose your desired model from the "Model" dropdown. The dropdown shows the model count and is searchable.
5. **Choose Inference Provider (Optional):** Select a specific inference provider from the dropdown, or leave it on "Auto" (default) to automatically select the best available provider.

---

## Inference Provider Selection

Hugging Face's router connects to multiple inference providers. You can either:

- **Auto Mode (Default):** Automatically selects the best available provider based on model availability and performance
- **Manual Selection:** Choose a specific provider from the dropdown

The dropdown displays the status of each provider:

- `live` - Provider is operational and available
- `staging` - Provider is in a testing phase
- `error` - Provider is currently experiencing issues

Provider names are formatted for readability in the UI (e.g., "sambanova" appears as "SambaNova").

When you select a specific provider, the model capabilities (max tokens, pricing) update to reflect that provider's configuration. Pricing information is only displayed when a specific provider is selected, not in Auto mode.
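
Outside the Roo Code UI, Hugging Face's Inference Providers documentation describes pinning a provider in OpenAI-compatible calls by appending `:provider` to the model id; treat the exact syntax as an assumption and confirm it against current docs. A small helper sketch:

```python
from typing import Optional

def pin_provider(model_id: str, provider: Optional[str] = None) -> str:
    """Return a model id pinned to one provider, or unchanged for auto mode.

    Provider ids in the API are lowercase (e.g. "sambanova"), even though
    the UI displays them as "SambaNova".
    """
    if provider is None:
        return model_id  # auto mode: the router picks a provider
    return f"{model_id}:{provider.lower()}"
```

For example, `pin_provider("meta-llama/Llama-3.3-70B-Instruct", "sambanova")` yields `meta-llama/Llama-3.3-70B-Instruct:sambanova`.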

---

## Model Information Display

For each selected model, Roo Code displays:

- **Max Output:** The maximum number of tokens the model can generate (varies by provider)
- **Pricing:** Cost per million input and output tokens (displayed only when a specific provider is selected)
- **Image Support:** Currently, all models are shown as text-only. This is a Roo Code implementation limitation, not a restriction of the Hugging Face API.
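
The per-million-token prices above translate to a per-request cost as follows (the rates in the example are hypothetical, not actual provider pricing):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in dollars, given prices quoted per million tokens."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000

# Hypothetical rates: $0.60/M input tokens, $1.20/M output tokens.
cost = request_cost(2_000, 500, 0.60, 1.20)  # -> 0.0018
```

This is why pricing variability between providers (see Tips and Notes below) can matter for long sessions: the same conversation costs different amounts depending on which provider serves it.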

---

## Available Providers

The list of available providers is dynamic and retrieved from the Hugging Face API. Common providers include:

- **Together AI** - High-performance inference platform
- **Fireworks AI** - Fast and scalable model serving
- **DeepInfra** - Cost-effective GPU infrastructure
- **Hyperbolic** - Optimized inference service
- **Cerebras** - Hardware-accelerated inference

*Note: The providers shown above are examples of commonly available options. The actual list may vary.*

---

## Tips and Notes

- **Provider Failover:** When using Auto mode, if the selected provider fails, Hugging Face's infrastructure will automatically try alternative providers
- **Rate Limits:** Different providers may have different rate limits and availability
- **Pricing Variability:** Costs can vary significantly between providers for the same model
- **Model Updates:** The roocode collection is regularly updated with new and improved models

docs/tips-and-tricks.md

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@ A collection of quick tips to help you get the most out of Roo Code.
 - To manage large files and reduce context/resource usage, adjust the `File read auto-truncate threshold` setting. This setting controls the number of lines read from a file in one batch. Lower values can improve performance when working with very large files, but may require more read operations. You can find this setting in the Roo Code settings under 'Advanced Settings'.
 - Set up a keyboard shortcut for the [`roo.acceptInput` command](/features/keyboard-shortcuts) to accept suggestions or submit text input without using the mouse. Perfect for keyboard-focused workflows and reducing hand strain.
 - Use **Sticky Models** to assign specialized AI models to different modes (reasoning model for planning, non-reasoning model for coding). Roo automatically switches to each mode's last-used model without manual selection.
-- Customize the [context reduction prompt](/features/intelligent-context-condensing#customizing-the-context-reduction-prompt) if you find that for your domain/use case it forgets particular things. You can instruct it to preserve specific types of information that are critical to your workflow.
+- Customize the [context reduction prompt](/features/intelligent-context-condensing#customizing-the-context-condensing-prompt) if you find that for your domain/use case it forgets particular things. You can instruct it to preserve specific types of information that are critical to your workflow.

docs/update-notes/v3.16.6.mdx

Lines changed: 0 additions & 4 deletions
This file was deleted.

docs/update-notes/v3.24.0.mdx

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ We've added support for Hugging Face as a new provider, bringing access to thous
 - **Flexible Integration**: Use models hosted on Hugging Face's infrastructure
 - **Easy Configuration**: Simple setup process to get started with your preferred models and providers
 
-This opens up Roo Code to the entire Hugging Face ecosystem of open source AI models.
+This opens up Roo Code to the entire Hugging Face ecosystem of open source AI models. See our [Hugging Face provider documentation](/providers/huggingface) for setup instructions.
 
 ## Diagnostic Controls

sidebars.ts

Lines changed: 1 addition & 0 deletions
@@ -160,6 +160,7 @@ const sidebars: SidebarsConfig = {
 'providers/gemini',
 'providers/glama',
 'providers/groq',
+'providers/huggingface',
 'providers/human-relay',
 'providers/lmstudio',
 'providers/litellm',
