
Commit 3ad6123

Add release notes for v3.26.7 (#333)
* Add release notes for v3.26.7 - Added Kimi K2-0905 models with 256K context window - Added OpenAI service tiers (Standard/Flex/Priority) - Added DeepInfra provider with 100+ models - Fixed multiple bugs including MCP validation, zsh command safety - Updated combined v3.26 notes with new features * docs: add DeepInfra provider extraction report * docs: add DeepInfra provider documentation * chore: remove temporary extraction notes * docs: add DeepInfra to providers sidebar menu --------- Co-authored-by: Roo Code <[email protected]>
1 parent ffa1ec5 commit 3ad6123

File tree

5 files changed: +187 −0 lines changed

docs/providers/deepinfra.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
---
sidebar_label: DeepInfra
description: Configure DeepInfra's high-performance AI models in Roo Code. Access Qwen Coder, Llama, and other open-source models with prompt caching and vision capabilities.
keywords:
  - deepinfra
  - deep infra
  - roo code
  - api provider
  - qwen coder
  - llama models
  - prompt caching
  - vision models
  - open source ai
image: /img/social-share.jpg
---

# Using DeepInfra With Roo Code

DeepInfra provides cost-effective access to high-performance open-source models, with features such as prompt caching, vision support, and specialized coding models. Its infrastructure offers low latency and automatic load balancing across global edge locations.

**Website:** [https://deepinfra.com/](https://deepinfra.com/)

---

## Getting an API Key

1. **Sign Up/Sign In:** Go to [DeepInfra](https://deepinfra.com/) and create an account or sign in.
2. **Navigate to API Keys:** Open the API keys section of your dashboard.
3. **Create a Key:** Generate a new API key and give it a descriptive name (e.g., "Roo Code").
4. **Copy the Key:** **Important:** Copy the API key immediately and store it securely.
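A quick way to confirm a new key works is to call DeepInfra's OpenAI-compatible endpoint directly. The sketch below only builds the request; the base URL and path (`/v1/openai/chat/completions`) are assumptions based on DeepInfra's OpenAI-compatible interface, so verify them against DeepInfra's own docs before relying on them.

```typescript
// Build a chat-completion request for DeepInfra's OpenAI-compatible API.
// NOTE: base URL and path are assumptions; check DeepInfra's documentation.
const DEEPINFRA_BASE = "https://api.deepinfra.com/v1/openai";

function buildChatRequest(apiKey: string, model: string, prompt: string) {
  return {
    url: `${DEEPINFRA_BASE}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // the key from your dashboard
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const req = buildChatRequest(
  "YOUR_API_KEY",
  "Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo",
  "Hello",
);
// To actually send it: await fetch(req.url, req.init);
console.log(req.url);
```

If the key is valid you should get a normal chat completion back; an invalid key returns an authentication error.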
---

## Supported Models

Roo Code dynamically fetches available models from DeepInfra's API. The default model is:

* `Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo` (256K context, optimized for coding)

Commonly available models include:

* **Coding Models:** the Qwen Coder series, specialized for programming tasks
* **General Models:** Llama 3.1, Mixtral, and other open-source models
* **Vision Models:** models with image understanding capabilities
* **Reasoning Models:** models with advanced reasoning support

Browse the full catalog at [deepinfra.com/models](https://deepinfra.com/models).
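Because the model list is fetched dynamically, a client typically filters the catalog after fetching it, for example to pick a coding-oriented default. The sketch below illustrates that selection step; the `{ id, tags }` metadata shape is a hypothetical illustration, not DeepInfra's actual response schema.

```typescript
// Pick a default model from fetched catalog metadata, preferring models
// tagged for coding. The { id, tags } shape is hypothetical.
interface ModelInfo {
  id: string;
  tags: string[];
}

function pickDefaultModel(models: ModelInfo[], fallback: string): string {
  const coder = models.find((m) => m.tags.includes("coding"));
  return coder ? coder.id : fallback;
}

const catalog: ModelInfo[] = [
  { id: "meta-llama/Llama-3.1-70B-Instruct", tags: ["general"] },
  { id: "Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo", tags: ["coding"] },
];
console.log(pickDefaultModel(catalog, "meta-llama/Llama-3.1-70B-Instruct"));
// → Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo
```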
---

## Configuration in Roo Code

1. **Open Roo Code Settings:** Click the gear icon (<Codicon name="gear" />) in the Roo Code panel.
2. **Select Provider:** Choose "DeepInfra" from the "API Provider" dropdown.
3. **Enter API Key:** Paste your DeepInfra API key into the "DeepInfra API Key" field.
4. **Select Model:** Choose your desired model from the "Model" dropdown.
    - Models auto-populate after you enter a valid API key.
    - Click "Refresh Models" to update the list.

---

## Advanced Features

### Prompt Caching

DeepInfra supports prompt caching for eligible models, which:
- reduces costs for repeated contexts
- improves response times for similar queries
- manages the cache automatically based on task IDs

### Vision Support

Models with vision capabilities can:
- process images alongside text
- understand visual content for coding tasks
- analyze screenshots and diagrams

### Custom Base URL

For enterprise deployments, you can configure a custom base URL in the advanced settings.
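With a custom base URL, only the endpoint root changes; the rest of the request stays the same. A minimal sketch of that resolution, assuming DeepInfra's public OpenAI-compatible URL as the default (the default URL is an assumption):

```typescript
// Resolve the endpoint root: a custom enterprise base URL when configured,
// otherwise DeepInfra's public OpenAI-compatible endpoint (assumed default).
const DEFAULT_BASE_URL = "https://api.deepinfra.com/v1/openai";

function resolveEndpoint(path: string, customBaseUrl?: string): string {
  // Normalize trailing/leading slashes so joining is predictable.
  const base = (customBaseUrl ?? DEFAULT_BASE_URL).replace(/\/+$/, "");
  return `${base}/${path.replace(/^\/+/, "")}`;
}

console.log(resolveEndpoint("chat/completions"));
// → https://api.deepinfra.com/v1/openai/chat/completions
console.log(resolveEndpoint("/models", "https://deepinfra.internal.example.com/v1/openai/"));
```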
---

## Tips and Notes

* **Performance:** DeepInfra offers low latency with automatic load balancing across global locations.
* **Cost Efficiency:** Competitive pricing, with prompt caching to reduce costs for repeated contexts.
* **Model Variety:** Access to the latest open-source models, including specialized coding models.
* **Context Windows:** Models support context windows of up to 256K tokens for large codebases.
* **Pricing:** Pay-per-use with no minimums. Check [deepinfra.com](https://deepinfra.com/) for current pricing.

docs/update-notes/index.md

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ image: /img/social-share.jpg

### Version 3.26

+* [3.26.7](/update-notes/v3.26.7) (2025-09-05)
* [3.26.6](/update-notes/v3.26.6) (2025-09-03)
* [3.26.5](/update-notes/v3.26.5) (2025-09-03)
* [3.26.4](/update-notes/v3.26.4) (2025-09-01)

docs/update-notes/v3.26.7.mdx

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
---
description: Enhanced Kimi K2 models with 256K+ context windows, OpenAI service tiers for flexible pricing, and DeepInfra as a new provider with 100+ models.
keywords:
  - roo code 3.26.7
  - kimi k2 models
  - openai service tiers
  - deepinfra provider
  - bug fixes
image: /img/social-share.jpg
---

# Roo Code 3.26.7 Release Notes (2025-09-05)

This release brings enhanced Kimi K2 models with massive context windows, OpenAI service tier selection, and DeepInfra as a new provider offering 100+ models.

## Kimi K2-0905: Moonshot's Latest Open-Source Model is Live in Roo Code

We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)).

K2-0905 comes with three major upgrades:
- **256K Context Window**: Supports up to 256K-262K tokens, doubling the previous limit for processing much larger documents and conversations
- **Improved Tool Calling**: Enhanced function calling and tool use for better agentic workflows
- **Enhanced Front-end Development**: Stronger HTML, CSS, and JavaScript generation with modern framework support

Available through the Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.

## OpenAI Service Tiers

We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):

- **Standard Tier**: Default tier with regular pricing
- **Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
- **Priority Tier**: Faster response times for time-critical operations

Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.

> **📚 Documentation**: See the [OpenAI Provider Guide](/providers/openai) for a detailed tier comparison and pricing.
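At the API level, the tier is just one extra field on the request. The sketch below shows how a hypothetical payload might carry OpenAI's `service_tier` parameter; the model name is illustrative, and the accepted values should be checked against OpenAI's current API reference.

```typescript
// Attach a service tier to an OpenAI-style request payload.
// "flex" trades latency for a discount; "priority" does the opposite.
type ServiceTier = "default" | "flex" | "priority";

function withServiceTier(
  payload: { model: string; input: string },
  tier: ServiceTier,
) {
  // Standard maps to the API default, so only non-default tiers
  // need to be sent explicitly.
  return tier === "default" ? payload : { ...payload, service_tier: tier };
}

// Model name is illustrative, not a recommendation.
const request = withServiceTier(
  { model: "gpt-5", input: "Summarize this diff" },
  "flex",
);
console.log(JSON.stringify(request));
```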
## DeepInfra Provider

DeepInfra is now available as a model provider (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677)):

- **100+ Models**: Access to a vast selection of open-source and frontier models
- **Competitive Pricing**: Cost-effective rates compared to other providers
- **Automatic Prompt Caching**: Built-in prompt caching for supported models like Qwen3 Coder
- **Fast Inference**: Optimized infrastructure for quick response times

DeepInfra is an excellent choice for developers looking for variety and value in their AI model selection.

> **📚 Documentation**: See [DeepInfra Provider Setup](/providers/deepinfra) to get started.

## QOL Improvements

* **Shell Security**: Added shell-executable allowlist validation with platform-specific fallbacks for safer command execution ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
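The idea behind an executable allowlist is straightforward: only launch shells from a known-good list, and fall back to a safe platform default otherwise. This is an illustrative sketch of that pattern, not Roo Code's actual implementation; the lists and fallbacks here are assumptions.

```typescript
// Resolve a shell executable against a per-platform allowlist, falling
// back to a platform default when the requested shell is not allowed.
// Illustrative only; lists and defaults are assumptions.
const ALLOWED_SHELLS: Record<string, string[]> = {
  win32: ["cmd.exe", "powershell.exe", "pwsh.exe"],
  darwin: ["/bin/zsh", "/bin/bash", "/bin/sh"],
  linux: ["/bin/bash", "/bin/sh", "/usr/bin/zsh"],
};

const FALLBACKS: Record<string, string> = {
  win32: "cmd.exe",
  darwin: "/bin/zsh",
  linux: "/bin/bash",
};

function resolveShell(requested: string, platform: string): string {
  const allowed = ALLOWED_SHELLS[platform] ?? [];
  return allowed.includes(requested)
    ? requested
    : FALLBACKS[platform] ?? "/bin/sh";
}

console.log(resolveShell("/bin/bash", "linux"));    // allowed as-is
console.log(resolveShell("/tmp/evil-sh", "linux")); // falls back to /bin/bash
```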
## Bug Fixes

* **MCP Tool Validation**: Roo now validates that an MCP tool exists before execution and shows helpful error messages listing the available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
* **OpenAI API Key Errors**: Clear error messages now display when API keys contain invalid characters, instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
* **Follow-up Questions**: Fixed the countdown timer incorrectly reappearing in task history for already-answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
* **Moonshot Token Limit**: Resolved an issue where Moonshot models were incorrectly limited to 1024 tokens; configured limits are now respected (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
* **Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
* **Traditional Chinese Localization**: Fixed a typo in the zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
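For the zsh fixes above, the risky constructs are process substitution (`<(...)`, `>(...)`) and glob qualifiers (`*(...)`), both of which can execute code during expansion. A simplified detector along those lines (the real checks must handle quoting and nesting, so treat this as a sketch):

```typescript
// Flag zsh commands containing constructs that may execute code during
// expansion, so they require manual approval instead of auto-running.
// Simplified illustration; real parsing must handle quoting and nesting.
function hasRiskyZshConstruct(command: string): boolean {
  const processSubstitution = /[<>]\(/; // <(cmd) or >(cmd)
  const globQualifier = /\*\([^)]*\)/;  // *(e:...:)-style qualifiers
  return processSubstitution.test(command) || globQualifier.test(command);
}

console.log(hasRiskyZshConstruct("diff <(ls a) <(ls b)")); // true
console.log(hasRiskyZshConstruct("ls -la"));               // false
```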

docs/update-notes/v3.26.mdx

Lines changed: 31 additions & 0 deletions
@@ -94,8 +94,32 @@ PRs: [#7474](https://github.com/RooCodeInc/Roo-Code/pull/7474), [#7492](https://

> **📚 Documentation**: See [Image Generation - Editing Existing Images](/features/image-generation#editing-existing-images) for transformation examples.

+### Kimi K2-0905: Moonshot's Latest Open-Source Model is Live in Roo Code
+
+We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)).
+
+K2-0905 comes with three major upgrades:
+- **256K Context Window**: Supports up to 256K-262K tokens, doubling the previous limit for processing much larger documents and conversations
+- **Improved Tool Calling**: Enhanced function calling and tool use for better agentic workflows
+- **Enhanced Front-end Development**: Stronger HTML, CSS, and JavaScript generation with modern framework support
+
+Available through the Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.
+
+### OpenAI Service Tiers
+
+We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):
+
+- **Standard Tier**: Default tier with regular pricing
+- **Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
+- **Priority Tier**: Faster response times for time-critical operations
+
+Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.
+
+> **📚 Documentation**: See the [OpenAI Provider Guide](/providers/openai) for a detailed tier comparison and pricing.
+
### Provider Updates

+* **DeepInfra Provider**: DeepInfra is now available as a model provider with 100+ open-source and frontier models, competitive pricing, and automatic prompt caching for supported models like Qwen3 Coder (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677))
* **Kimi K2 Turbo Model**: Added support for the high-speed Kimi K2 Turbo model with 60-100 tokens/sec processing and a 131K-token context window (thanks wangxiaolong100!) ([#7593](https://github.com/RooCodeInc/Roo-Code/pull/7593))
* **Qwen3 235B Thinking Model**: Added support for the Qwen3-235B-A22B-Thinking-2507 model with an impressive 262K context window, enabling processing of extremely long documents and large codebases in a single request through the Chutes provider (thanks mohammad154, apple-techie!) ([#7578](https://github.com/RooCodeInc/Roo-Code/pull/7578))
* **Ollama Turbo Mode**: Added API key support for Turbo mode, enabling faster model execution with datacenter-grade hardware (thanks LivioGama!) ([#7425](https://github.com/RooCodeInc/Roo-Code/pull/7425))

@@ -104,6 +128,7 @@ PRs: [#7474](https://github.com/RooCodeInc/Roo-Code/pull/7474), [#7492](https://

### QOL Improvements

+* **Shell Security**: Added shell-executable allowlist validation with platform-specific fallbacks for safer command execution ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
* **Settings Scroll Position**: Settings tabs now remember their individual scroll positions when switching between them (thanks DC-Dancao!) ([#7587](https://github.com/RooCodeInc/Roo-Code/pull/7587))
* **MCP Resource Auto-Approval**: MCP resource access requests are now automatically approved when auto-approve is enabled, eliminating manual approval steps and enabling smoother automation workflows (thanks m-ibm!) ([#7606](https://github.com/RooCodeInc/Roo-Code/pull/7606))
* **Message Queue Performance**: Improved message-queueing reliability and performance by moving queue management to the extension host, making the interface more stable ([#7604](https://github.com/RooCodeInc/Roo-Code/pull/7604))

@@ -122,6 +147,12 @@ PRs: [#7474](https://github.com/RooCodeInc/Roo-Code/pull/7474), [#7492](https://

### Bug Fixes

+* **MCP Tool Validation**: Roo now validates that an MCP tool exists before execution and shows helpful error messages listing the available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
+* **OpenAI API Key Errors**: Clear error messages now display when API keys contain invalid characters, instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
+* **Follow-up Questions**: Fixed the countdown timer incorrectly reappearing in task history for already-answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
+* **Moonshot Token Limit**: Resolved an issue where Moonshot models were incorrectly limited to 1024 tokens; configured limits are now respected (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
+* **Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
+* **Traditional Chinese Localization**: Fixed a typo in the zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
* **Tool Approval Fix**: Fixed an error that occurred when using the insert_content and search_and_replace tools on write-protected files; these tools now handle file protection correctly ([#7649](https://github.com/RooCodeInc/Roo-Code/pull/7649))
* **Configurable Embedding Batch Size**: Fixed an issue where users whose API providers have stricter batch limits couldn't use code indexing. You can now configure the embedding batch size (1-2048, default: 400) to match your provider's limits (thanks BenLampson!) ([#7464](https://github.com/RooCodeInc/Roo-Code/pull/7464))
* **OpenAI-Native Cache Reporting**: Fixed cache usage statistics and cost calculations when using the OpenAI-Native provider with cached content ([#7602](https://github.com/RooCodeInc/Roo-Code/pull/7602))

sidebars.ts

Lines changed: 2 additions & 0 deletions
@@ -164,6 +164,7 @@ const sidebars: SidebarsConfig = {
        'providers/claude-code',
        'providers/bedrock',
        'providers/cerebras',
+       'providers/deepinfra',
        'providers/deepseek',
        'providers/doubao',
        'providers/featherless',
@@ -221,6 +222,7 @@ const sidebars: SidebarsConfig = {
          label: '3.26',
          items: [
            { type: 'doc', id: 'update-notes/v3.26', label: '3.26 Combined' },
+           { type: 'doc', id: 'update-notes/v3.26.7', label: '3.26.7' },
            { type: 'doc', id: 'update-notes/v3.26.6', label: '3.26.6' },
            { type: 'doc', id: 'update-notes/v3.26.5', label: '3.26.5' },
            { type: 'doc', id: 'update-notes/v3.26.4', label: '3.26.4' },
