You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
description: Configure DeepInfra's high-performance AI models in Roo Code. Access Qwen Coder, Llama, and other open-source models with prompt caching and vision capabilities.
4
+
keywords:
5
+
- deepinfra
6
+
- deep infra
7
+
- roo code
8
+
- api provider
9
+
- qwen coder
10
+
- llama models
11
+
- prompt caching
12
+
- vision models
13
+
- open source ai
14
+
image: /img/social-share.jpg
15
+
---
16
+
17
+
# Using DeepInfra With Roo Code
18
+
19
+
DeepInfra provides cost-effective access to high-performance open-source models with features like prompt caching, vision support, and specialized coding models. Their infrastructure offers low latency and automatic load balancing across global edge locations.
description: Enhanced Kimi K2 models with 256K+ context windows, OpenAI service tiers for flexible pricing, and DeepInfra as a new provider with 100+ models.
3
+
keywords:
4
+
- roo code 3.26.7
5
+
- kimi k2 models
6
+
- openai service tiers
7
+
- deepinfra provider
8
+
- bug fixes
9
+
image: /img/social-share.jpg
10
+
---
11
+
12
+
# Roo Code 3.26.7 Release Notes (2025-09-05)
13
+
14
+
This release brings enhanced Kimi K2 models with massive context windows, OpenAI service tier selection, and DeepInfra as a new provider offering 100+ models.
15
+
16
+
## Kimi K2-0905: Moonshot's Latest Open Source Model is Live in Roo Code
17
+
18
+
We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)):
19
+
20
+
K2-0905 comes with three major upgrades:
21
+
-**256K Context Window**: Massive context supporting up to 256K-262K tokens, doubling the previous limit for processing much larger documents and conversations
22
+
-**Improved Tool Calling**: Enhanced function calling and tool use capabilities for better agentic workflows
23
+
-**Enhanced Front-end Development**: Superior HTML, CSS, and JavaScript generation with modern framework support
24
+
25
+
Available through Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.
26
+
27
+
## OpenAI Service Tiers
28
+
29
+
We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):
30
+
31
+
-**Standard Tier**: Default tier with regular pricing
32
+
-**Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
33
+
-**Priority Tier**: Faster response times for time-critical operations
34
+
35
+
Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.
36
+
37
+
> **📚 Documentation**: See [OpenAI Provider Guide](/providers/openai) for detailed tier comparison and pricing.
38
+
39
+
## DeepInfra Provider
40
+
41
+
DeepInfra is now available as a model provider (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677)):
42
+
43
+
-**100+ Models**: Access to a vast selection of open-source and frontier models
44
+
-**Competitive Pricing**: Very cost-effective rates compared to other providers
45
+
-**Automatic Prompt Caching**: Built-in prompt caching for supported models like Qwen3 Coder
46
+
-**Fast Inference**: Optimized infrastructure for quick response times
47
+
48
+
DeepInfra is an excellent choice for developers looking for variety and value in their AI model selection.
49
+
50
+
> **📚 Documentation**: See [DeepInfra Provider Setup](/providers/deepinfra) to get started.
51
+
52
+
## QOL Improvements
53
+
54
+
***Shell Security**: Added shell executable allowlist validation with platform-specific fallbacks for improved command execution safety ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
55
+
56
+
## Bug Fixes
57
+
58
+
***MCP Tool Validation**: Roo now validates MCP tool existence before execution and shows helpful error messages with available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
59
+
***OpenAI API Key Errors**: Clear error messages now display when API keys contain invalid characters instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
60
+
***Follow-up Questions**: Fixed countdown timer incorrectly reappearing in task history for already answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
61
+
***Moonshot Token Limit**: Resolved issue where Moonshot models were incorrectly limited to 1024 tokens, now properly respects configured limits (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
62
+
***Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
63
+
***Traditional Chinese Localization**: Fixed typo in zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
> **📚 Documentation**: See [Image Generation - Editing Existing Images](/features/image-generation#editing-existing-images) for transformation examples.
96
96
97
+
### Kimi K2-0905: Moonshot's Latest Open Source Model is Live in Roo Code
98
+
99
+
We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)):
100
+
101
+
K2-0905 comes with three major upgrades:
102
+
-**256K Context Window**: Massive context supporting up to 256K-262K tokens, doubling the previous limit for processing much larger documents and conversations
103
+
-**Improved Tool Calling**: Enhanced function calling and tool use capabilities for better agentic workflows
104
+
-**Enhanced Front-end Development**: Superior HTML, CSS, and JavaScript generation with modern framework support
105
+
106
+
Available through Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.
107
+
108
+
### OpenAI Service Tiers
109
+
110
+
We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):
111
+
112
+
-**Standard Tier**: Default tier with regular pricing
113
+
-**Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
114
+
-**Priority Tier**: Faster response times for time-critical operations
115
+
116
+
Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.
117
+
118
+
> **📚 Documentation**: See [OpenAI Provider Guide](/providers/openai) for detailed tier comparison and pricing.
119
+
97
120
### Provider Updates
98
121
122
+
***DeepInfra Provider**: DeepInfra is now available as a model provider with 100+ open-source and frontier models, competitive pricing, and automatic prompt caching for supported models like Qwen3 Coder (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677))
99
123
***Kimi K2 Turbo Model**: Added support for the high-speed Kimi K2 Turbo model with 60-100 tokens/sec processing and a 131K token context window (thanks wangxiaolong100!) ([#7593](https://github.com/RooCodeInc/Roo-Code/pull/7593))
100
124
***Qwen3 235B Thinking Model**: Added support for Qwen3-235B-A22B-Thinking-2507 model with an impressive 262K context window, enabling processing of extremely long documents and large codebases in a single request through the Chutes provider (thanks mohammad154, apple-techie!) ([#7578](https://github.com/RooCodeInc/Roo-Code/pull/7578))
101
125
***Ollama Turbo Mode**: Added API key support for Turbo mode, enabling faster model execution with datacenter-grade hardware (thanks LivioGama!) ([#7425](https://github.com/RooCodeInc/Roo-Code/pull/7425))
***Shell Security**: Added shell executable allowlist validation with platform-specific fallbacks for improved command execution safety ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
107
132
***Settings Scroll Position**: Settings tabs now remember their individual scroll positions when switching between them (thanks DC-Dancao!) ([#7587](https://github.com/RooCodeInc/Roo-Code/pull/7587))
108
133
***MCP Resource Auto-Approval**: MCP resource access requests are now automatically approved when auto-approve is enabled, eliminating manual approval steps and enabling smoother automation workflows (thanks m-ibm!) ([#7606](https://github.com/RooCodeInc/Roo-Code/pull/7606))
109
134
***Message Queue Performance**: Improved message queueing reliability and performance by moving the queue management to the extension host, making the interface more stable ([#7604](https://github.com/RooCodeInc/Roo-Code/pull/7604))
***MCP Tool Validation**: Roo now validates MCP tool existence before execution and shows helpful error messages with available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
151
+
***OpenAI API Key Errors**: Clear error messages now display when API keys contain invalid characters instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
152
+
***Follow-up Questions**: Fixed countdown timer incorrectly reappearing in task history for already answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
153
+
***Moonshot Token Limit**: Resolved issue where Moonshot models were incorrectly limited to 1024 tokens, now properly respects configured limits (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
154
+
***Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
155
+
***Traditional Chinese Localization**: Fixed typo in zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
125
156
***Tool Approval Fix**: Fixed an error that occurred when using insert_content and search_and_replace tools on write-protected files - these tools now handle file protection correctly ([#7649](https://github.com/RooCodeInc/Roo-Code/pull/7649))
126
157
***Configurable Embedding Batch Size**: Fixed an issue where users with API providers having stricter batch limits couldn't use code indexing. You can now configure the embedding batch size (1-2048, default: 400) to match your provider's limits (thanks BenLampson!) ([#7464](https://github.com/RooCodeInc/Roo-Code/pull/7464))
127
158
***OpenAI-Native Cache Reporting**: Fixed cache usage statistics and cost calculations when using the OpenAI-Native provider with cached content ([#7602](https://github.com/RooCodeInc/Roo-Code/pull/7602))
0 commit comments