
Commit f88bb75

feat(docs): Release v3.18.0 & Enhance Context Condensing Docs (#198)
This commit introduces the release notes for Roo Code v3.18.0 and significantly updates documentation for Intelligent Context Condensing, Custom Modes (YAML), API Cost Control, and AI provider capabilities.

**Release Notes & Structure:**

* **v3.18.0 Release:**
  * Adds detailed release notes for v3.18.0 ([`docs/update-notes/v3.18.0.mdx`](docs/update-notes/v3.18.0.mdx:0)).
  * Creates a combined release notes page for the 3.18 series ([`docs/update-notes/v3.18.mdx`](docs/update-notes/v3.18.mdx:0)), initially populated with v3.18.0 content.
  * Updates the main update notes index ([`docs/update-notes/index.md`](docs/update-notes/index.md:0)) to include the 3.18 series.
  * Adds new images for v3.18.0 features ([`static/img/v3.18.0/`](static/img/v3.18.0/)).
* **Sidebar Navigation:** Updates [`sidebars.ts`](sidebars.ts:0) to include the new v3.18.0 release notes and correct the path for the Intelligent Context Condensing documentation.

**Feature Documentation Updates:**

* **Intelligent Context Condensing:**
  * Replaces `intelligent-context-condensation.md` with a new, comprehensive [`docs/features/experimental/intelligent-context-condensing.mdx`](docs/features/experimental/intelligent-context-condensing.mdx:0).
  * Details enabling the feature, manual/automatic condensing controls, UI indicators, token counting, internationalization, technical implementation, and performance.
  * Adds new illustrative images ([`static/img/intelligent-context-condensation/`](static/img/intelligent-context-condensation/)).
  * Adds a redirect in [`docusaurus.config.ts`](docusaurus.config.ts:0) for the old path.
  * Updates links in the v3.17.0 and v3.17 release notes to point to the new path.
* **Custom Modes (YAML Focus):**
  * Updates [`docs/features/custom-modes.mdx`](docs/features/custom-modes.mdx:0) to reflect YAML as the primary format, including table headers and link updates.
  * Adds a title to the embedded overview video.
* **API Cost Control:** Adds a new section, "Limiting Auto-Approved Requests," to [`docs/advanced-usage/rate-limits-costs.md`](docs/advanced-usage/rate-limits-costs.md:0), explaining the "Max Requests" setting with images.

**Provider Documentation Updates:**

* **Gemini & Vertex AI:** Updates [`docs/providers/gemini.md`](docs/providers/gemini.md:0) and [`docs/providers/vertex.md`](docs/providers/vertex.md:0) to include the `gemini-2.5-flash-preview-05-20` model.
* **LM Studio:** Adds notes on token tracking and reasoning support to [`docs/providers/lmstudio.md`](docs/providers/lmstudio.md:0).
* **Ollama:** Adds a note on token tracking to [`docs/providers/ollama.md`](docs/providers/ollama.md:0).
* **Unbound:** Adds a note on the model list refresh button to [`docs/providers/unbound.md`](docs/providers/unbound.md:0).

* docs: Clarify description of available tools in custom modes documentation
* Update docs/features/custom-modes.mdx
* Update docs/features/experimental/intelligent-context-condensing.mdx

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
1 parent 613f568 commit f88bb75

23 files changed: +560 −152 lines

docs/advanced-usage/rate-limits-costs.md

Lines changed: 13 additions & 0 deletions
@@ -23,6 +23,19 @@ Roo Code automatically calculates the estimated cost of each API request based o
* Some providers may offer free tiers or credits. Check your provider's documentation for details.
* Some providers offer prompt caching which greatly lowers cost.

### Limiting Auto-Approved Requests

To further help manage API costs and prevent unexpected expenses, Roo Code includes a "Max Requests" setting for auto-approved actions. This allows you to define a specific limit on how many consecutive API calls Roo Code can make without requiring your explicit re-approval during a task.

* **How it works:** If you set a limit (e.g., 5 requests), Roo Code will perform up to 5 auto-approved API calls. Before making the 6th call, it will pause and prompt you to "Reset and Continue," as shown below.

<img src="/img/v3.18.0/v3.18.0-1.png" alt="Warning message indicating the auto-approved request limit has been reached." width="600" />
*Notification when the auto-approved request limit is met.*

* **Configuration:** This limit is configured within the "Auto-approve actions" settings. You can set a specific number or choose "Unlimited." For detailed steps on configuring this and other auto-approval settings, see the [Auto-Approving Actions documentation](/features/auto-approving-actions).

<img src="/img/v3.18.0/v3.18.0.png" alt="Setting the Max Requests limit for auto-approved actions in Roo Code settings." width="600" />
*Setting the "Max Requests" for auto-approved actions.*

This feature provides an additional safeguard, particularly for complex or long-running tasks where multiple API calls might be involved.
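The pause-and-re-approve behavior described above amounts to a simple counter guard. The sketch below is purely illustrative — the class and method names are hypothetical, not Roo Code's actual implementation:

```typescript
// Hypothetical sketch of a "Max Requests" guard for auto-approved API calls.
// Names (RequestGuard, tryConsume) are illustrative, not Roo Code's real API.
class RequestGuard {
  private count = 0;

  // null stands for the "Unlimited" setting.
  constructor(private maxRequests: number | null) {}

  // Returns true if the next API call may proceed without re-approval.
  tryConsume(): boolean {
    if (this.maxRequests === null) return true;
    if (this.count >= this.maxRequests) return false; // pause: "Reset and Continue"
    this.count++;
    return true;
  }

  // Invoked when the user clicks "Reset and Continue".
  reset(): void {
    this.count = 0;
  }
}
```

With a limit of 5, the first five calls pass and the sixth is held until the counter is reset by the user's re-approval.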
## Tips for Optimizing Token Usage

* **Be Concise:** Use clear and concise language in your prompts. Avoid unnecessary words or details.

docs/features/custom-modes.mdx

Lines changed: 250 additions & 102 deletions
Large diffs are not rendered by default.

docs/features/experimental/experimental-features.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ To enable or disable experimental features:

The following experimental features are currently available:

-- [Intelligently Condense the Context Window](/features/experimental/intelligent-context-condensation)
+- [Intelligently Condense the Context Window](/features/experimental/intelligent-context-condensing)
- [Power Steering](/features/experimental/power-steering)

## Providing Feedback

docs/features/experimental/intelligent-context-condensation.md

Lines changed: 0 additions & 45 deletions
This file was deleted.

docs/features/experimental/intelligent-context-condensing.mdx

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
---
sidebar_label: 'Intelligent Context Condensing'
---
import Codicon from '@site/src/components/Codicon';

# Intelligent Context Condensing (Experimental)

The Intelligent Context Condensing feature helps manage long conversations by summarizing earlier parts of the dialogue. This prevents important information from being lost when the context window nears its limit. This is an **experimental feature** and is **disabled by default**.

<div style={{width: '50%', margin: 'auto'}}>
  <div style={{position: 'relative', paddingBottom: '177.77%', height: 0, overflow: 'hidden'}}>
    <iframe
      src="https://www.youtube.com/embed/o5xgO9N8vVU"
      title="YouTube Short"
      frameBorder="0"
      allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
      allowFullScreen
      style={{position: 'absolute', top: 0, left: 0, width: '100%', height: '100%'}}
    ></iframe>
  </div>
</div>

## How It Works

As your conversation with Roo Code grows, it might approach the context window limit of the underlying AI model. When this happens, older messages would typically be removed to make space. Intelligent Context Condensing aims to prevent this abrupt loss by:

1. **Summarizing:** Using an AI model, it condenses earlier parts of the conversation.
2. **Retaining Essentials:** The goal is to reduce the overall token count while keeping the key information from the summarized messages.
3. **Maintaining Flow:** This allows the AI to have a more coherent understanding of the entire conversation, even very long ones.
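The three steps above can be sketched as a single transformation over the message list. This is a simplified illustration, not Roo Code's actual code; `summarize` stands in for the AI summarization call:

```typescript
interface Message {
  role: string;
  content: string;
}

// Stand-in for the AI summarization call; a real implementation would send
// the older messages to a model and return its summary text.
type Summarizer = (messages: Message[]) => string;

// Condense all but the most recent `keepRecent` messages into one summary
// message, reducing token count while keeping the conversation coherent.
function condense(messages: Message[], keepRecent: number, summarize: Summarizer): Message[] {
  if (messages.length <= keepRecent) return messages; // nothing to condense
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);
  const summary: Message = { role: "assistant", content: summarize(older) };
  return [summary, ...recent];
}
```

The key design point is that recent messages survive verbatim while only the older span is replaced, so the model keeps full fidelity where it matters most.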
**Important Considerations:**
* **Summarization Impact:** While original messages are preserved if you use [Checkpoints](/features/checkpoints) to rewind, the summarized version is what's used in ongoing LLM calls to keep the context manageable.
* **Cost:** The AI call to perform the summarization incurs a cost. This cost is included in the context condensing metrics displayed in the UI.

## Enabling This Feature

As an experimental feature, Intelligent Context Condensing is **disabled by default**.

1. Open Roo Code settings (<Codicon name="gear" /> icon in the top right corner of the Roo Code panel).
2. Navigate to the "Experimental" section.
3. Toggle the "Automatically trigger intelligent context condensing" (`autoCondenseContext`) option to enable it.
4. Optionally, adjust the "Threshold to trigger intelligent context condensing" (`autoCondenseContextPercent`) slider to control the trigger point for automatic context condensing.
5. Save your changes.

<img src="/img/intelligent-context-condensation/intelligent-context-condensation-1.png" alt="Settings for Intelligent Context Condensing" width="600" />
*The image above shows the settings for Intelligent Context Condensing: the toggle to "Automatically trigger intelligent context condensing" and the "Threshold to trigger intelligent context condensing" slider.*

## Controlling and Understanding Context Condensing

Roo Code provides several ways to control and understand the Intelligent Context Condensing feature:

### Controlling Context Condensing
* **Automatic Threshold:** In Roo Code Settings (<Codicon name="gear" />) > "Experimental," the `autoCondenseContextPercent` setting allows you to define a percentage (e.g., 80%). Roo Code will attempt to condense the context automatically when the conversation reaches this level of the context window's capacity.
* **Manual Trigger:** A **Condense Context** button (<Codicon name="fold" /> icon) is available when a task is expanded, typically located at the bottom of the task view, next to other task action icons like the trash can. This allows you to initiate the context condensing process at any time.
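The automatic threshold boils down to a percentage comparison. A minimal sketch, assuming a plain ratio check (illustrative only — the function name is hypothetical):

```typescript
// Decide whether automatic condensing should fire, given current token usage,
// the model's context window size, and the configured threshold
// (e.g. autoCondenseContextPercent = 80). Illustrative sketch only.
function shouldAutoCondense(
  usedTokens: number,
  contextWindow: number,
  thresholdPercent: number,
): boolean {
  const usagePercent = (usedTokens / contextWindow) * 100;
  return usagePercent >= thresholdPercent;
}
```

For example, with a 200,000-token context window and an 80% threshold, condensing would trigger once usage reaches 160,000 tokens.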
54+
55+
<img src="/img/intelligent-context-condensation/intelligent-context-condensation-2.png" alt="Manual Condense Context button in expanded task view" width="600" />
56+
*The Manual Condense Context button (highlighted with a yellow arrow) appears in the expanded task view.*
57+
58+
### Understanding Context Condensing Activity
59+
* **Context Condensing Metrics:** When context condensing occurs, Roo Code displays:
60+
* The context token counts before and after context condensing.
61+
* The cost associated with the context condensing AI call.
62+
* An expandable summary detailing what was condensed (this information is part of the `ContextCondenseRow` component visible in the chat history).
63+
64+
<img src="/img/intelligent-context-condensation/intelligent-context-condensation-4.png" alt="Context condensed message in chat" width="600" />
65+
*After context condensing, a message indicates the context has been condensed, showing token changes and cost.*
66+
67+
* **Visual Indicators:**
68+
* A progress indicator ("Condensing context...") is shown in the chat interface while context condensing is active.
69+
70+
<img src="/img/intelligent-context-condensation/intelligent-context-condensation-3.png" alt="Condensing context progress indicator in chat" width="600" />
71+
*The "Condensing context..." indicator appears in the chat during the process.*
72+
73+
* The task header also displays the current context condensing status.
74+
* The `ContextWindowProgress` bar offers a visual representation of token distribution, including current usage, space reserved for the AI's output, available space, and raw token numbers.
75+
* **Interface Clarity:** The "Condense Context" button includes a tooltip explaining its function, available in all supported languages. The icon for context condensing-related actions is `codicon-compress`.
76+
77+
### Accurate Token Information
78+
* Roo Code employs accurate token counting methods, with some AI providers utilizing their native token counting endpoints. This ensures that context size and associated costs are calculated reliably.
79+
80+
### Internationalization
81+
* All user interface elements for this feature, such as button labels, tooltips, status messages, and settings descriptions, are available in multiple supported languages.
82+
83+
## Technical Implementation
84+
85+
### Token Counting
86+
Roo Code uses a sophisticated token counting system that:
87+
- Employs native token counting endpoints when available (e.g., Anthropic's API)
88+
- Falls back to tiktoken estimation if API calls fail
89+
- Provides accurate counting for different content types:
90+
- Text content: Uses word-based estimation with punctuation and newline overhead
91+
- Image content: Uses a conservative estimate of 300 tokens per image
92+
- System prompts: Includes additional overhead for structural elements
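The native-endpoint-with-fallback pattern described above could be outlined as follows. The helper names and the word-based heuristic are assumptions for illustration; the real fallback uses tiktoken, and the 300-token image figure comes from the docs:

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image" };

const IMAGE_TOKEN_ESTIMATE = 300; // conservative per-image estimate from the docs

// Rough fallback estimate: word count with punctuation/newline overhead.
// Purely illustrative; the real fallback relies on tiktoken.
function estimateTextTokens(text: string): number {
  const words = text.split(/\s+/).filter(Boolean).length;
  const punctuation = (text.match(/[.,;:!?\n]/g) ?? []).length;
  return Math.ceil(words * 1.3) + punctuation;
}

// Prefer the provider's native token-counting endpoint; fall back to the
// local estimate if no endpoint is available or the call throws.
function countTokens(
  blocks: ContentBlock[],
  nativeCount?: (blocks: ContentBlock[]) => number,
): number {
  if (nativeCount) {
    try {
      return nativeCount(blocks);
    } catch {
      // native endpoint failed; fall through to local estimation
    }
  }
  return blocks.reduce(
    (sum, b) => sum + (b.type === "image" ? IMAGE_TOKEN_ESTIMATE : estimateTextTokens(b.text)),
    0,
  );
}
```

The try/catch around the native call is the important part: a provider outage degrades gracefully to an estimate instead of breaking the condensing flow.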
93+
94+
### Context Window Management
95+
- By default, 30% of the context window is reserved (20% for model output and 10% as a safety buffer), leaving 70% available for conversation history.
96+
- This reservation can be overridden by model-specific settings
97+
- The system automatically calculates available space while maintaining this reservation
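The default reservation is simple arithmetic. A sketch under the stated 20%/10% split (function name and default parameters are illustrative):

```typescript
// Compute the space available for conversation history, reserving 20% of the
// context window for model output and 10% as a safety buffer (the 30% default
// described above). Model-specific settings could pass different shares.
function availableForHistory(
  contextWindow: number,
  outputShare = 0.2,
  bufferShare = 0.1,
): number {
  const reserved = Math.floor(contextWindow * (outputShare + bufferShare));
  return contextWindow - reserved;
}
```

For a 200,000-token window, 60,000 tokens are reserved and 140,000 remain for conversation history.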
98+
99+
## Performance Considerations
100+
101+
### Optimization
102+
- The system optimizes token counting to minimize performance impact
103+
- Token calculations are cached where possible
104+
- Background processing prevents UI blocking during context condensing
105+
106+
### Resource Usage
107+
- Context condensing operations are performed asynchronously
108+
- The UI remains responsive during the process
109+
- System resources are managed to prevent excessive memory usage
110+
111+
## Feedback
112+
113+
Your experience with experimental features is valuable. When reporting issues, please include:
114+
- The current threshold setting
115+
- The token counts before and after context condensing
116+
- Any error messages displayed
117+
- Steps to reproduce the issue
118+
119+
Please report any issues or suggestions regarding Intelligent Context Condensing on the [Roo Code GitHub Issues page](https://github.com/RooCodeInc/Roo-Code/issues).

docs/providers/gemini.md

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ Roo Code supports Google's Gemini family of models through the Google AI Gemini
Roo Code supports the following Gemini models:

+* `gemini-2.5-flash-preview-05-20`
* `gemini-2.5-pro-exp-03-25`
* `gemini-2.0-flash-001`
* `gemini-2.0-flash-lite-preview-02-05`

docs/providers/lmstudio.md

Lines changed: 2 additions & 0 deletions
@@ -38,3 +38,5 @@ Roo Code supports running models locally using LM Studio. LM Studio provides a
* **Local Server:** The LM Studio local server must be running for Roo Code to connect to it.
* **LM Studio Documentation:** Refer to the [LM Studio documentation](https://lmstudio.ai/docs) for more information.
* **Troubleshooting:** If you see a "Please check the LM Studio developer logs to debug what went wrong" error, you may need to adjust the context length settings in LM Studio.
+* **Token Tracking:** Roo Code tracks token usage for models run via LM Studio, helping you monitor consumption.
+* **Reasoning Support:** For models that support it, Roo Code can parse "think" tags or similar reasoning indicators in LM Studio responses, offering more insight into the model's process.

docs/providers/ollama.md

Lines changed: 1 addition & 0 deletions
@@ -72,4 +72,5 @@ Roo Code supports running models locally using Ollama. This provides privacy, of
* **Resource Requirements:** Running large language models locally can be resource-intensive. Make sure your computer meets the minimum requirements for the model you choose.
* **Model Selection:** Experiment with different models to find the one that best suits your needs.
* **Offline Use:** Once you've downloaded a model, you can use Roo Code offline with that model.
+* **Token Tracking:** Roo Code tracks token usage for models run via Ollama, helping you monitor consumption.
* **Ollama Documentation:** Refer to the [Ollama documentation](https://ollama.com/docs) for more information on installing, configuring, and using Ollama.

docs/providers/unbound.md

Lines changed: 1 addition & 0 deletions
@@ -28,3 +28,4 @@ Unbound allows you configure a list of supported models in your application, and
## Tips and Notes

* **Security Focus:** Unbound emphasizes security features for enterprise use. If your organization has strict security requirements for AI usage, Unbound might be a good option.
+* **Model List Refresh:** Roo Code includes a refresh button specifically for the Unbound provider in the settings. This allows you to easily update the list of available models from your Unbound application and get immediate feedback on your API key's validity.

docs/providers/vertex.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ Roo Code supports accessing models through Google Cloud Platform's Vertex AI, a
Roo Code supports the following models through Vertex AI (based on source code):

* **Google Gemini Models:**
+    * `gemini-2.5-flash-preview-05-20`
    * `gemini-2.0-flash-001`
    * `gemini-2.5-pro-exp-03-25`
    * `gemini-2.0-pro-exp-02-05`
