3.33.0 second commit

hannesrudolph · hannesrudolph · commit 167673a8fb91 · 2025-11-18T12:13:30.000-07:00
diff --git a/docs/advanced-usage/rate-limits-costs.md b/docs/advanced-usage/rate-limits-costs.md
@@ -33,6 +33,8 @@ Most AI providers charge based on the number of tokens used. Pricing varies depe
 
 Roo Code automatically calculates the estimated cost of each API request based on the configured model's pricing. This cost is displayed in the chat history, next to the token usage.
 
+For reasoning-capable models (for example, Gemini 3 Pro Preview and other models that report separate "thinking" or reasoning tokens), Roo Code includes both regular tokens **and** reasoning / "thought" tokens in its estimates when the provider reports them. The displayed token usage and cost can therefore be slightly higher than in older versions, but they more closely match how providers bill you.
+
 **Note:**
 
 *   The cost calculation is an *estimate*. The actual cost may vary slightly depending on the provider's billing practices.
diff --git a/docs/providers/gemini.md b/docs/providers/gemini.md
@@ -33,54 +33,38 @@ Roo Code supports Google's Gemini family of models through the Google AI Gemini
 
 ## Supported Models
 
-Roo Code supports the following Gemini models:
+Roo Code supports the main Gemini model families and automatically tracks Google's latest stable releases.
 
-### Model Aliases (Recommended)
+### Recommended Defaults
 
-For stability and automatic updates, use these aliases that point to the latest stable versions:
+- **Gemini 3 Pro Preview**
+  - 1M-token context window for very large workspaces and long-running conversations
+  - Reasoning-capable behavior for multi-step coding and refactoring tasks
+  - Tiered pricing support in Roo Code (≤200K vs >200K tokens) to better match Google's published pricing
+- **Gemini Pro family**
+  - Stable Pro models for complex coding, debugging, and analysis
+  - Roo Code defaults to a stable Pro model where your provider supports it (currently a Gemini 2.5 Pro variant; future releases may point to newer Pro models)
+- **Gemini Flash family**
+  - Fast, lower-cost models for everyday tasks and quick iterations
 
-* `gemini-flash-latest` - Always uses the newest stable Flash model
-* `gemini-pro-latest` - Always uses the newest stable Pro model
+### Aliases
 
+For stability and automatic updates, prefer these aliases over hard-coded, versioned model IDs:
 
-### Standard Models
-* `gemini-2.5-flash-preview-05-20`
-* `gemini-2.5-flash-preview-04-17`
-* `gemini-2.5-flash-lite-preview-06-17`
-* `gemini-2.5-pro-exp-03-25`
-* `gemini-2.0-flash-001`
-* `gemini-2.0-flash-lite-preview-02-05`
-* `gemini-2.0-pro-exp-02-05`
-* `gemini-2.0-flash-exp`
-* `gemini-1.5-flash-002`
-* `gemini-1.5-flash-exp-0827`
-* `gemini-1.5-flash-8b-exp-0827`
-* `gemini-1.5-pro-002`
-* `gemini-1.5-pro-exp-0827`
-* `gemini-exp-1206`
+- `gemini-flash-latest` – Points to the newest stable Flash model
+- `gemini-pro-latest` – Points to the newest stable Pro model
 
-### Preview Models
+Using aliases lets Roo Code follow Google's recommended stable releases, so you don't have to update model IDs manually.
 
-Preview models include Google's latest experimental features but may change without notice:
+### Thinking / Reasoning Models
 
-* Models with `-preview-` in the name (e.g., `gemini-2.5-flash-preview-05-20`)
-* Models with `-exp-` suffix (e.g., `gemini-2.0-flash-exp`)
-* Models prefixed with `gemini-exp-` (e.g., `gemini-exp-1206`)
+Some Gemini models are reasoning-capable and may expose separate "thinking" or reasoning tokens:
 
-Preview models are ideal for testing cutting-edge capabilities but may have breaking changes. Use stable aliases for production work.
+- Roo Code treats these as reasoning models and can use them for deeper, multi-step planning.
+- To use reasoning models effectively, enable the **reasoning budget** feature in Roo Code settings.
+- When the Gemini API reports reasoning / "thought" token usage, Roo Code includes those tokens in its cost estimates so reported costs stay closer to your provider's billing.
 
-### Thinking Models
-These models require reasoning budget to be enabled in Roo Code settings:
-* `gemini-2.5-flash-preview-05-20:thinking`
-* `gemini-2.5-flash-preview-04-17:thinking`
-* `gemini-2.0-flash-thinking-exp-01-21`
-* `gemini-2.0-flash-thinking-exp-1219`
-
-:::info
-**Thinking Models:** Models with `:thinking` suffix or "thinking" in their name are hybrid reasoning models that provide step-by-step reasoning capabilities. To use these models, you must enable the reasoning budget feature in Roo Code settings.
-:::
-
-Refer to the [Gemini documentation](https://ai.google.dev/models/gemini) for more details on each model.
+Refer to the [Gemini documentation](https://ai.google.dev/models/gemini) for more details on each model family and its capabilities.
 
 ---
 
@@ -91,6 +75,8 @@ Refer to the [Gemini documentation](https://ai.google.dev/models/gemini) for mor
 3.  **Enter API Key:** Paste your Gemini API key into the "Gemini API Key" field.
 4.  **Select Model:** Choose your desired Gemini model from the "Model" dropdown.
 
+By default, Roo Code selects a stable Pro model (currently a Gemini 2.5 Pro variant) with a temperature of **1.0** where your provider supports it. This setting allows more varied, natural-sounding responses while keeping them on task. If you need more deterministic output (for example, for code generation in CI), lower the temperature toward `0.0`.
+
 ---
 
 ## Advanced Features
diff --git a/docs/providers/openai-compatible.md b/docs/providers/openai-compatible.md
@@ -49,21 +49,67 @@ You'll find these settings in the Roo Code settings panel (click the <Codicon na
 
 ---
 
-## Supported Models (for OpenAI Native Endpoint)
+## Native Tool Calling (OpenAI-Native Endpoint)
 
-While this provider type allows connecting to various endpoints, if you are connecting directly to the official OpenAI API (or an endpoint mirroring it exactly), Roo Code recognizes the following model IDs based on the `openAiNativeModels` definition in its source code:
+When you connect this provider directly to the official OpenAI API (or an endpoint that mirrors it exactly), Roo Code can use OpenAI's **native tool-calling** protocol instead of the XML-based tool format.
 
-*   `o3-mini`
-*   `o3-mini-high`
-*   `o3-mini-low`
-*   `o1`
-*   `o1-preview`
-*   `o1-mini`
-*   `gpt-4.5-preview`
-*   `gpt-4o`
-*   `gpt-4o-mini`
+At a high level:
 
-**Note:** If you are using a different OpenAI-compatible provider (like Together AI, Anyscale, etc.), the available model IDs will vary. Always refer to your specific provider's documentation for their supported model names.
+- **Tool definitions** are sent to the model using OpenAI's native tools schema.
+- **Tool calls** stream back as dedicated tool events, including the tool name, arguments, and metadata.
+- **Tool arguments** are streamed incrementally, which reduces latency between the model deciding to use a tool and Roo Code executing it.
+
+### When native tools are used
+
+Roo Code uses native tool calling when **all** of the following are true:
+
+1. The selected provider is configured for the OpenAI-native protocol (OpenAI or an OpenAI-compatible endpoint that fully supports native tools).
+2. The active profile's tool protocol is set to allow native tools (or left at its default, which prefers native tools when supported).
+3. The selected model supports native tool calling.
+
+If any of these conditions aren't met, Roo Code falls back to its XML-based tool protocol instead.
+
+### Example: simple native tool flow
+
+Here's a simplified example of how a file-reading tool might be exposed when using an OpenAI-native endpoint:
+
+```json
+{
+  "tools": [
+    {
+      "type": "function",
+      "function": {
+        "name": "read_file",
+        "description": "Read a file from the workspace with line numbers.",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "path": { "type": "string", "description": "Relative file path" },
+            "start_line": { "type": ["integer", "null"] },
+            "end_line": { "type": ["integer", "null"] }
+          },
+          "required": ["path"]
+        }
+      }
+    }
+  ]
+}
+```
+
+When the model decides to use `read_file`, Roo Code surfaces **streamed tool events** in the task timeline:
+
+- A native *tool call* event with the tool name and arguments as they're being generated
+- The corresponding *tool result* event showing the file contents and any truncation or line-range information
+
+This gives you lower-latency feedback on which tools are being used and with what arguments.
+
+### Settings and limitations
+
+- **Tool protocol selector:** In advanced settings, you can choose which tool protocol Roo Code should prefer (XML vs native). If you disable native tools here, Roo Code will always use XML even if the provider supports native tools.
+- **Model support:** Not all OpenAI-native or OpenAI-compatible models support tools. If a model doesn't support native tool calling, Roo Code does not send native tool definitions to it and falls back to the XML protocol.
+- **Provider quirks:** Some OpenAI-compatible providers only partially implement the native tools API. If Roo Code detects protocol errors, it may fall back to XML tools automatically.
+
+For a deeper overview of how tools work in Roo Code in general, see the [Tool Use Overview](/advanced-usage/available-tools/tool-use-overview).
 
 ---
 
diff --git a/docs/providers/vertex.md b/docs/providers/vertex.md
@@ -43,31 +43,29 @@ If no model is specified, Roo Code defaults to `claude-sonnet-4@20250514`.
 
 ### Google Gemini Models
 
-#### Standard Models
-*   `gemini-2.5-flash` - Production version with prompt caching support
-*   `gemini-2.5-flash-preview-05-20` - Preview with 1M context window
-*   `gemini-2.5-flash-preview-04-17` - Preview without caching
-*   `gemini-2.5-flash-lite-preview-06-17` - Lite version with lower pricing
-*   `gemini-2.5-pro` - Production version with reasoning support
-*   `gemini-2.5-pro-preview-03-25` - Pro preview version
-*   `gemini-2.5-pro-preview-05-06` - Pro preview version
-*   `gemini-2.5-pro-preview-06-05` - Pro preview with reasoning support
-*   `gemini-2.5-pro-exp-03-25` - Experimental version (free)
-*   `gemini-2.0-flash-001` - 2.0 Flash model
-*   `gemini-2.0-flash-lite-001` - 2.0 Flash lite version
-*   `gemini-2.0-flash-thinking-exp-01-21` - Thinking/reasoning model
-*   `gemini-2.0-pro-exp-02-05` - 2.0 Pro experimental
-*   `gemini-1.5-flash-002` - 1.5 Flash model
-*   `gemini-1.5-pro-002` - 1.5 Pro model
-
-#### Thinking/Reasoning Models
-These models support enhanced reasoning capabilities with the `:thinking` suffix:
-*   `gemini-2.5-flash-preview-05-20:thinking`
-*   `gemini-2.5-flash-preview-04-17:thinking`
-
-:::info
-**Thinking Models:** Models with `:thinking` suffix enable step-by-step reasoning. The suffix is stripped before sending to the API but enables reasoning features in Roo Code. You'll need to enable the reasoning budget in settings to use these models effectively.
-:::
+Vertex AI exposes multiple Gemini model families. Roo Code focuses on the main families and tracks Google's stable releases instead of requiring you to hard-code versioned model IDs.
+
+#### Recommended Gemini options
+
+- **Gemini 3 Pro Preview**
+  - Context window of up to 1M tokens for very large workspaces and long-running conversations
+  - Reasoning-capable behavior for complex coding and refactoring tasks
+  - Roo Code's cost estimation supports tiered pricing (short vs long requests) to better match Vertex AI billing for this model
+- **Gemini Pro family**
+  - Stable Pro models for complex reasoning and analysis
+  - If you don't override the Gemini model in a profile, Roo Code defaults to a stable Pro variant where one is available
+- **Gemini Flash family**
+  - Faster, lower-cost models ideal for quick iterations and non-critical tasks
+
+#### Reasoning / thinking models
+
+Some Gemini models provide dedicated reasoning or "thinking" tokens:
+
+- Roo Code treats these as reasoning models and uses them for deeper multi-step planning when enabled.
+- The reasoning budget must be enabled in Roo Code settings to take full advantage of these models.
+- When Vertex AI reports separate reasoning or "thought" tokens, Roo Code includes them in token usage and cost estimates. Compared to older versions, you may see slightly higher but more accurate token counts.
+
+Refer to the [Google Cloud Vertex AI models documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) for up-to-date Gemini model IDs and capabilities.
 
 ### Anthropic Claude Models
 *   `claude-opus-4-1@20250805`
diff --git a/docs/update-notes/v3.33.0.mdx b/docs/update-notes/v3.33.0.mdx
@@ -9,7 +9,7 @@ image: /img/v3.33.0/v3.33.0.png
 
 # Roo Code 3.33.0 Release Notes (2025-11-18)
 
-Gemini 3 Pro Preview is now available in Roo Code and already performing extremely well in real-world coding tasks, alongside 16 tool-protocol and UI tweaks and fixes—thanks to everyone in the Roo Code community who helped shape this release.
+Gemini 3 Pro Preview is now available in Roo Code and is already performing extremely well in real-world coding tasks. This release also includes 16 tweaks and fixes. Thanks to everyone in the Roo Code community who helped shape it.
 
 <img src="/img/v3.33.0/v3.33.0.png" alt="Roo Code v3.33.0 Release" width="600" />