fix: remove explicit cache_control for Google models in OpenRouter (#4487) #4488

hannesrudolph · 2025-06-10T03:44:41Z

Related GitHub Issue

Closes: #4487

Description

This PR fixes the 3+ minute lag issue when using google/gemini-2.5-pro-preview through OpenRouter by removing explicit cache_control flags for this specific model.

Key implementation details:

Excluded google/gemini-2.5-pro-preview from the OPEN_ROUTER_PROMPT_CACHING_MODELS set in packages/types/src/providers/openrouter.ts
Added clear comment explaining the exclusion with issue reference
Updated test logic to handle the intentional exclusion of this specific model
OpenRouter still provides automatic implicit ephemeral caching for this model, so caching benefits are preserved
The fix specifically targets the lag caused by explicit "cache_control": { "type": "ephemeral" } flags being added to requests
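
The exclusion described above can be pictured roughly as follows. This is a minimal sketch, not the actual contents of packages/types/src/providers/openrouter.ts: the set here is a trimmed-down stand-in named `PROMPT_CACHING_MODELS`, the Anthropic ID is only an illustrative entry, and `buildSystemPart` is a hypothetical helper showing where the explicit flag would be attached.

```typescript
// Trimmed-down, hypothetical stand-in for OPEN_ROUTER_PROMPT_CACHING_MODELS.
// The real set lists more models; only the exclusion pattern matters here.
const PROMPT_CACHING_MODELS = new Set<string>([
	"anthropic/claude-3.7-sonnet", // illustrative entry
	"google/gemini-2.5-flash-preview",
	"google/gemini-2.0-flash-001",
	// "google/gemini-2.5-pro-preview" is intentionally excluded: explicit
	// cache_control caused the 3+ minute lag reported in #4487.
])

type TextPart = {
	type: "text"
	text: string
	cache_control?: { type: "ephemeral" }
}

// Hypothetical helper: attach the explicit flag only for models in the set.
function buildSystemPart(modelId: string, systemPrompt: string): TextPart {
	const part: TextPart = { type: "text", text: systemPrompt }
	if (PROMPT_CACHING_MODELS.has(modelId)) {
		part.cache_control = { type: "ephemeral" }
	}
	return part
}
```

With this shape, a request for google/gemini-2.5-pro-preview carries no `cache_control` field at all, while the other listed models keep the explicit flag.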

Reviewers should pay attention to:

Only the google/gemini-2.5-pro-preview model is affected - all other models continue to work as before
The surgical approach preserves explicit caching for all other models that work properly

Test Procedure

Unit Tests:

# Run OpenRouter-specific tests
npx vitest run api/providers/fetchers/__tests__/openrouter.spec.ts

# Run all provider tests
npx vitest run api/providers/__tests__/

Manual Testing:

Use google/gemini-2.5-pro-preview model through OpenRouter
Verify response time is significantly reduced (from 3+ minutes to normal response time)
Verify caching still works (OpenRouter provides automatic implicit caching)
Verify other Google models continue to work with explicit cache control
Verify non-Google models continue to work with explicit cache control

Testing Environment:

All tests pass locally
Linting passes
Type checking passes
No breaking changes to existing functionality

Type of Change

🐛 Bug Fix: Non-breaking change that fixes an issue.
✨ New Feature: Non-breaking change that adds functionality.
💥 Breaking Change: Fix or feature that would cause existing functionality to not work as expected.
♻️ Refactor: Code change that neither fixes a bug nor adds a feature.
💅 Style: Changes that do not affect the meaning of the code (white-space, formatting, etc.).
📚 Documentation: Updates to documentation files.
⚙️ Build/CI: Changes to the build process or CI configuration.
🧹 Chore: Other changes that don't modify src or test files.

Pre-Submission Checklist

Screenshots / Videos

Not applicable - this is a performance fix with no UI changes.

Documentation Updates

No documentation updates are required.
Yes, documentation updates are required.

This change is internal to the caching implementation and doesn't affect user-facing behavior beyond improved performance.

Additional Notes

Model affected:

google/gemini-2.5-pro-preview - No longer uses explicit cache_control (prevents 3+ minute lag)

Models NOT affected (continue to use explicit caching as before):

All other Google models (google/gemini-2.5-flash-preview, google/gemini-2.0-flash-001, etc.)
All Anthropic models
All other provider models

Impact:

✅ Eliminates 3+ minute lag for google/gemini-2.5-pro-preview
✅ Preserves caching benefits through OpenRouter's automatic system
✅ No breaking changes - other models continue to work as before
✅ Surgical fix - minimal scope, maximum effectiveness

Get in Touch

I'm available through GitHub for any questions about this PR.

Important

Remove explicit cache control for Google models in openrouter.ts to fix lag issue, updating tests accordingly.

Behavior:
- Removed Google models from OPEN_ROUTER_PROMPT_CACHING_MODELS in openrouter.ts to fix lag issue.
- OpenRouter still provides implicit ephemeral caching for these models.
Tests:
- Updated openrouter.spec.ts to exclude Google models from caching tests.
- Ensured test logic matches exclusion list for caching models.
Impact:
- Eliminates 3+ minute lag for Google Gemini models.
- No other models affected; non-Google models continue with explicit cache control.

This description was created by the ellipsis-dev bot for a904d67. It updates automatically as commits are pushed.

…4487)

- Remove all Google models from OPEN_ROUTER_PROMPT_CACHING_MODELS set
- This resolves 3+ minute lag when using google/gemini-2.5-pro-preview
- OpenRouter still provides automatic implicit ephemeral caching for these models
- Updated tests to handle intentional exclusion of Google models from explicit caching

Fixes #4487

- Replace hardcoded exclusion list with simple Google model filter
- Keep original validation logic but make it more maintainable
- Still ensures all our caching models are supported by OpenRouter
- Still verifies we exclude all Google models from explicit caching

- Variable was defined but never used
- Keeps the test logic clean and focused

daniel-lxs

LGTM, now we just wait for the tests

mrubens · 2025-06-10T04:14:40Z

@cte can you take a look as well? Not really sure what the implications are of removing these from cached models. Do we still need the code here?

Roo-Code/src/api/providers/openrouter.ts

Lines 100 to 101 in 483d951

    
if (modelId.startsWith("google")) {
	addGeminiCacheBreakpoints(systemPrompt, openAiMessages)

daniel-lxs · 2025-06-10T04:18:01Z

It seems that these models are now covered by implicit caching; @hannesrudolph confirmed that caching is still enabled for them.

I'm not sure there are any benefits to explicit caching at this point.

Edit: here's the documentation from OpenRouter: https://openrouter.ai/docs/features/prompt-caching
According to it, the caching headers are not necessary for Gemini 2.5 Pro and 2.5 Flash.

So Gemini 1.5 might still benefit from the headers.
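
One way to read the exchange above: if the quoted `startsWith("google")` branch sits behind a membership check on the caching set, it stays useful for any Google model that remains in the set, and is simply never reached for a model that has been removed. The sketch below illustrates that reading. It is an assumption about the surrounding control flow, not a copy of openrouter.ts; `CACHING_MODELS`, `pickCacheStrategy`, and the strategy names are hypothetical.

```typescript
type CacheStrategy = "gemini-breakpoints" | "default-breakpoints" | "implicit-only"

// Hypothetical stand-in for the explicit-caching set.
const CACHING_MODELS = new Set<string>([
	"google/gemini-2.0-flash-001",
	"anthropic/claude-3.7-sonnet", // illustrative entry
])

// Decide how (or whether) explicit cache breakpoints would be added.
function pickCacheStrategy(
	modelId: string,
	cachingModels: Set<string> = CACHING_MODELS,
): CacheStrategy {
	if (!cachingModels.has(modelId)) {
		// No explicit cache_control; rely on whatever implicit caching applies.
		return "implicit-only"
	}
	// Mirrors the quoted check: Google models get Gemini-style breakpoints.
	return modelId.startsWith("google") ? "gemini-breakpoints" : "default-breakpoints"
}
```

Under this assumption, removing a model from the set is enough to stop explicit flags for it, without deleting the Gemini-specific branch.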

- More surgical approach - only exclude the specific problematic model
- Keep other Google models in caching (they work fine)
- Add comment explaining the exclusion with issue reference
- Update test to only exclude the specific model

This targets just the model causing 3+ minute lag while preserving caching benefits for other Google models that work properly.
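
The test adjustment described in this commit can be sketched as a set comparison with a single carve-out. This is a hedged illustration rather than the code in openrouter.spec.ts: `FetchedModel`, the `supportsPromptCache` field as used here, `EXCLUDED_FROM_EXPLICIT_CACHING`, and `expectedCachingModels` are assumed names.

```typescript
// Assumed shape of a model entry returned by the OpenRouter fetcher.
type FetchedModel = { id: string; supportsPromptCache: boolean }

// Only the model reported in #4487 is carved out.
const EXCLUDED_FROM_EXPLICIT_CACHING = new Set<string>(["google/gemini-2.5-pro-preview"])

// Models that report prompt-cache support, minus the intentional exclusion.
function expectedCachingModels(fetched: FetchedModel[]): string[] {
	return fetched
		.filter((m) => m.supportsPromptCache)
		.map((m) => m.id)
		.filter((id) => !EXCLUDED_FROM_EXPLICIT_CACHING.has(id))
		.sort()
}
```

A test could then compare this list against the sorted contents of the caching set, so that any other model dropping out of the set would still be flagged.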

hannesrudolph · 2025-06-10T04:30:19Z

hannesrudolph · 2025-06-10T04:34:48Z

I have rolled the PR back to just the changes related to google/gemini-2.5-pro-preview so we can get it running ASAP. Quick fix!

hannesrudolph requested review from cte, jr and mrubens as code owners June 10, 2025 03:44

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Jun 10, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Jun 10, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Jun 10, 2025

dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and bug (Something isn't working) labels Jun 10, 2025

hannesrudolph added 2 commits June 9, 2025 22:04

cleanup: remove unused excludedModels variable

208969e

- Variable was defined but never used
- Keeps the test logic clean and focused

dosubot bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) label and removed the size:M (This PR changes 30-99 lines, ignoring generated files.) label Jun 10, 2025

daniel-lxs approved these changes Jun 10, 2025

dosubot bot added the lgtm (This PR has been approved by a maintainer) label Jun 10, 2025

daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Jun 10, 2025

mrubens approved these changes Jun 10, 2025

mrubens merged commit bf35dcd into main Jun 10, 2025
10 checks passed

mrubens deleted the openrouter-cache-fix branch June 10, 2025 04:48

github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Jun 10, 2025

github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 10, 2025

chrarnoldus added a commit to Kilo-Org/kilocode that referenced this pull request Jun 10, 2025

remove explicit cache_control for Google models in OpenRouter (cherry…

aa14636

… pick from RooCodeInc/Roo-Code#4488)

chrarnoldus mentioned this pull request Jun 10, 2025

Fix gemini-2.5-pro-preview being very slow Kilo-Org/kilocode#672

Merged

hassoncs pushed a commit to Kilo-Org/kilocode that referenced this pull request Jun 10, 2025

remove explicit cache_control for Google models in OpenRouter (cherry…

8f07a5f

… pick from RooCodeInc/Roo-Code#4488)
