fix: add free tier (125k tokens) for Gemini 2.5 Pro models #7754
- Added a new tier with a 125k context window for free tier users
- This prevents 429 errors when using Gemini 2.5 Pro with the free tier
- The free tier has 0 cost for input/output/cache operations
- Added comprehensive tests to verify tier configuration

Fixes #7753
Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.
| tiers: [ | ||
| { | ||
| // Free tier: 125k input tokens per minute quota | ||
| contextWindow: 125_000, |
Consider extracting this magic number into a constant like GEMINI_25_PRO_FREE_TIER_LIMIT. The value 125_000 appears 8 times across the codebase, and having it as a constant would make future quota adjustments easier to manage.
| contextWindow: 125_000, | |
| // Free tier: 125k input tokens per minute quota | |
| contextWindow: GEMINI_25_PRO_FREE_TIER_LIMIT, |
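A minimal sketch of what the extraction could look like. The constant name comes from the suggestion above; the `ModelTier` shape and its price field names are assumptions for illustration and may not match the actual type used in `gemini.ts`.

```typescript
// Hypothetical shared constant, named as in the review suggestion.
// Keeping the quota in one place means a future change touches one line.
const GEMINI_25_PRO_FREE_TIER_LIMIT = 125_000

// Assumed minimal tier shape; the real type may carry more fields.
interface ModelTier {
	contextWindow: number
	inputPrice: number
	outputPrice: number
	cacheReadsPrice: number
}

// Free tier: 125k input tokens per minute quota, zero cost for all operations.
const freeTier: ModelTier = {
	contextWindow: GEMINI_25_PRO_FREE_TIER_LIMIT,
	inputPrice: 0,
	outputPrice: 0,
	cacheReadsPrice: 0,
}
```

Tests can then import the same constant instead of repeating `125_000`, so the configuration and its assertions cannot drift apart.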
| // Verify that the free tier limit is correctly set to prevent 429 errors | ||
| expect(freeTierLimit).toBe(125_000) | ||
| expect(expectedTriggerPoint).toBeLessThan(freeTierLimit) | ||
| expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k |
For better test maintainability, could we use the calculation directly instead of hardcoding the result?
| expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k | |
| expect(expectedTriggerPoint).toBe(125_000 * 0.7) // 70% of 125k |
This makes it clearer that we're testing the 70% threshold and easier to update if the percentage changes.
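One way to go a step further is to derive the trigger point from named values. This is only a sketch: `condenseTriggerPoint` and `CONDENSE_THRESHOLD_PERCENT` are hypothetical names, and the 70% figure is taken from the test comment rather than from the actual condensing implementation.

```typescript
// Hypothetical names; the 70% threshold is assumed from the test comment.
const FREE_TIER_LIMIT = 125_000
const CONDENSE_THRESHOLD_PERCENT = 70

// Token count at which condensing would start for a given context window.
// Integer arithmetic (multiply first, then divide) avoids relying on
// floating-point behaviour of fractional multipliers.
function condenseTriggerPoint(contextWindow: number, thresholdPercent: number): number {
	return Math.floor((contextWindow * thresholdPercent) / 100)
}

const expectedTriggerPoint = condenseTriggerPoint(FREE_TIER_LIMIT, CONDENSE_THRESHOLD_PERCENT)
// 125_000 * 70 / 100 = 87_500, which is below the 125k quota.
```

With this shape, a change to either the quota or the percentage updates the expected value automatically.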
| expect(expectedTriggerPoint).toBeLessThan(freeTierLimit) | ||
| expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k | ||
| }) | ||
| }) |
Would it be valuable to add a test that verifies the tier selection logic in the actual calculateCost method? While we're testing the tier configuration here, we're not directly testing that the correct tier gets selected when processing requests with <125k tokens.
Something like:
it("should select free tier for requests under 125k tokens", () => {
// Test that calculateCost selects the free tier
// when input tokens are below 125k
})
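To make the idea concrete, here is a rough sketch of the selection behaviour such a test would exercise. It is not the actual `calculateCost` implementation: `selectTier`, `inputCost`, the `Tier` shape, the second tier's 200k window, and its price are all placeholders, and treating exactly 125k tokens as still inside the free tier is an assumption.

```typescript
// Assumed minimal tier shape for illustration only.
interface Tier {
	contextWindow: number
	inputPrice: number // price per million input tokens
}

const tiers: Tier[] = [
	{ contextWindow: 125_000, inputPrice: 0 }, // free tier from this PR
	{ contextWindow: 200_000, inputPrice: 1 }, // placeholder window and price
]

// Pick the smallest tier whose context window can hold the request.
// Sorting a copy keeps the caller's array untouched.
function selectTier(available: Tier[], inputTokens: number): Tier | undefined {
	return [...available]
		.sort((a, b) => a.contextWindow - b.contextWindow)
		.find((tier) => inputTokens <= tier.contextWindow)
}

// Input cost for a request under the selected tier.
function inputCost(tier: Tier, inputTokens: number): number {
	return (tier.inputPrice * inputTokens) / 1_000_000
}
```

A test along these lines would assert that a 100k-token request resolves to the 125k tier with zero input cost, and that a 150k-token request falls through to the next tier.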
Closing, see #7753 (comment)
Summary
This PR addresses Issue #7753 by adding a free tier configuration for Gemini 2.5 Pro models to prevent 429 errors when using the free tier.
Problem
Google recently reduced the free tier input token quota for Gemini 2.5 Pro from 250k to 125k tokens. Roo Code was still attempting to send requests up to the old 250k limit, causing repeated 429 "quota exceeded" errors for free tier users.
Solution
Added a new tier configuration with a 125k context window for the free tier across all Gemini 2.5 Pro models:
- gemini-2.5-pro
- gemini-2.5-pro-preview-03-25
- gemini-2.5-pro-preview-05-06
- gemini-2.5-pro-preview-06-05

The free tier is configured with a 125k context window and zero cost for input, output, and cache operations.
This ensures that Intelligent Context Condensing will trigger before reaching the 125k limit, preventing 429 errors.
Testing
- src/api/providers/__tests__/gemini-tier-config.spec.ts

Impact
This fix will prevent 429 errors for users on the Gemini 2.5 Pro free tier and ensure that context condensing triggers appropriately before hitting the quota limit.
Fixes #7753
Important
Adds a free tier with a 125k token limit to Gemini 2.5 Pro models to prevent 429 errors, with tests verifying the configuration.
- Free tier added to gemini-2.5-pro, gemini-2.5-pro-preview-03-25, gemini-2.5-pro-preview-05-06, and gemini-2.5-pro-preview-06-05 in gemini.ts.
- gemini-tier-config.spec.ts added to verify the free tier configuration and tier ordering.

This description was created for e2e4f00.