@roomote roomote bot commented Sep 7, 2025

Summary

This PR addresses Issue #7753 by adding a free tier configuration for Gemini 2.5 Pro models to prevent 429 errors when using the free tier.

Problem

Google recently reduced the free tier input token quota for Gemini 2.5 Pro from 250k to 125k tokens. Roo Code was still attempting to send requests up to the old 250k limit, causing repeated 429 "quota exceeded" errors for free tier users.

Solution

Added a new tier configuration with a 125k context window for the free tier across all Gemini 2.5 Pro models:

  • gemini-2.5-pro
  • gemini-2.5-pro-preview-03-25
  • gemini-2.5-pro-preview-05-06
  • gemini-2.5-pro-preview-06-05

The free tier is configured with:

  • Context window: 125,000 tokens
  • Input price: $0
  • Output price: $0
  • Cache reads price: $0

This ensures that Intelligent Context Condensing will trigger before reaching the 125k limit, preventing 429 errors.
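
As a rough illustration of the configuration described above, here is a minimal sketch. The `ModelTier` and `ModelInfoSketch` shapes, the `FREE_TIER_CONTEXT_WINDOW` name, the `smallestTierWindow` helper, the 1,000,000 full-window value, and the paid-tier prices are all assumptions made for this example and are not taken from gemini.ts.

```typescript
// Hypothetical shapes for illustration only; the real types in gemini.ts may differ.
interface ModelTier {
	contextWindow: number
	inputPrice: number
	outputPrice: number
	cacheReadsPrice: number
}

interface ModelInfoSketch {
	contextWindow: number
	tiers: ModelTier[]
}

// Assumed constant name; the 125k value comes from the PR description.
const FREE_TIER_CONTEXT_WINDOW = 125_000

const gemini25ProSketch: ModelInfoSketch = {
	contextWindow: 1_000_000, // placeholder for the model's full window
	tiers: [
		// Free tier: 125k window, zero cost for input, output, and cache reads.
		{ contextWindow: FREE_TIER_CONTEXT_WINDOW, inputPrice: 0, outputPrice: 0, cacheReadsPrice: 0 },
		// Placeholder paid tier; these prices are made-up values, not real pricing.
		{ contextWindow: Infinity, inputPrice: 1, outputPrice: 2, cacheReadsPrice: 0.5 },
	],
}

// Smallest window across the model and its tiers: the limit that condensing
// would need to stay under for a free-tier user.
function smallestTierWindow(info: ModelInfoSketch): number {
	return Math.min(info.contextWindow, ...info.tiers.map((tier) => tier.contextWindow))
}

console.log(smallestTierWindow(gemini25ProSketch)) // 125000
```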

Testing

  • Added comprehensive tests in src/api/providers/__tests__/gemini-tier-config.spec.ts
  • All tests pass (17 new tests)
  • Existing Gemini provider tests continue to pass
  • Type checking and linting pass

Impact

This fix will prevent 429 errors for users on the Gemini 2.5 Pro free tier and ensure that context condensing triggers appropriately before hitting the quota limit.

Fixes #7753


Important

Adds a free tier with a 125k token limit to Gemini 2.5 Pro models to prevent 429 errors, with tests verifying the configuration.

  • Behavior:
    • Adds a free tier with a 125k token context window to gemini-2.5-pro, gemini-2.5-pro-preview-03-25, gemini-2.5-pro-preview-05-06, and gemini-2.5-pro-preview-06-05 in gemini.ts.
    • Ensures context condensing triggers before reaching the 125k limit to prevent 429 errors.
  • Testing:
    • Adds tests in gemini-tier-config.spec.ts to verify the free tier configuration and tier ordering.
    • Confirms no free tier is added to non-2.5-pro models.

This description was created by Ellipsis for e2e4f00.

- Added a new tier with 125k context window for free tier users
- This prevents 429 errors when using Gemini 2.5 Pro with the free tier
- The free tier has 0 cost for input/output/cache operations
- Added comprehensive tests to verify tier configuration

Fixes #7753

@roomote roomote bot left a comment

Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.

tiers: [
	{
		// Free tier: 125k input tokens per minute quota
		contextWindow: 125_000,

Consider extracting this magic number into a constant like GEMINI_25_PRO_FREE_TIER_LIMIT. The value 125_000 appears 8 times across the codebase, and having it as a constant would make future quota adjustments easier to manage.

Suggested change
- contextWindow: 125_000,
+ // Free tier: 125k input tokens per minute quota
+ contextWindow: GEMINI_25_PRO_FREE_TIER_LIMIT,

// Verify that the free tier limit is correctly set to prevent 429 errors
expect(freeTierLimit).toBe(125_000)
expect(expectedTriggerPoint).toBeLessThan(freeTierLimit)
expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k

For better test maintainability, could we use the calculation directly instead of hardcoding the result?

Suggested change
- expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k
+ expect(expectedTriggerPoint).toBe(125_000 * 0.7) // 70% of 125k

This makes it clearer that we're testing the 70% threshold and easier to update if the percentage changes.
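
One way to derive the trigger point instead of hardcoding it is sketched below. The 125k limit and 70% threshold come from the test under review; the constant names and the `condenseTriggerPoint` helper are hypothetical and only illustrate the arithmetic.

```typescript
// Values from the test under review; the names are assumptions for this sketch.
const FREE_TIER_LIMIT = 125_000
const CONDENSE_THRESHOLD_PERCENT = 70

// Token count at which condensing would start, given a window and a percentage.
// Multiplying before dividing keeps the arithmetic in integers, and Math.floor
// guards against a fractional result for other inputs.
function condenseTriggerPoint(contextWindow: number, thresholdPercent: number): number {
	return Math.floor((contextWindow * thresholdPercent) / 100)
}

const expectedTriggerPoint = condenseTriggerPoint(FREE_TIER_LIMIT, CONDENSE_THRESHOLD_PERCENT)

console.log(expectedTriggerPoint) // 87500
console.log(expectedTriggerPoint < FREE_TIER_LIMIT) // true
```

With the threshold and limit named, changing either value updates the expected trigger point without touching the assertion.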

expect(expectedTriggerPoint).toBeLessThan(freeTierLimit)
expect(expectedTriggerPoint).toBe(87_500) // 70% of 125k
})
})

Would it be valuable to add a test that verifies the tier selection logic in the actual calculateCost method? While we're testing the tier configuration here, we're not directly testing that the correct tier gets selected when processing requests with <125k tokens.

Something like:

it("should select free tier for requests under 125k tokens", () => {
  // Test that calculateCost selects the free tier
  // when input tokens are below 125k
})
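
To make the idea concrete, here is a self-contained sketch of the kind of check such a test could perform. It does not call the real calculateCost method, whose signature is not shown in this thread. The `TierSketch` shape, the `selectTier` helper, the tier names, and the paid-tier price are all assumptions for illustration.

```typescript
// Hypothetical tier shape; the real configuration may carry more fields.
interface TierSketch {
	name: string
	contextWindow: number
	inputPrice: number
}

// Assumed selection rule: pick the first tier whose window can hold the input.
// This relies on the tiers being sorted by ascending contextWindow.
function selectTier(tiers: TierSketch[], inputTokens: number): TierSketch | undefined {
	return tiers.find((tier) => inputTokens <= tier.contextWindow)
}

const tiersSketch: TierSketch[] = [
	{ name: "free", contextWindow: 125_000, inputPrice: 0 },
	// Placeholder paid tier; the price is a made-up value, not real pricing.
	{ name: "paid-placeholder", contextWindow: Infinity, inputPrice: 1 },
]

console.log(selectTier(tiersSketch, 100_000)?.name) // "free"
console.log(selectTier(tiersSketch, 200_000)?.name) // "paid-placeholder"
```

A real test would replace `selectTier` with a call into the provider and assert on the resulting cost, which for the free tier should be zero.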

@hannesrudolph added the "Issue/PR - Triage" label Sep 7, 2025
@daniel-lxs

Closing, see #7753 (comment)

@daniel-lxs daniel-lxs closed this Sep 9, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 9, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 9, 2025

Labels

  • bug: Something isn't working
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Gemini 2.5 Pro free tier quota reduced → Roo Code still sending 250k tokens (429 errors, condensing not triggering)

4 participants