Gemini 2.5 Pro free tier quota reduced → Roo Code still sending 250k tokens (429 errors, condensing not triggering) #7753

@serge402

App Version

3.27.0

API Provider

Google Gemini

Model Used

gemini-2.5-pro

Roo Code Task Links (Optional)

When using Gemini 2.5 Pro with Roo Code, I keep hitting 429 errors from the API. The error looks like this:

Gemini generate context stream error: got status: 429 Too Many Requests. {"error":{"message":"{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            },
            "quotaValue": "125000"
          }
        ]
      }
    ]
  }
}"}
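The limit the API is enforcing can be read directly from the `QuotaFailure` detail in the payload above. The following is a minimal sketch, assuming the inner error object has already been parsed from the string wrapper into JSON. The function name `extractQuotaLimit` and the interfaces are hypothetical and are not taken from Roo Code; they cover only the fields visible in this log.

```typescript
// Hypothetical types covering only the fields seen in the 429 payload above.
interface QuotaViolation {
  quotaMetric?: string;
  quotaId?: string;
  quotaDimensions?: Record<string, string>;
  quotaValue?: string;
}

interface GeminiErrorBody {
  error?: {
    code?: number;
    status?: string;
    details?: Array<{ "@type"?: string; violations?: QuotaViolation[] }>;
  };
}

const QUOTA_FAILURE_TYPE = "type.googleapis.com/google.rpc.QuotaFailure";

// Returns the input-token quota reported by a 429 body, or undefined
// when the body is not a quota failure or carries no usable value.
function extractQuotaLimit(body: GeminiErrorBody): number | undefined {
  const err = body.error;
  if (!err || err.code !== 429) return undefined;
  for (const detail of err.details ?? []) {
    if (detail["@type"] !== QUOTA_FAILURE_TYPE) continue;
    for (const violation of detail.violations ?? []) {
      const value = Number(violation.quotaValue);
      // Assumption: input-token quotas are identified by "InputTokens" in quotaId.
      if (violation.quotaId?.includes("InputTokens") && Number.isFinite(value)) {
        return value;
      }
    }
  }
  return undefined;
}
```

For the payload in this report, the `quotaId` is `GenerateContentInputTokensPerModelPerMinute-FreeTier`, so the limit is a per-minute, per-model input token budget rather than a per-request context size.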

The root cause seems to be that Google reduced the free tier per-minute input token quota for Gemini 2.5 Pro from 250k → 125k, but Roo Code still sends requests as if the old 250k limit applied.

I have Intelligent Context Condensing enabled (which should automatically condense context when reaching the configured threshold), but it doesn't appear to activate early enough for the new 125k quota.

🔁 Steps to Reproduce

1. Use Gemini 2.5 Pro with the free tier in Roo Code.

2. Ensure Intelligent Context Condensing is enabled.

3. Work with a large enough session/context that would normally fit under the old 250k limit.

4. Roo Code sends >125k input tokens → API responds with a 429 error.

💥 Outcome Summary

Expected Behavior:
Roo Code should detect the updated 125k input token quota for Gemini 2.5 Pro free tier and trigger Intelligent Context Condensing earlier to stay under that limit.

Actual Behavior:
Roo Code still sends requests up to the old 250k limit, causing repeated 429 errors.
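To illustrate the expected behavior, the condensing trigger could be capped by the provider's input token quota as well as by the context window percentage. This is only a sketch under stated assumptions: `effectiveTokenLimit`, `shouldCondense`, the option names, and the default 10% safety margin are all hypothetical and do not describe Roo Code's actual settings or implementation.

```typescript
// Hypothetical options; none of these names come from Roo Code.
interface CondenseOptions {
  contextWindow: number;        // model context window, in tokens
  thresholdPercent: number;     // configured condensing threshold, 0-100
  inputQuotaPerMinute?: number; // provider quota, e.g. the quotaValue from a 429
  safetyMargin?: number;        // fraction of the quota left unused (assumed 0.1)
}

// The token count at which condensing should start: the smaller of the
// window-based threshold and the quota minus a safety margin.
function effectiveTokenLimit(opts: CondenseOptions): number {
  const byWindow = Math.floor(opts.contextWindow * (opts.thresholdPercent / 100));
  if (opts.inputQuotaPerMinute === undefined) return byWindow;
  const margin = opts.safetyMargin ?? 0.1;
  const byQuota = Math.floor(opts.inputQuotaPerMinute * (1 - margin));
  return Math.min(byWindow, byQuota);
}

function shouldCondense(currentTokens: number, opts: CondenseOptions): boolean {
  return currentTokens >= effectiveTokenLimit(opts);
}

// Example values only: a large context window with a 75% threshold,
// capped by the 125000 quota reported in the error above.
const opts: CondenseOptions = {
  contextWindow: 1_048_576,
  thresholdPercent: 75,
  inputQuotaPerMinute: 125_000,
};
console.log(shouldCondense(130_000, opts)); // would exceed the quota, so condense first
```

Because the quota is per minute rather than per request, a cap like this only prevents a single oversized request; several smaller requests within the same minute could still exhaust the budget.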

📄 Relevant Logs or Errors (Optional)
