Description
What specific problem does this solve?
I’ve been using Roo Code with the Gemini 2.5 Pro free tier and consistently run into 429 errors. After reviewing the docs and experimenting with API configuration profiles and condensing settings, I found:
- Setting a 60s rate limit in API configuration profiles does not help. A single request can still exceed the free-tier cap.
- Lowering the Intelligent Context Condensing threshold to 20% also does not enforce the 125k input token maximum.
- As a result, there is no way to constrain requests to stay under Google's new 125k input token quota.
This means the Gemini 2.5 Pro free tier is currently unusable with Roo Code: requests routinely exceed the quota and fail immediately with 429 errors.
Compounding the problem, other "free model" options (like Grok Fast 1) expired as of Sept 10. That leaves no practical free alternative, so fixing this is important for anyone relying on the Gemini free tier.
Additional context (optional)
Steps to Reproduce:
- Use Gemini 2.5 Pro free tier with Roo Code.
- Create a session large enough to exceed 125k tokens.
- Set a rate limit of 60s in API configuration profiles.
- Lower Intelligent Context Condensing threshold to 20%.
- Run a generation; the API still responds with 429.
Expected Behavior:
There should be a way to configure or automatically enforce the 125k per-request input token ceiling for Gemini free tier, so that single requests do not exceed Google’s quota.
Actual Behavior:
Neither rate limiting nor condensing prevents requests from exceeding 125k tokens. Free-tier Gemini 2.5 Pro is effectively unusable in Roo Code.
Suggestion:
- Add a configurable "max input tokens per request" parameter in API profiles.
- Ideally, combine this with rate limiting to handle both per-request and per-minute quota rules.
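A minimal sketch of what such a per-request cap might look like. All names here (`Message`, `estimateTokens`, `capInputTokens`) are hypothetical illustrations, not Roo Code's actual API, and the 4-chars-per-token heuristic stands in for the provider's real tokenizer:

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Rough heuristic: ~4 characters per token. A real implementation would
// call the provider's token-counting endpoint instead.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Drop the oldest non-system messages until the estimated total fits
// under the configured ceiling (e.g. 125_000 for Gemini free tier).
function capInputTokens(messages: Message[], maxTokens: number): Message[] {
  const kept = [...messages];
  let total = kept.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total > maxTokens) {
    const idx = kept.findIndex((m) => m.role !== "system");
    if (idx === -1) break; // only the system prompt left; nothing to trim
    total -= estimateTokens(kept[idx].content);
    kept.splice(idx, 1);
  }
  return kept;
}
```

Trimming oldest-first is just one policy; triggering a condensing pass when the cap would be exceeded would serve the same purpose.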
Roo Code Task Links (Optional)
No response
Request checklist
- I've searched existing Issues and Discussions for duplicates
- This describes a specific problem with clear impact and context
Interested in implementing this?
- Yes, I'd like to help implement this feature
Implementation requirements
- I understand this needs approval before implementation begins
How should this be solved? (REQUIRED if contributing, optional otherwise)
Suggestion:
- Add a configurable "max input tokens per request" parameter in API profiles.
- Ideally, combine this with rate limiting to handle both per-request and per-minute quota rules.
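The combined check could be a single gate consulted before every dispatch. This is a hedged sketch under assumed names (`RequestGate` and its parameters are illustrative, not an existing Roo Code class): it rejects requests over the per-request ceiling outright and spaces the rest to respect the per-minute rule:

```typescript
// Gates outgoing requests against both quota dimensions:
// a hard per-request input-token ceiling and a minimum interval
// between requests (the existing 60s rate-limit setting).
class RequestGate {
  private lastRequestAt = -Infinity;

  constructor(
    private maxInputTokens: number, // e.g. 125_000 for Gemini free tier
    private minIntervalMs: number,  // e.g. 60_000 for a 60s rate limit
  ) {}

  // Returns the delay (ms) to wait before sending, or throws if the
  // request can never fit the per-request quota and must be condensed.
  check(estimatedTokens: number, nowMs: number): number {
    if (estimatedTokens > this.maxInputTokens) {
      throw new Error(
        `~${estimatedTokens} input tokens exceeds the ` +
          `${this.maxInputTokens} per-request cap; condense first`,
      );
    }
    const wait = Math.max(0, this.lastRequestAt + this.minIntervalMs - nowMs);
    this.lastRequestAt = nowMs + wait; // reserve the send slot
    return wait;
  }
}
```

The key difference from today's behavior is the throw: an over-cap request never reaches the API, so it can trigger condensing instead of burning a quota attempt on a guaranteed 429.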
How will we know it works? (Acceptance Criteria - REQUIRED if contributing, optional otherwise)
No response
Technical considerations (REQUIRED if contributing, optional otherwise)
No response
Trade-offs and risks (REQUIRED if contributing, optional otherwise)
No response