@roomote
Contributor

@roomote roomote bot commented Sep 16, 2025

Summary

This PR addresses Issue #8012 by implementing proper handling of Google Gemini's rate limiting responses, including honoring the provider's suggested retry delays and distinguishing between temporary rate limits and quota exhaustion.

Changes

Enhanced Error Handling

  • Added GeminiError class to preserve structured error details from Gemini API
  • Enhanced error transformation in both streaming and non-streaming methods
  • Properly parse RetryInfo and QuotaFailure from error responses

Improved Retry Logic

  • Honor Gemini's retryDelay when present (e.g., "59s")
  • Add a 2-second buffer to provider-suggested delays for safety
  • Distinguish between temporary rate limiting and quota exhaustion
  • Stop retrying when daily/monthly quotas are exhausted
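A minimal sketch of how a provider-suggested delay such as "59s" might be turned into a wait time with a safety buffer. The function name `retryDelayToWaitSeconds` and its `bufferSeconds` parameter are hypothetical and not taken from this PR; the accepted string shape assumes the protobuf Duration JSON form (decimal seconds followed by `s`).

```typescript
// Hypothetical helper; the name and the bufferSeconds parameter are illustrative only.
// Converts a Duration-style string such as "59s" into whole seconds to wait, plus a buffer.
function retryDelayToWaitSeconds(retryDelay: string, bufferSeconds = 2): number | undefined {
	// Accept "59s" as well as fractional values like "1.5s".
	const match = /^(\d+(?:\.\d+)?)s$/.exec(retryDelay.trim())
	if (!match) {
		return undefined // Unrecognized format: the caller can fall back to exponential backoff.
	}
	// Round up so the wait is never shorter than the provider asked for.
	return Math.ceil(Number(match[1])) + bufferSeconds
}

console.log(retryDelayToWaitSeconds("59s")) // 61
console.log(retryDelayToWaitSeconds("0s")) // 2
console.log(retryDelayToWaitSeconds("soon")) // undefined
```

Exposing the buffer as a parameter with a default of 2 keeps the behavior described above while leaving room to tune it later.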

Better User Experience

  • Clear messaging for rate limit scenarios with countdown timers
  • Different messages for temporary limits vs quota exhaustion
  • Link to Gemini rate limit documentation for context
  • Non-intrusive retry behavior that avoids error popup spam

Testing

  • Added comprehensive test suite for rate limit handling
  • Tests cover RetryInfo parsing, QuotaFailure detection, and error transformation
  • All existing tests pass without regression

Acceptance Criteria Met

✅ Given a 429 with RetryInfo.retryDelay, the app waits the specified time (+ buffer) before retrying
✅ Given a 429 indicating quota exhaustion, the app does not retry and shows a clear message
✅ Given a 429 without RetryInfo, the app uses exponential backoff
✅ User-facing messages are concise and non-technical with links to documentation
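The first three criteria can be read as a single decision function. The sketch below is an illustration only: `RateLimitInfo`, `RetryDecision`, `decideRetry`, and the backoff constants (5-second base, 600-second cap) are assumptions, not names or values from this PR.

```typescript
// Hypothetical types and function; names and backoff constants are assumptions.
interface RateLimitInfo {
	retryDelaySeconds?: number // parsed from RetryInfo.retryDelay when present
	quotaExhausted: boolean // true when the 429 signals daily/monthly quota exhaustion
}

type RetryDecision = { action: "wait"; seconds: number } | { action: "fail"; reason: string }

function decideRetry(
	info: RateLimitInfo,
	attempt: number, // 0 for the first retry
	bufferSeconds = 2,
	baseSeconds = 5,
	maxSeconds = 600,
): RetryDecision {
	// Quota exhaustion: do not retry, surface a clear message instead.
	if (info.quotaExhausted) {
		return { action: "fail", reason: "quota exhausted" }
	}
	// Provider-suggested delay: honor it and add the buffer.
	if (info.retryDelaySeconds !== undefined) {
		return { action: "wait", seconds: info.retryDelaySeconds + bufferSeconds }
	}
	// No RetryInfo: fall back to capped exponential backoff.
	return { action: "wait", seconds: Math.min(baseSeconds * 2 ** attempt, maxSeconds) }
}

console.log(decideRetry({ retryDelaySeconds: 59, quotaExhausted: false }, 0)) // wait 61 seconds
console.log(decideRetry({ quotaExhausted: false }, 3)) // wait 40 seconds
console.log(decideRetry({ quotaExhausted: true }, 0)) // fail
```

The sketch assumes the caller has already decided whether the response counts as quota exhaustion; how that flag is derived is a separate question.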

Testing

  • Run new tests: cd src && npx vitest run api/providers/__tests__/gemini-rate-limit.spec.ts
  • Run existing Gemini tests: cd src && npx vitest run api/providers/__tests__/gemini.spec.ts
  • All tests pass ✅

Fixes #8012


Important

Improves Google Gemini rate limit handling by introducing structured error handling, enhanced retry logic, and clear user messaging, with comprehensive testing.

  • Error Handling:
    • Introduces GeminiError class in gemini.ts to handle structured error details from Gemini API.
    • Enhances error transformation in createMessage and completePrompt methods in gemini.ts.
    • Parses RetryInfo and QuotaFailure from error responses.
  • Retry Logic:
    • Implements retry logic in Task.ts to honor retryDelay from Gemini API, adding a 2-second buffer.
    • Differentiates between temporary rate limits and quota exhaustion.
    • Stops retrying when daily/monthly quotas are exhausted.
  • User Experience:
    • Provides clear messaging for rate limit scenarios with countdown timers in Task.ts.
    • Links to Gemini rate limit documentation for context.
  • Testing:
    • Adds gemini-rate-limit.spec.ts for comprehensive testing of rate limit handling.
    • Tests cover RetryInfo parsing, QuotaFailure detection, and error transformation.
  • Misc:
    • All existing tests pass without regression.

This description was created by Ellipsis for 125fa83.

- Enhanced GeminiHandler to properly parse 429 errors with RetryInfo and QuotaFailure
- Added GeminiError class to preserve structured error details
- Updated retry logic to respect provider-suggested delays with 2-second buffer
- Added distinction between temporary rate limiting and quota exhaustion
- Improved user messaging for rate limit scenarios
- Added comprehensive tests for rate limit handling

Fixes #8012
@roomote roomote bot requested review from cte, jr and mrubens as code owners September 16, 2025 03:22
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels Sep 16, 2025
// Don't retry for quota exhaustion - show clear message and fail
await this.say(
"error",
`Gemini API quota exhausted. ${quotaFailure.violations?.[0]?.description || "Your daily or monthly quota has been exceeded."}\n\nPlease check your quota limits at: https://ai.google.dev/gemini-api/docs/rate-limits`,
Contributor

Consider wrapping user-facing strings (e.g. the quota exhaustion message) in a translation function (t()) for consistency with internationalization practices.

This comment was generated because it violated a code review rule: irule_C0ez7Rji6ANcGkkX.

Contributor Author

@roomote roomote bot left a comment

Reviewing my own code because apparently I trust no one, not even myself.

  if (match) {
- 	exponentialDelay = Number(match[1]) + 1
+ 	// Add a small buffer (1-2 seconds) as recommended
+ 	exponentialDelay = Number(match[1]) + 2
Contributor Author

Is the 2-second buffer intentional here? The issue mentions "1-2 seconds" but we're using a fixed 2-second buffer. Would it be worth making this configurable?

if (quotaFailure && !geminiRetryDetails?.retryDelay) {
// Check if the error message indicates daily/monthly quota exhaustion
const isQuotaExhausted =
error.message?.toLowerCase().includes("quota") &&
Contributor Author

I notice similar quota exhaustion checking logic here and at lines 2816-2819. Could we extract this into a helper method to reduce duplication?
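One possible shape for such a helper is sketched below. The hunk above is truncated after the "quota" check, so the rest of the condition is an assumption: the keyword list ("daily", "monthly", "exceeded") comes from a later review comment in this thread, and the name `isQuotaExhaustedMessage` is hypothetical.

```typescript
// Hypothetical helper; the exact condition in the PR is truncated in the hunk above,
// so the keyword list and the way it is combined are assumptions.
const QUOTA_KEYWORDS = ["daily", "monthly", "exceeded"]

function isQuotaExhaustedMessage(message: string | undefined): boolean {
	const text = message?.toLowerCase() ?? ""
	// Require the word "quota" plus at least one keyword hinting at exhaustion.
	return text.includes("quota") && QUOTA_KEYWORDS.some((keyword) => text.includes(keyword))
}

console.log(isQuotaExhaustedMessage("Daily quota exceeded for this project")) // true
console.log(isQuotaExhaustedMessage("Too many requests, slow down")) // false
```

Both call sites could then use the same function, so any change to the heuristic happens in one place.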

{ input: "59s", expected: 59 },
{ input: "120s", expected: 120 },
{ input: "1s", expected: 1 },
{ input: "0s", expected: 0 },
Contributor Author

Great test coverage! Could we add a test case for when retryDelay is "0s" to ensure the buffer is still applied correctly in edge cases?

errorDetails?: Array<GeminiRetryInfo | GeminiQuotaFailure | any>
}

export class GeminiError extends Error {
Contributor Author

Consider adding JSDoc comments to document the purpose and structure of this error class. It would help future maintainers understand when and how to use it.
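As an illustration of what that documentation could look like, here is a sketch of a JSDoc-annotated error class. Only the class name and the `errorDetails` field appear in the hunk above; the `status` field, the constructor signature, and the shapes of `GeminiRetryInfo` and `GeminiQuotaFailure` are assumptions modeled on the `google.rpc` error detail types.

```typescript
// Assumed shapes, modeled on google.rpc.RetryInfo and google.rpc.QuotaFailure.
interface GeminiRetryInfo {
	"@type": "type.googleapis.com/google.rpc.RetryInfo"
	retryDelay?: string // e.g. "59s"
}

interface GeminiQuotaFailure {
	"@type": "type.googleapis.com/google.rpc.QuotaFailure"
	violations?: Array<{ subject?: string; description?: string }>
}

/**
 * Error raised when a Gemini API call fails.
 *
 * Keeps the structured details from the API response so that retry logic can
 * inspect RetryInfo and QuotaFailure instead of parsing the message text.
 */
class GeminiError extends Error {
	/** HTTP status code of the failed request, if known (assumed field). */
	readonly status?: number
	/** Structured error details as returned by the API. */
	readonly errorDetails?: Array<GeminiRetryInfo | GeminiQuotaFailure>

	constructor(message: string, status?: number, errorDetails?: Array<GeminiRetryInfo | GeminiQuotaFailure>) {
		super(message)
		this.name = "GeminiError"
		this.status = status
		this.errorDetails = errorDetails
		// Keeps instanceof working when compiled to older targets.
		Object.setPrototypeOf(this, new.target.prototype)
	}
}

const example = new GeminiError("Rate limited", 429, [
	{ "@type": "type.googleapis.com/google.rpc.RetryInfo", retryDelay: "59s" },
])
console.log(example instanceof GeminiError, example.status) // true 429
```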

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Sep 16, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Sep 16, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Sep 16, 2025
@daniel-lxs
Member

Quota exhaustion detection may be unreliable. Logic relies on error.message substrings (“daily”/“monthly”/“exceeded”), which risks misclassification and unnecessary retries. Prefer structured signal: treat presence of QuotaFailure without RetryInfo as exhaustion.
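A sketch of the structured check suggested here, assuming the error details carry the standard `google.rpc` `@type` URLs. The function name `classifyRateLimit`, the `ErrorDetail` type, and the three result labels are hypothetical.

```typescript
// Hypothetical classifier; relies only on structured details, never on message text.
const RETRY_INFO_TYPE = "type.googleapis.com/google.rpc.RetryInfo"
const QUOTA_FAILURE_TYPE = "type.googleapis.com/google.rpc.QuotaFailure"

type ErrorDetail = { "@type"?: string; [key: string]: unknown }

function classifyRateLimit(details: ErrorDetail[]): "temporary" | "quota_exhausted" | "unknown" {
	const hasRetryInfo = details.some(
		(detail) => detail["@type"] === RETRY_INFO_TYPE && typeof detail["retryDelay"] === "string",
	)
	const hasQuotaFailure = details.some((detail) => detail["@type"] === QUOTA_FAILURE_TYPE)

	// A suggested retry delay means the limit is expected to clear.
	if (hasRetryInfo) return "temporary"
	// QuotaFailure without RetryInfo is treated as exhaustion.
	if (hasQuotaFailure) return "quota_exhausted"
	// Neither signal: leave the decision to the generic backoff path.
	return "unknown"
}

console.log(classifyRateLimit([{ "@type": QUOTA_FAILURE_TYPE }])) // quota_exhausted
console.log(classifyRateLimit([{ "@type": QUOTA_FAILURE_TYPE }, { "@type": RETRY_INFO_TYPE, retryDelay: "59s" }])) // temporary
```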

@daniel-lxs daniel-lxs closed this Sep 22, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Prelim Review] to Done in Roo Code Roadmap Sep 22, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 22, 2025
Labels

enhancement (New feature or request), PR - Needs Preliminary Review, size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Honor Gemini retryDelay; clarify rate-limit vs quota

4 participants