-
Notifications
You must be signed in to change notification settings - Fork 2.6k
feat: implement dynamic retry timing for Google Gemini API #6682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Enhanced error handling in GeminiHandler to preserve original error structure - Added logic to distinguish between rate limits and quota exhaustion - Preserved retry delay information from Gemini API responses - Added new localized error messages for better user feedback - Added comprehensive tests for the new retry logic Fixes #6680
| } | ||
| } | ||
| } catch (error) { | ||
| } catch (error: any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling logic for HTTP 429 is duplicated in both the createMessage and completePrompt methods. Consider extracting this logic into a shared helper function to reduce duplication and improve maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this code and even I'm not sure why it works.
| } catch (error) { | ||
| } catch (error: any) { | ||
| // Preserve the original error structure for retry logic | ||
| if (error.status === 429) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error handling logic is duplicated between createMessage() and completePrompt(). Could we extract this into a private method like handleGeminiError(error: any): Error to follow DRY principles?
| if (error.status === 429) { | |
| private handleGeminiError(error: any): Error { | |
| // Preserve the original error structure for retry logic | |
| if (error.status === 429) { | |
| // Check if this is a rate limit or quota exhaustion | |
| const errorMessage = error.message || "" | |
| const isQuotaExhausted = errorMessage.includes("quota") || errorMessage.includes("billing") | |
| // Create an enhanced error that preserves the original structure | |
| const enhancedError = new Error( | |
| isQuotaExhausted | |
| ? t("common:errors.gemini.quota_exhausted", { error: errorMessage }) | |
| : t("common:errors.gemini.rate_limit", { error: errorMessage }), | |
| ) | |
| // Preserve ALL original error properties for retry logic | |
| Object.assign(enhancedError, error) | |
| throw enhancedError | |
| } | |
| if (error instanceof Error) { | |
| throw new Error(t("common:errors.gemini.generate_stream", { error: error.message })) | |
| } | |
| throw error | |
| } |
| // Preserve the original error properties for retry logic | ||
| Object.assign(enhancedError, { | ||
| status: error.status, | ||
| errorDetails: error.errorDetails, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? The enhanced error only preserves status, errorDetails, and message. What if the original error has other properties that the retry logic depends on? Consider using Object.assign(enhancedError, error) to preserve all properties.
| "generate_complete_prompt": "Gemini completion error: {{error}}", | ||
| "sources": "Sources:" | ||
| "sources": "Sources:", | ||
| "rate_limit": "Gemini API rate limit exceeded. The system will automatically retry with the appropriate delay.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new error messages are only added to the English locale. Could we add placeholder translations to all other locale files (ca, de, es, fr, etc.) to maintain consistency? Even if they're in English temporarily, it prevents runtime errors for non-English users.
| // Should throw before yielding any chunks | ||
| } | ||
| } catch (error: any) { | ||
| expect(error.status).toBe(429) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test verifies that error properties are preserved, but it doesn't check if the enhanced error message is correctly set. Could we add an assertion like:
| expect(error.status).toBe(429) | |
| expect(error.message).toContain("Gemini API rate limit exceeded") |
|
This doesn't seem quite right, closing for now until we scope the issue correctly |
This PR implements dynamic retry timing for the Google Gemini API based on the retry delay information returned in API responses. This addresses issue #6680 where users were experiencing unnecessary rate limit errors when the API provides specific retry timing information.
Changes:
Fixes #6680
Important
Implements dynamic retry timing for Google Gemini API in
GeminiHandler, enhancing error handling and adding tests for rate limit and quota exhaustion scenarios.GeminiHandlerfor Google Gemini API using retry delay from API responses.common.jsonfor rate limit and quota exhaustion.GeminiHandlerto preserve error details.gemini.spec.tsfor rate limit and quota exhaustion scenarios.This description was created by
for ea054e8. You can customize this summary. It will automatically update as commits are pushed.