Exponential delay on retry should be optional/removed #850

@akmaldju

Description

Which version of the app are you using?

v3.3.14

Which API Provider are you using?

Google Gemini

Which Model are you using?

gemini-2.0-flash

What happened?

I'm not sure why exponentialDelay was introduced for API request retries, but it gets out of control with Gemini models. The Gemini API fails requests in two cases: either the user has exceeded their per-minute quota, or an uncontrollable shared quota kicks in. The shared quota seems to be applied at random to everyone when the API is "busy" and is lifted again within a few seconds. When that happens, the retry delay grows from 5 to 40-80 seconds, and the app ends up waiting for over a minute without even trying to call the API. I don't think any API requires this kind of cooldown period anyway. This feature should either be optional or removed, imho, or at least be capped at around 30-45 seconds before the next retry.
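
To make the requested behavior concrete, here is a minimal TypeScript sketch of an optional, capped retry delay. It assumes the delay doubles on each retry starting from 5 seconds, which matches the 5 to 40-80 second growth described above but has not been checked against the app's code. `retryDelaySec`, `RetryDelayOptions`, and its fields are hypothetical names used for illustration, not the app's actual API.

```typescript
// Hypothetical sketch: these names are illustrative, not the app's real code.
interface RetryDelayOptions {
  baseDelaySec: number; // delay before the first retry (5 s in the report above)
  maxDelaySec: number;  // upper bound on any single delay (30-45 s suggested)
  exponential: boolean; // false disables the growth and keeps a constant delay
}

// Returns the wait time in seconds before retry number `attempt` (0-based).
// Assumes doubling per attempt; the real growth factor may differ.
function retryDelaySec(attempt: number, opts: RetryDelayOptions): number {
  if (!opts.exponential) {
    return opts.baseDelaySec;
  }
  const uncapped = opts.baseDelaySec * Math.pow(2, attempt);
  return Math.min(uncapped, opts.maxDelaySec);
}

const capped: RetryDelayOptions = { baseDelaySec: 5, maxDelaySec: 30, exponential: true };
const delays: number[] = [0, 1, 2, 3, 4].map((attempt) => retryDelaySec(attempt, capped));
console.log(delays.join(", ")); // 5, 10, 20, 30, 30
```

With the cap at 30 seconds, the fourth and fifth retries wait 30 seconds each instead of 40 and 80, and setting `exponential: false` gives a flat 5-second retry interval.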

Steps to reproduce

  1. Use free-tier Gemini models from Google AI Studio
  2. Try to develop features for a while until it happens

Relevant API REQUEST output

Additional context

No response

Metadata

    Labels

    Issue - Unassigned / Actionable: Clear and approved. Available for contributors to pick up.
    bug: Something isn't working
