
Conversation


@SatoshiReport SatoshiReport commented Jul 15, 2025

  • Add EMBEDDING_CALL_DELAY_MS constant (100ms) to prevent rate limiting
  • Implement delays after embedding calls in file-watcher.ts
  • Implement delays after embedding calls in scanner.ts
  • Helps prevent HTTP 429 errors during code indexing operations

Related GitHub Issue

Closes: #5713

Roo Code Task Context (Optional)

Description

This PR addresses HTTP 429 rate limiting errors during code indexing operations by implementing strategic delays after embedding API calls. The implementation adds a centralized EMBEDDING_CALL_DELAY_MS constant set to 100ms and applies delays in both file-watcher.ts and scanner.ts after embedding calls are made. This prevents overwhelming the embedding service with rapid successive requests during bulk code indexing operations.

Key implementation details:

  • Added centralized delay constant for consistent timing across components
  • Implemented non-blocking delays using async/await
  • Focused on embedding-specific calls to minimize impact on other operations (a rough sketch of the approach follows below)
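
For illustration, here is a minimal sketch of the approach. Apart from EMBEDDING_CALL_DELAY_MS, the names and signatures below are hypothetical stand-ins and may not match the actual code in file-watcher.ts or scanner.ts:

```typescript
// Hypothetical sketch; only EMBEDDING_CALL_DELAY_MS comes from this PR.

// Would live in the shared constants file (index.ts) per this PR.
export const EMBEDDING_CALL_DELAY_MS = 100

// Resolves after `ms` milliseconds without blocking the event loop.
function delay(ms: number): Promise<void> {
	return new Promise((resolve) => setTimeout(resolve, ms))
}

interface Embedder {
	embed(texts: string[]): Promise<number[][]>
}

// Simplified stand-in for the batch loops in scanner.ts / file-watcher.ts:
// pause briefly after each embedding call so bulk indexing does not issue
// back-to-back requests against the embedding service.
async function processBatches(batches: string[][], embedder: Embedder): Promise<void> {
	for (const batch of batches) {
		await embedder.embed(batch)
		await delay(EMBEDDING_CALL_DELAY_MS)
	}
}
```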

Test Procedure

Manual Testing Steps:

  1. Trigger a bulk code indexing operation (e.g., opening a large project or running full codebase scan)
  2. Monitor network requests to embedding service for rate limiting errors (HTTP 429)
  3. Verify that delays are properly applied after embedding calls
  4. Confirm that indexing operations complete successfully without rate limit errors

Unit Testing:

  • Added tests to verify delay implementation in both file-watcher.ts and scanner.ts (a hypothetical test sketch follows this list)
  • Verified that the EMBEDDING_CALL_DELAY_MS constant is properly utilized
  • Tested that delays don't interfere with normal operation flow
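
As an illustration of this kind of unit test, here is a hypothetical Vitest-style sketch using fake timers against a standalone delay helper; the actual tests exercise file-watcher.ts and scanner.ts directly and may use a different test runner:

```typescript
// Hypothetical test sketch; the constant and helper are local stand-ins
// for the ones added by this PR.
import { describe, it, expect, vi } from "vitest"

const EMBEDDING_CALL_DELAY_MS = 100

function delay(ms: number): Promise<void> {
	return new Promise((resolve) => setTimeout(resolve, ms))
}

describe("embedding call delay", () => {
	it("waits EMBEDDING_CALL_DELAY_MS before resolving", async () => {
		vi.useFakeTimers()
		let resolved = false
		const pending = delay(EMBEDDING_CALL_DELAY_MS).then(() => {
			resolved = true
		})

		// Should still be pending just before the configured delay elapses.
		await vi.advanceTimersByTimeAsync(EMBEDDING_CALL_DELAY_MS - 1)
		expect(resolved).toBe(false)

		// Advancing past the delay lets the promise resolve.
		await vi.advanceTimersByTimeAsync(1)
		await pending
		expect(resolved).toBe(true)

		vi.useRealTimers()
	})
})
```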

Environment:

  • Test with projects containing 100+ files to simulate bulk indexing scenarios
  • Monitor embedding service response times and error rates

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

N/A - This is a backend performance optimization with no UI changes.

Documentation Updates

  • No documentation updates are required.

The changes are internal implementation details that don't affect the public API or user-facing functionality.

Additional Notes

The 100ms delay was chosen as a balance between preventing rate limiting and maintaining reasonable indexing performance. This value can be adjusted in the future based on embedding service performance characteristics and user feedback.

Get in Touch

Discord: satoshireport


Important

Adds a 100ms delay after embedding API calls in file-watcher.ts and scanner.ts to prevent HTTP 429 rate limiting errors.

  • Behavior:
    • Introduces EMBEDDING_CALL_DELAY_MS constant (100ms) to delay embedding API calls, preventing HTTP 429 errors.
    • Implements delay in processFile() in file-watcher.ts and processBatch() in scanner.ts after embedding calls.
  • Constants:
    • Adds EMBEDDING_CALL_DELAY_MS to index.ts for centralized delay configuration.
  • Misc:
    • Ensures non-blocking delays using async/await patterns.

This description was created by Ellipsis for 70fa2e2.

@SatoshiReport SatoshiReport requested review from cte, jr and mrubens as code owners July 15, 2025 03:03
@dosubot dosubot bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) label Jul 15, 2025
@SatoshiReport SatoshiReport changed the title from "feat: add embedding rate limiting to prevent API quota exhaustion" to "feat: add embedding rate limiting to prevent API quota exhaustion For Issue #5714" Jul 15, 2025
@dosubot dosubot bot added the enhancement (New feature or request) label Jul 15, 2025
@SatoshiReport SatoshiReport changed the title from "feat: add embedding rate limiting to prevent API quota exhaustion For Issue #5714" to "feat: add embedding rate limiting to prevent API quota exhaustion For Issue #5713" Jul 15, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Jul 15, 2025

jax-max commented Jul 15, 2025

Can 100ms be made configurable?

@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jul 15, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Jul 15, 2025
@daniel-lxs
Member

Thanks for working on this rate limiting issue. I really appreciate the effort to address the HTTP 429 errors, but I had a few concerns about the current approach that I wanted to bring up.

Right now, the delays are being added at the processor level (file-watcher.ts and scanner.ts), but we already have retry logic built into the embedder implementations. For example, the Gemini embedder uses OpenAICompatibleEmbedder, which already handles 429s with exponential backoff (around lines 301–312 in openai-compatible.ts). Adding another layer of fixed delays on top of that seems redundant and a bit inconsistent with the architecture.

There’s also a performance concern. These delays happen after every embedding call, even when no rate limiting is happening. That means indexing slows down for everyone, not just for users hitting provider limits.

A more robust solution might be to handle this entirely within the embedder, based on the rate limiting info that providers return. For Gemini, that could look like:

  • Catching 429 responses
  • Checking the headers or payload for rate limit details (like RPM or tokens per minute)
  • Using the Retry-After header if it’s present
  • Falling back to exponential backoff based on provider guidance

That way, we only slow down when it's really needed, and it stays consistent with how other embedders deal with this.
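
A purely illustrative sketch of what that embedder-level handling could look like (the real retry logic lives in OpenAICompatibleEmbedder / openai-compatible.ts, and error and header shapes differ by provider and SDK):

```typescript
// Illustrative only: retry a single embedding call on HTTP 429, honoring
// Retry-After when the provider sends it and falling back to exponential backoff.
async function embedWithRateLimitHandling<T>(
	callEmbedder: () => Promise<T>,
	maxRetries = 5,
): Promise<T> {
	for (let attempt = 0; ; attempt++) {
		try {
			return await callEmbedder()
		} catch (error: any) {
			const status = error?.status ?? error?.response?.status
			if (status !== 429 || attempt >= maxRetries) {
				throw error
			}
			// Prefer the provider's Retry-After header (seconds) when present;
			// otherwise use exponential backoff with a little jitter.
			const retryAfterSeconds = Number(error?.response?.headers?.["retry-after"])
			const waitMs = Number.isFinite(retryAfterSeconds)
				? retryAfterSeconds * 1000
				: Math.min(30_000, 1000 * 2 ** attempt) + Math.random() * 250
			await new Promise((resolve) => setTimeout(resolve, waitMs))
		}
	}
}
```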

I'll close this PR but let me know if you have any questions.

@daniel-lxs daniel-lxs closed this Jul 15, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Prelim Review] to Done in Roo Code Roadmap Jul 15, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jul 15, 2025
@SatoshiReport
Author

ok thanks - my test case is actually OpenAI with over 12,000 blocks to index. It stops at 6,600 and then fails, so I don't see the retry logic kicking in.

@daniel-lxs
Member

Hey @SatoshiReport
I understand where you're coming from. I haven’t personally run into this issue, so slowing down indexing for everyone doesn't seem like the best idea.

I think we can improve the rate limiting behavior for each embedder individually so the implementation doesn't affect indexing speed unnecessarily for users who aren't hitting limits.




Development

Successfully merging this pull request may close these issues.

Gemini-embedding-001 not respecting free quota limits
