feat: add embedding rate limiting to prevent API quota exhaustion For Issue #5713 #5715
Conversation
- Add EMBEDDING_CALL_DELAY_MS constant (100ms) to prevent rate limiting
- Implement delays after embedding calls in file-watcher.ts
- Implement delays after embedding calls in scanner.ts
- Helps prevent HTTP 429 errors during code indexing operations

Can 100ms be made configurable?

Thanks for working on this rate limiting issue. I really appreciate the effort to address the HTTP 429 errors, but I had a few concerns about the current approach that I wanted to bring up.

Right now, the delays are being added at the processor level (

There’s also a performance concern. These delays happen after every embedding call, even when no rate limiting is happening. That means indexing slows down for everyone, not just for users hitting provider limits.

A more robust solution might be to handle this entirely within the embedder, based on the rate limiting info that providers return. For Gemini, that could look like:

That way, we only slow down when it's really needed, and it stays consistent with how other embedders deal with this. I'll close this PR but let me know if you have any questions.
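
As a rough illustration of handling rate limits inside the embedder, here is a minimal TypeScript sketch. It is not the snippet from the review comment and not the project's actual implementation. `embedWithBackoff`, `RateLimitError`, `retryAfterSeconds`, and the retry constants are hypothetical names and values chosen for the example, and the sketch assumes the provider client exposes the HTTP status on the thrown error.

```typescript
// Hypothetical error shape: assumes the provider client surfaces the HTTP
// status and, optionally, a retry hint in seconds.
interface RateLimitError extends Error {
  status?: number;
  retryAfterSeconds?: number;
}

const MAX_RETRIES = 3; // assumed value, for illustration only
const INITIAL_DELAY_MS = 500; // assumed value, for illustration only

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Wraps one embedding request and waits only when the provider answers 429.
// Successful calls and non-429 errors incur no extra delay.
async function embedWithBackoff<T>(
  request: () => Promise<T>,
  maxRetries: number = MAX_RETRIES,
  initialDelayMs: number = INITIAL_DELAY_MS,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      const error = err as RateLimitError;
      const isRateLimit = error?.status === 429;
      // Give up on other errors, or once the retry budget is spent.
      if (!isRateLimit || attempt >= maxRetries) throw err;
      // Prefer the provider's hint; otherwise back off exponentially.
      const delayMs =
        error.retryAfterSeconds !== undefined
          ? error.retryAfterSeconds * 1000
          : initialDelayMs * 2 ** attempt;
      await sleep(delayMs);
    }
  }
}
```

With a wrapper like this, users who never hit a provider limit pay no indexing penalty, which is the trade-off the review comment is pointing at.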

ok thanks - my test case is actually OpenAI with over 12,000 blocks to index. Indexing stops at 6,600 and then fails, so I don't see the retry logic kicking in.

Hey @SatoshiReport, I think we can improve the rate limiting behavior for each embedder individually so the implementation doesn't affect indexing speed unnecessarily for users who aren't hitting limits.
Related GitHub Issue
Closes: #5713
Roo Code Task Context (Optional)
Description
This PR addresses HTTP 429 rate limiting errors during code indexing operations by implementing strategic delays after embedding API calls. The implementation adds a configurable EMBEDDING_CALL_DELAY_MS constant set to 100ms and applies delays in both file-watcher.ts and scanner.ts after embedding calls are made. This prevents overwhelming the embedding service with rapid successive requests during bulk code indexing operations.
Key implementation details:
Test Procedure
Manual Testing Steps:
Unit Testing:
Environment:
Pre-Submission Checklist
Screenshots / Videos
N/A - This is a backend performance optimization with no UI changes.
Documentation Updates
The changes are internal implementation details that don't affect the public API or user-facing functionality.
Additional Notes
The 100ms delay was chosen as a balance between preventing rate limiting and maintaining reasonable indexing performance. This value can be adjusted in the future based on embedding service performance characteristics and user feedback.
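
The fixed-delay approach described in this PR can be sketched roughly as follows. Only the `EMBEDDING_CALL_DELAY_MS` name and its 100ms value come from the description. `delay`, `EmbedFn`, and `embedBatchesWithDelay` are hypothetical stand-ins, not the actual code in scanner.ts or file-watcher.ts, and the `delayMs` parameter is an assumption showing one way the value could be made adjustable.

```typescript
// Name and value taken from the PR description; everything else is illustrative.
const EMBEDDING_CALL_DELAY_MS = 100;

const delay = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for the embedder call made during indexing.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Embeds each batch in order and pauses after every call.
async function embedBatchesWithDelay(
  batches: string[][],
  embed: EmbedFn,
  delayMs: number = EMBEDDING_CALL_DELAY_MS,
): Promise<number[][]> {
  const embeddings: number[][] = [];
  for (const batch of batches) {
    embeddings.push(...(await embed(batch)));
    // Unconditional pause: it runs whether or not the provider is rate limiting.
    await delay(delayMs);
  }
  return embeddings;
}
```

Because the pause is unconditional, the added time grows linearly with the number of embedding calls: 1,000 calls at 100ms each add 100 seconds of waiting, even if no request would have been rejected.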
Get in Touch
Discord: satoshireport
Important
Adds a 100ms delay after embedding API calls in file-watcher.ts and scanner.ts to prevent HTTP 429 rate limiting errors.
- EMBEDDING_CALL_DELAY_MS constant (100ms) to delay embedding API calls, preventing HTTP 429 errors.
- processFile() in file-watcher.ts and processBatch() in scanner.ts after embedding calls.
- EMBEDDING_CALL_DELAY_MS to index.ts for centralized delay configuration.
- async/await patterns.
This description was created for 70fa2e2.