fix(usage-metrics): reduce rate-limit cascade errors #3622

Merged
mdelapenya merged 2 commits into testcontainers:main from mdelapenya:fix-usage-metrics on Apr 1, 2026

@mdelapenya mdelapenya commented Apr 1, 2026

What does this PR do?

When a GitHub Code Search API call fails with a rate-limit error (403/429) within a pass, the next query now waits 65 seconds instead of the default 7 seconds, giving the rolling rate-limit window time to reset before continuing.

Extracts a dedicated isRateLimitError helper to distinguish rate-limit errors from other transient HTTP errors (500/502/503), so only actual rate-limit hits trigger the longer cooldown.

Why is it important?

This reduces rate-limit cascades in the usage-metrics collection workflow. Previously, after a 429, the next query fired just 7 seconds later, still inside the rate-limit window, so it failed as well. With the in-pass cooldown, the collector can recover mid-pass instead of deferring all remaining failures to the next pass.

Related issues

@mdelapenya mdelapenya requested a review from a team as a code owner April 1, 2026 12:43

netlify bot commented Apr 1, 2026

Deploy Preview for testcontainers-go ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 34378aa |
| 🔍 Latest deploy log | https://app.netlify.com/projects/testcontainers-go/deploys/69cd13f9765db70008bf0194 |
| 😎 Deploy Preview | https://deploy-preview-3622--testcontainers-go.netlify.app |

coderabbitai bot commented Apr 1, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8c24e566-5624-4944-9fdc-d54c8e6f25eb

📥 Commits

Reviewing files that changed from the base of the PR and between 085f9ba and 34378aa.

📒 Files selected for processing (1)
  • usage-metrics/collect-metrics.go

Summary by CodeRabbit

  • Bug Fixes
    • Improved rate limiting error detection and recovery. The system now intelligently distinguishes between rate-limit-related errors and transient HTTP errors, with tailored handling strategies for each scenario to optimize recovery time and request flow.
    • Enhanced inter-request delay management during transient failures for improved reliability and request handling.

Walkthrough

Introduced rate-limit-specific detection and cooldown in the metrics collector: added rateLimitCooldown and a per-pass rateLimitHit flag, created isRateLimitError(err error) bool, and refactored isRetryableError(err error) bool to delegate rate-limit checks and retain transient HTTP retry logic (500/502/503).

Changes

| Cohort / File(s) | Summary |
|---|---|
| Rate-limit Handling: usage-metrics/collect-metrics.go | Added rateLimitCooldown (65s) and a per-pass rateLimitHit; introduced isRateLimitError(err error) bool; refactored isRetryableError(err error) bool to delegate rate-limit detection and limit retryable HTTP statuses to 500/502/503; adjusted per-pass inter-request sleep to use cooldown when rate-limited. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐇 I sniff the logs and find a pause,
A sixty-five second, gentle cause.
When 429s hop into view,
I wait, then try the queue anew.
Hooray for measured, patient runs!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: implementing rate-limit error handling to reduce cascading rate-limit failures in usage-metrics collection. |
| Description check | ✅ Passed | The PR description clearly explains the changes made: implementing a 65-second cooldown after rate-limit errors and extracting a dedicated rate-limit error helper function. It directly relates to the changeset. |

@mdelapenya mdelapenya changed the title from "fix(usage-metrics): reduce rate-limit cascades" to "fix(usage-metrics): reduce rate-limit cascade errors" Apr 1, 2026
@mdelapenya mdelapenya self-assigned this Apr 1, 2026
@mdelapenya mdelapenya added the chore Changes that do not impact the existing functionality label Apr 1, 2026
@mdelapenya mdelapenya merged commit a34a6c9 into testcontainers:main Apr 1, 2026
14 checks passed
@mdelapenya mdelapenya deleted the fix-usage-metrics branch April 1, 2026 12:48