Improve mirror resilience, selection, and reduce boilerplate #37

Open
cmyui wants to merge 1 commit into master from worktree-mirror-improvements


@cmyui (Member) commented Mar 8, 2026

Summary

  • Retry transient errors: _fetch() retries once, after a 1s backoff, on HTTP 429/500/502/503 and on network exceptions
  • Rate limit header awareness: reads x-ratelimit-remaining from upstream responses and drains the token bucket when remaining < 10, preventing rate limit trips
  • Exponential backoff on circuit breaker: cooldown doubles on each failed probe (30s → 60s → 120s → ... capped at 10min) and resets on success. This stops broken mirrors (e.g. DNS failures) from wasting hedging slots every 30s
  • Failure-aware mirror ordering: adds failure_ema to MirrorHealth and a score() method that penalizes unreliable mirrors (latency * (1 + 3 * failure_rate)), so a fast-but-flaky mirror ranks below a slower reliable one
  • Async wait in fallback path: wait_for_availability() briefly sleeps (up to 2s) for rate limiter token refill instead of skipping healthy-but-throttled mirrors
  • Deduplicate mirror backends: extracts a common _fetch() base method plus an _extra_headers() hook, removing ~220 lines of duplicated try/except/404/451 handling across Mino, OsuDirect, and Nerinyan
  • Response size limits: rejects responses over 100MB via Content-Length check before buffering
  • Remove dead code: get_mirror_weight, BeatmapMirrorScore, MIRROR_INITIAL_WEIGHT were never called
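
To illustrate the failure-aware ordering item, here is a minimal sketch of how an EMA-based failure rate and the `latency * (1 + 3 * failure_rate)` score could fit together. Only `MirrorHealth`, `failure_ema`, `score()` and the formula come from this PR; the `latency_ema` field, the `record()` method, the `alpha` smoothing factor and the sample numbers are assumptions for illustration, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class MirrorHealth:
    """Hypothetical stand-in for the PR's MirrorHealth; field names are assumed."""

    name: str
    latency_ema: float = 0.0  # smoothed latency in seconds (assumed field)
    failure_ema: float = 0.0  # smoothed failure rate in [0, 1]

    def record(self, latency: float, ok: bool, alpha: float = 0.2) -> None:
        # Exponential moving average: new = alpha * sample + (1 - alpha) * old.
        # alpha = 0.2 is an assumed smoothing factor.
        self.latency_ema = alpha * latency + (1 - alpha) * self.latency_ema
        sample = 0.0 if ok else 1.0
        self.failure_ema = alpha * sample + (1 - alpha) * self.failure_ema

    def score(self) -> float:
        # Formula from the summary; lower scores rank first.
        return self.latency_ema * (1 + 3 * self.failure_ema)


# Made-up numbers: a 200ms mirror that fails half the time
# versus a 400ms mirror that never fails.
fast_flaky = MirrorHealth("fast-flaky", latency_ema=0.2, failure_ema=0.5)
slow_reliable = MirrorHealth("slow-reliable", latency_ema=0.4, failure_ema=0.0)

ranked = sorted([fast_flaky, slow_reliable], key=MirrorHealth.score)
print([m.name for m in ranked])  # ['slow-reliable', 'fast-flaky']
```

With these numbers the flaky mirror scores about 0.5 against 0.4 for the reliable one, so the slower mirror is tried first.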

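For reviewers, the retry behavior described in the first summary item can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this PR: `fetch_with_retry`, the `send` callable and the injectable `sleep` are hypothetical names, and `OSError` stands in for whatever network exception types the real `_fetch()` catches.

```python
import asyncio
from typing import Awaitable, Callable

RETRYABLE_STATUSES = {429, 500, 502, 503}  # status codes listed in the summary


async def fetch_with_retry(
    send: Callable[[], Awaitable[int]],
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> int:
    """Call `send`, retrying exactly once after a transient failure.

    `send` is a hypothetical coroutine that performs one request and
    returns the HTTP status code; a real _fetch() would return a response.
    """
    try:
        status = await send()
        if status not in RETRYABLE_STATUSES:
            return status
    except OSError:
        # Assumed stand-in for "network exceptions" (ConnectionError and
        # TimeoutError are OSError subclasses).
        pass
    await sleep(backoff_seconds)
    # Second and final attempt: its result or exception goes to the caller.
    return await send()


async def demo() -> list:
    statuses = iter([503, 200])  # first attempt fails, second succeeds
    delays = []

    async def fake_send() -> int:
        return next(statuses)

    async def fake_sleep(seconds: float) -> None:
        delays.append(seconds)  # record the backoff instead of waiting

    result = await fetch_with_retry(fake_send, sleep=fake_sleep)
    return [result, delays]


print(asyncio.run(demo()))  # [200, [1.0]]
```

Injecting `sleep` keeps the sketch testable without waiting a real second; a 404 or 451 is returned immediately and never retried.
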
Test plan

  • Deploy to production and monitor mirror request logs for correct retry/fallback behavior
  • Verify broken Mino mirrors (mino-singapore, mino-germany) back off exponentially instead of probing every 30s
  • Confirm score-service beatmap lookup errors decrease
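
When checking the backoff item in the logs, it helps to know the expected gap between probes. A minimal sketch of the schedule described in the summary, assuming a hypothetical `BreakerCooldown` helper (the real circuit breaker in this PR may be structured differently); only the 30s base, the doubling, the 10min cap and the reset on success come from the description.

```python
BASE_COOLDOWN = 30.0  # seconds, from the summary
MAX_COOLDOWN = 600.0  # 10 minute cap, from the summary


class BreakerCooldown:
    """Hypothetical helper that tracks the cooldown of one mirror's breaker."""

    def __init__(self) -> None:
        self.cooldown = BASE_COOLDOWN

    def on_probe_failure(self) -> float:
        # Return the cooldown to wait now, then double it for the next failure.
        current = self.cooldown
        self.cooldown = min(self.cooldown * 2, MAX_COOLDOWN)
        return current

    def on_success(self) -> None:
        # A successful probe resets the schedule to the base cooldown.
        self.cooldown = BASE_COOLDOWN


breaker = BreakerCooldown()
schedule = [breaker.on_probe_failure() for _ in range(7)]
print(schedule)  # [30.0, 60.0, 120.0, 240.0, 480.0, 600.0, 600.0]
```

Under this model a mirror that never recovers should be probed at 30s, 60s, 120s, 240s and 480s gaps, then every 10 minutes.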

🤖 Generated with Claude Code

- Add retry with backoff for transient HTTP errors (429/500/502/503)
- Add rate limit header awareness (x-ratelimit-remaining backpressure)
- Add exponential backoff on circuit breaker cooldown (30s → 10min cap)
- Add failure rate EMA to mirror health, use composite score for ordering
- Add async wait_for_availability() in sequential fallback path
- Extract common _fetch() base method, removing ~220 lines of duplication
- Add response size limits (100MB) for .osz downloads
- Remove unused get_mirror_weight/BeatmapMirrorScore dead code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cmyui cmyui requested a review from infernalfire72 as a code owner March 8, 2026 00:08