@monkeyWie
Member

Fetcher Refactor: Download Flow & Slow-Start Connections

Motivation

The previous Fetcher interface design had a critical limitation: the Resolve method did not accept Options as a parameter, making it impossible to determine the download directory during the resolve phase. This architectural constraint created several maintenance challenges:

  1. No connection reuse: HTTP fetchers couldn't reuse the initial resolve request's response body for actual downloading. Unlike modern browsers that start downloading immediately upon resolving metadata, Gopeed had to open a new connection in the Start phase, wasting the initial request.

  2. Directory determination mismatch: The download path could only be finalized after resolve completed, but the BT (BitTorrent) library required the storage path to be set during torrent spec creation. This forced us to maintain hacky workarounds to change the directory post-resolve.

  3. Inefficient resource utilization: The resolve connection was discarded entirely, even for single-file downloads where that connection could have downloaded the complete file.

By moving Options into the Resolve signature, we unify directory decision-making with resource acquisition, enabling immediate download initiation and eliminating the need for library patches.

Technical Changes

1. Resolve & Start Pipeline Redesign

HTTP Fetcher:

  • Resolve now accepts both Request and Options parameters, allowing immediate directory determination
  • The resolve phase uses a normal HTTP request (not HEAD/Range) and streams response data into a temporary file asynchronously
  • This resolve connection becomes a special "primary connection" that continues downloading after resolve returns
  • The primary connection terminates only once the first additional connection has connected successfully; that new connection then takes over the remaining range
  • Start can be called before resolve completes; internally it waits for resolve to finish before launching additional connections
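The ordering guarantee in the last bullet (Start may be called early but waits for resolve) could be sketched roughly as follows. This is a minimal illustration, not Gopeed's actual code: the `fetcher` type, its `resolved` channel, and the event log are hypothetical names introduced here, and the real Start presumably waits internally rather than blocking its caller.

```go
package main

import (
	"fmt"
	"sync"
)

// fetcher is a hypothetical stand-in for the HTTP fetcher.
type fetcher struct {
	resolved    chan struct{} // closed once Resolve has finished
	resolveOnce sync.Once
	mu          sync.Mutex
	events      []string
}

func newFetcher() *fetcher {
	return &fetcher{resolved: make(chan struct{})}
}

func (f *fetcher) log(s string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.events = append(f.events, s)
}

// Resolve stands in for the metadata request. In the design described
// above, the same connection would keep streaming into a temporary file.
func (f *fetcher) Resolve() {
	f.log("resolve done")
	f.resolveOnce.Do(func() { close(f.resolved) })
}

// Start may be called before Resolve returns; it waits for resolve to
// finish before launching additional connections.
func (f *fetcher) Start() {
	<-f.resolved
	f.log("extra connections launched")
}

func main() {
	f := newFetcher()
	done := make(chan struct{})
	go func() { f.Start(); close(done) }() // Start is called first
	f.Resolve()
	<-done
	fmt.Println(f.events) // [resolve done extra connections launched]
}
```

Closing a channel is one simple way to let any number of waiters proceed once resolve is done; a `sync.WaitGroup` or a context would work equally well.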

BT Fetcher:

  • Simplified TorrentSpec storage configuration since download path is known at resolve time
  • Removed custom directory-switching logic that was previously required
  • Cleaner integration with anacrolix/torrent library without modifications

2. Slow-Start Connection Expansion

Replaced fixed upfront connection spawning with gradual slow-start expansion:

  • Expansion pattern: Starts with 1 connection, then adds batches of 1, 2, 4, 8, ... connections, doubling the total each round until it reaches maxConnections
  • Gating condition: The next batch starts only after every connection in the current batch has received a successful HTTP response (a completed TCP handshake alone is not enough)
  • Auto-capping: The final batch is trimmed when it would exceed the limit (e.g., max=5: 1→2→4→5, max=9: 1→2→4→8→9)
  • Benefits: Reduces burst load on servers, avoids triggering rate limits (429), and often completes downloads before reaching max connections

3. Dynamic Chunk Allocation & Helper Logic

  • Eliminated fixed chunk pre-allocation, which caused the slow tail in the last 1% of a download
  • Connections dynamically help others when their chunk completes early
  • Helper connections only take chunks ≥1MB to avoid excessive fragmentation
  • Significantly reduces idle connections waiting for slow peers
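One way the helper logic might look is sketched below. The details are assumptions, not taken from the PR: the `chunk` type, the `steal` function, the choice to split the largest remaining range in half, and the reading of "≥1MB" as 1 MiB applied to the helper's share are all hypothetical.

```go
package main

import "fmt"

// minHelpSize is an assumed reading of the "≥1MB" rule (1 MiB).
const minHelpSize = 1 << 20

// chunk is a half-open byte range [Begin, End) still to be downloaded.
type chunk struct{ Begin, End int64 }

func (c chunk) remaining() int64 { return c.End - c.Begin }

// steal finds the chunk with the most remaining bytes and hands its
// second half to an idle connection. It returns ok=false when the
// helper's share would be smaller than minHelpSize.
func steal(chunks []chunk) (helper chunk, ok bool) {
	idx := -1
	for i, c := range chunks {
		if idx == -1 || c.remaining() > chunks[idx].remaining() {
			idx = i
		}
	}
	if idx == -1 {
		return chunk{}, false
	}
	half := chunks[idx].remaining() / 2
	if half < minHelpSize {
		return chunk{}, false // too small; splitting would only fragment
	}
	mid := chunks[idx].End - half
	helper = chunk{Begin: mid, End: chunks[idx].End}
	chunks[idx].End = mid // the original owner keeps the first part
	return helper, true
}

func main() {
	chunks := []chunk{{0, 512 << 10}, {10 << 20, 14 << 20}}
	h, ok := steal(chunks)
	fmt.Println(h, ok, chunks[1])
}
```

In this sketch the 4 MiB range is split into two 2 MiB halves, while the 512 KiB range is never offered to a helper.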

4. Intelligent Retry & Error Handling

Error Classification:

  • Exempt errors (infinite retry): 429, 5xx, 408, 440, 499 - transient server issues
  • Counted errors (3 retry limit): 403, 404, and other 4xx not listed above - likely permanent failures
  • Non-HTTP errors (infinite retry): Network timeouts, connection resets
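The classification above might be expressed as a small decision function. This is a sketch under stated assumptions: `classify`, `shouldRetry`, and the convention that status 0 means "no HTTP response" are hypothetical, and statuses the PR does not mention are treated as counted purely as a conservative guess.

```go
package main

import "fmt"

type retryPolicy int

const (
	retryForever retryPolicy = iota // exempt: never counts against the limit
	retryCounted                    // counted: gives up after maxCountedRetries
)

const maxCountedRetries = 3

// classify maps a failure to a retry policy. status == 0 is a
// hypothetical convention for non-HTTP errors (timeouts, resets).
func classify(status int) retryPolicy {
	switch {
	case status == 0:
		return retryForever
	case status == 429, status == 408, status == 440, status == 499:
		return retryForever
	case status >= 500 && status <= 599:
		return retryForever
	case status >= 400 && status <= 499:
		return retryCounted
	default:
		// Not covered by the PR text; counted here as an assumption.
		return retryCounted
	}
}

// shouldRetry reports whether a connection that has already retried
// `retries` times should try again after failing with `status`.
func shouldRetry(status, retries int) bool {
	if classify(status) == retryForever {
		return true
	}
	return retries < maxCountedRetries
}

func main() {
	fmt.Println(shouldRetry(429, 100)) // true
	fmt.Println(shouldRetry(403, 2))   // true
	fmt.Println(shouldRetry(403, 3))   // false
}
```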

Enhanced Failure Reporting:

  • Tracks retryTimes per connection separately from failure count
  • Error messages include HTTP status code, retry count, and detailed messages
  • Users see clear diagnostic info: "connection 2 failed: retries=3, http code=403, msg=Forbidden"

Benefits

Faster Downloads

  • Downloads start immediately during the "resolve" phase - no more waiting for resolve to complete before seeing download progress
  • Like modern browsers, Gopeed now downloads while figuring out file details

Smoother Progress

  • No more frustrating "stuck at 99%" scenarios where one slow connection blocks completion
  • Downloads adapt intelligently - if a few connections are enough, it won't spawn unnecessary ones

Better Reliability

  • Failed connections retry automatically and intelligently
  • Temporary server errors (like "too many requests") won't stop your download
  • You'll rarely need to manually resume downloads anymore

Server-Friendly

  • Gradual connection ramp-up means less chance of being blocked by servers with connection limits
  • Downloads are fast but respectful of server resources

@monkeyWie monkeyWie added the enhancement New feature or request label Jan 8, 2026
@monkeyWie monkeyWie changed the title Refractor/resolve and create refractor: fethcer interface & download flow Jan 8, 2026
@monkeyWie monkeyWie changed the title refractor: fethcer interface & download flow refactor: fethcer interface & download flow Jan 8, 2026
- Applied commit 6756d82: Changed url.QueryUnescape to url.PathUnescape for handling %2B correctly
- Applied commit abd16d5: Added HTML entity decoding for filenames (handling &amp; etc.)
- Resolved conflicts in fetcher.go by keeping refactored architecture
- Changes applied to internal/protocol/http/helper.go as functions moved during refactoring
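- Add unescapeHTMLEntities function: decodes HTML entities such as &amp; and &lt;
- Add findParamValueEnd function: correctly handles parameter values that contain quotes and HTML entities
- Add isValidHTMLEntityChars function: validates HTML entity characters
- Update parseFilenameExtended: use url.PathUnescape instead of QueryUnescape to handle %2B correctly
- Update parseFilenameFallback: use findParamValueEnd to split parameters correctly
- Update decodeFilenameParam: decode HTML entities first, then apply PathUnescape

These improvements ensure that special characters in filenames (such as & and +) are parsed correctly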
@codecov

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 80.22356% with 230 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.74%. Comparing base (6756d82) to head (07bc510).
⚠️ Report is 2 commits behind head on main.

Files with missing lines             Patch %   Lines
internal/protocol/http/fetcher.go    77.41%    135 Missing and 33 partials ⚠️
internal/protocol/http/helper.go     82.01%    22 Missing and 12 partials ⚠️
internal/test/httptest.go            88.88%    8 Missing and 6 partials ⚠️
pkg/download/downloader.go           75.00%    5 Missing and 4 partials ⚠️
internal/protocol/bt/fetcher.go      72.22%    4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1229      +/-   ##
==========================================
+ Coverage   70.11%   71.74%   +1.62%     
==========================================
  Files          48       47       -1     
  Lines        5147     5712     +565     
==========================================
+ Hits         3609     4098     +489     
- Misses       1177     1222      +45     
- Partials      361      392      +31     
