@DePasqualeOrg commented Dec 26, 2025

This PR significantly improves download and cache performance (2.3x to 35x faster in the benchmarks below) and brings the Swift Hub implementation closer to feature parity with the Python huggingface_hub library.

Changes

1. Skip HEAD requests for cached files

When downloading files that are already cached, we now skip the per-file HEAD requests. The snapshot function fetches the current commit hash once via getRepoInfo, then passes it to each file download. If the local metadata records the same commit hash, the file is returned immediately, with no HEAD request needed to verify that it is unchanged.

Python equivalent: file_download.py:1082-1095
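The cache check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CachedFileMetadata`, `CacheDecision`, and `cacheDecision` are hypothetical names, and the real metadata layout may differ.

```swift
import Foundation

// Hypothetical shape of the metadata stored next to a cached file.
struct CachedFileMetadata {
    let commitHash: String
    let etag: String
}

enum CacheDecision: Equatable {
    case useCached   // commit hash matches, so no network request is needed
    case revalidate  // fall back to a HEAD request or a fresh download
}

/// Decides whether a cached file can be returned without contacting the server.
/// `remoteCommitHash` is assumed to come from a single repo-info request made
/// once per snapshot, rather than from one HEAD request per file.
func cacheDecision(
    localMetadata: CachedFileMetadata?,
    fileExists: Bool,
    remoteCommitHash: String?
) -> CacheDecision {
    guard fileExists,
          let local = localMetadata,
          let remote = remoteCommitHash,
          local.commitHash == remote
    else {
        return .revalidate
    }
    return .useCached
}
```

With this shape, a snapshot of N cached files costs one repo-info request instead of N HEAD requests, as long as the commit hash has not changed.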

2. Parallel file downloads

Files are now downloaded concurrently using a task group with a configurable number of concurrent downloads, matching the Python library's default of 8.

Python equivalent: _snapshot_download.py:449-455
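For illustration, bounded concurrency with a task group might look like the sketch below. `mapConcurrently` and its `maxConcurrent` parameter are hypothetical names, not the API added in this PR; the default of 8 mirrors the Python default mentioned above.

```swift
import Foundation

/// Runs `operation` over `items` with at most `maxConcurrent` tasks in flight.
/// Results are returned in input order. (Hypothetical helper for illustration.)
func mapConcurrently<Input: Sendable, Output: Sendable>(
    _ items: [Input],
    maxConcurrent: Int = 8,
    operation: @escaping @Sendable (Input) async throws -> Output
) async throws -> [Output] {
    precondition(maxConcurrent > 0, "maxConcurrent must be positive")
    return try await withThrowingTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: items.count)
        var nextIndex = 0

        // Seed the group with the first batch of tasks.
        while nextIndex < min(maxConcurrent, items.count) {
            let index = nextIndex
            let item = items[index]
            group.addTask {
                let output = try await operation(item)
                return (index, output)
            }
            nextIndex += 1
        }

        // Each time a task finishes, record its result and start the next item.
        while let finished = try await group.next() {
            results[finished.0] = finished.1
            if nextIndex < items.count {
                let index = nextIndex
                let item = items[index]
                group.addTask {
                    let output = try await operation(item)
                    return (index, output)
                }
                nextIndex += 1
            }
        }
        // Every slot has been filled once the group is drained without error.
        return results.map { $0! }
    }
}
```

In a downloader, `items` would be the file names in the snapshot and `operation` would download one file. If any task throws, the error propagates and the remaining tasks in the group are cancelled.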

3. Verify file integrity after download, skip re-hash on cache hit

LFS files (identified by SHA256 etags) are now verified after download. Previously, hash verification ran on every load in offline mode, adding roughly 200 ms or more for large files. Now we verify once at download time and trust the cache afterward.

Python equivalent: file_download.py:1394-1408
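A minimal sketch of the post-download check, assuming an LFS etag is a 64-character hex SHA256 digest and that CryptoKit is available (Apple platforms). The helper names are hypothetical, and a real implementation would likely hash large files in chunks rather than loading them fully into memory.

```swift
import CryptoKit
import Foundation

/// Assumption: an LFS etag is a 64-character hex SHA256 digest.
func isSHA256Etag(_ etag: String) -> Bool {
    etag.count == 64 && etag.allSatisfy { $0.isHexDigit }
}

/// Hex-encoded SHA256 of in-memory data.
func sha256Hex(of data: Data) -> String {
    SHA256.hash(data: data).map { String(format: "%02x", $0) }.joined()
}

struct IntegrityError: Error {
    let expected: String
    let actual: String
}

/// Verifies downloaded bytes against the etag when the etag looks like a SHA256.
/// Other etags (assumed to belong to non-LFS files) are skipped, not failed.
func verifyDownload(_ data: Data, etag: String) throws {
    // Strip surrounding quotes that HTTP etag headers may carry.
    let normalized = etag
        .trimmingCharacters(in: CharacterSet(charactersIn: "\""))
        .lowercased()
    guard isSHA256Etag(normalized) else { return }
    let actual = sha256Hex(of: data)
    guard actual == normalized else {
        throw IntegrityError(expected: normalized, actual: actual)
    }
}
```

Running this once at download time is what allows later cache hits to skip hashing entirely.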

4. Size-weighted progress reporting

Progress is now weighted by file size instead of file count. This provides smoother, more accurate progress bars for downloads containing a mix of small config files and large model weights.

The getRepoInfo function fetches file sizes via the blobs=true API parameter (see hf_api.py:2617), and ProgressCoordinator uses these sizes as weights for each file's contribution to overall progress.
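As a rough illustration of size weighting (not the actual ProgressCoordinator implementation), overall progress can be computed as completed bytes divided by total bytes. The function name and tuple shape below are assumptions made for this sketch.

```swift
/// Combines per-file progress fractions into one size-weighted fraction.
/// Each file contributes in proportion to its size in bytes.
func weightedProgress(files: [(size: Int64, fractionCompleted: Double)]) -> Double {
    let totalSize = files.reduce(Int64(0)) { $0 + $1.size }
    guard totalSize > 0 else { return 0 }
    let completedBytes = files.reduce(0.0) { $0 + Double($1.size) * $1.fractionCompleted }
    return completedBytes / Double(totalSize)
}

// A finished 1 KB config file next to an unstarted 999 KB weights file:
// counting files would report 50%, while size weighting reports 0.1%.
let example = weightedProgress(files: [
    (size: 1_000, fractionCompleted: 1.0),
    (size: 999_000, fractionCompleted: 0.0),
])
```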

Benchmark Results

Tested with mlx-community/Qwen3-0.6B-Base-DQ5 (11 MB tokenizer.json).

| Benchmark | Before | After | Improvement |
|---|---|---|---|
| Cached file retrieval | 782 ms | 267 ms | 2.9x faster |
| Offline mode cache hit | 4.87 ms | 0.14 ms | 35x faster |
| Parallel downloads | 1704 ms | 742 ms | 2.3x faster |
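The Improvement figures are consistent with dividing the Before time by the After time. A small sketch of that arithmetic, using only the numbers reported above (the `speedup` helper is a hypothetical name):

```swift
import Foundation

/// Speedup factor: how many times faster the "after" measurement is.
func speedup(beforeMs: Double, afterMs: Double) -> Double {
    beforeMs / afterMs
}

let benchmarks: [(name: String, beforeMs: Double, afterMs: Double)] = [
    ("Cached file retrieval", 782, 267),
    ("Offline mode cache hit", 4.87, 0.14),
    ("Parallel downloads", 1704, 742),
]

for benchmark in benchmarks {
    let factor = speedup(beforeMs: benchmark.beforeMs, afterMs: benchmark.afterMs)
    print("\(benchmark.name): \(String(format: "%.1f", factor))x faster")
}
```

The offline cache hit works out to about 34.8x, which the results above round to 35x.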

Testing

  • Added unit tests for getRepoInfo, snapshot caching, and offline mode
  • Added HubBenchmarks.swift with reproducible performance tests. You can check out commit 49f7e1b to run the benchmarks before the changes in this PR, and then run them again with the latest commit in this PR to see the difference. These benchmarks can be deleted before merging or kept for testing future improvements.

@DePasqualeOrg
Contributor Author

I've added the same optimizations to swift-huggingface in huggingface/swift-huggingface#21, which required porting some missing functionality from swift-transformers and huggingface_hub to swift-huggingface.

@DePasqualeOrg force-pushed the optimizations branch 4 times, most recently from bc00570 to dd87e89, on December 26, 2025 22:17
@DePasqualeOrg changed the title from "Optimize Hub download and cache performance" to "Optimize download and cache performance" on Dec 27, 2025
@DePasqualeOrg changed the title from "Optimize download and cache performance" to "Optimizations for significantly faster downloads and cache hits" on Dec 27, 2025
DePasqualeOrg commented Dec 27, 2025

After #304 is merged, I'll move the benchmark test to the separate Benchmark target that was added in that PR so that it doesn't run in CI.

@DePasqualeOrg DePasqualeOrg marked this pull request as draft January 5, 2026 12:45
@DePasqualeOrg DePasqualeOrg marked this pull request as ready for review January 5, 2026 13:54