Optimizations for significantly faster downloads and cache hits #302
Open
DePasqualeOrg wants to merge 9 commits into huggingface:main from DePasqualeOrg:optimizations
+737
−253
63ab365 to d877136
Contributor, Author:
I've added the same optimizations to swift-huggingface in huggingface/swift-huggingface#21, which required porting some missing functionality from swift-transformers and huggingface_hub to swift-huggingface.
bc00570 to dd87e89
…wnload progress by file size
dd87e89 to 2cabd23
Contributor, Author:
After #304 is merged, I'll move the benchmark test to the separate Benchmark target that was added in that PR so that it doesn't run in CI.
This PR offers significant improvements in download and cache performance, and also brings the Swift Hub implementation closer to feature parity with the Python huggingface_hub library.
Changes
1. Skip HEAD requests for cached files
When downloading files that are already cached, we now skip the individual HEAD requests per file. The snapshot function fetches the current commit hash once via getRepoInfo, then passes it to each file download. If the local metadata shows the same commit hash, the file is returned immediately, with no HEAD request needed to verify that it's unchanged.
Python equivalent: file_download.py:1082-1095
2. Parallel file downloads
Files are now downloaded concurrently using a task group with a configurable number of concurrent downloads, matching the Python library's default of 8.
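Changes 1 and 2 can be illustrated together with a minimal sketch. It is not the code from this PR: `CachedFileMetadata`, `FetchResult`, `fetchSnapshot`, and the injected `download` closure are hypothetical names chosen for illustration, and the cache is modeled as a plain dictionary. The sketch assumes the commit hash has already been fetched once, returns cached files without any network call, and keeps at most `maxConcurrent` downloads in flight.

```swift
import Foundation

// Hypothetical stand-in for the metadata stored next to a cached file.
struct CachedFileMetadata: Sendable {
    let commitHash: String
    let localPath: String
}

// Hypothetical result type so callers can tell cache hits from downloads.
enum FetchResult: Equatable, Sendable {
    case cached(String)
    case downloaded(String)
}

/// Resolves each file to a local path. Files whose cached commit hash equals
/// `currentCommit` are returned without calling `download`; the rest are
/// downloaded with at most `maxConcurrent` tasks running at once.
func fetchSnapshot(
    files: [String],
    currentCommit: String,
    cache: [String: CachedFileMetadata],
    maxConcurrent: Int = 8,
    download: @escaping @Sendable (String) async throws -> String
) async throws -> [String: FetchResult] {
    var results: [String: FetchResult] = [:]
    var pending: [String] = []

    // Cache hits are resolved locally, with no per-file request.
    for file in files {
        if let meta = cache[file], meta.commitHash == currentCommit {
            results[file] = .cached(meta.localPath)
        } else {
            pending.append(file)
        }
    }

    try await withThrowingTaskGroup(of: (String, String).self) { group in
        var next = 0
        // Seed the group with up to `maxConcurrent` downloads.
        while next < pending.count, next < max(1, maxConcurrent) {
            let file = pending[next]
            group.addTask {
                let path = try await download(file)
                return (file, path)
            }
            next += 1
        }
        // Whenever one download finishes, start the next pending one.
        while let (file, path) = try await group.next() {
            results[file] = .downloaded(path)
            if next < pending.count {
                let nextFile = pending[next]
                group.addTask {
                    let path = try await download(nextFile)
                    return (nextFile, path)
                }
                next += 1
            }
        }
    }
    return results
}
```

Seeding the group and refilling it as tasks complete is one common way to bound concurrency with a task group; the PR may structure this differently.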
Python equivalent: _snapshot_download.py:449-455
3. Verify file integrity after download, skip re-hash on cache hit
LFS files (identified by SHA256 etags) are now verified after download. Previously, hash verification ran on every load in offline mode, adding ~200 ms+ for large files. Now we verify once at download time and trust the cache afterward.
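The verify-once idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `looksLikeSHA256`, `IntegrityError`, and `verifyDownload` are hypothetical names, an LFS etag is assumed to be a 64-character hexadecimal digest, and the hashing itself is injected as a closure so the sketch stays platform-neutral (on Apple platforms it could be backed by CryptoKit's `SHA256`).

```swift
import Foundation

/// Assumption: an LFS etag is a 64-character hexadecimal SHA256 digest.
func looksLikeSHA256(_ etag: String) -> Bool {
    etag.count == 64 && etag.allSatisfy { $0.isHexDigit }
}

enum IntegrityError: Error, Equatable {
    case hashMismatch(expected: String, actual: String)
}

/// Checks a freshly downloaded file against its etag. Intended to run once,
/// right after the download; later cache hits skip this step entirely.
/// `sha256Hex` is a hypothetical injected hasher returning a hex digest.
func verifyDownload(
    etag: String,
    fileURL: URL,
    sha256Hex: (URL) throws -> String
) throws {
    // Etags that are not SHA256 digests are not content hashes, so skip them.
    guard looksLikeSHA256(etag) else { return }
    let actual = try sha256Hex(fileURL)
    guard actual.lowercased() == etag.lowercased() else {
        throw IntegrityError.hashMismatch(expected: etag, actual: actual)
    }
}
```

Because the check runs only on the download path, an offline load of an already cached file pays no hashing cost in this sketch.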
Python equivalent: file_download.py:1394-1408
4. Size-weighted progress reporting
Progress is now weighted by file size instead of file count. This provides smoother, more accurate progress bars for downloads containing a mix of small config files and large model weights.
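As a rough illustration of the weighting, overall progress can be computed as completed bytes divided by total bytes. The `WeightedProgress` type below is a hypothetical sketch and is not the `ProgressCoordinator` from this PR; the file names and sizes in the usage note are made up.

```swift
/// Hypothetical sketch: overall progress weighted by file size in bytes.
struct WeightedProgress {
    private let sizes: [String: Int64]
    private var fractions: [String: Double] = [:]

    init(sizes: [String: Int64]) {
        self.sizes = sizes
    }

    /// Records how much of one file has been downloaded (clamped to 0...1).
    /// Updates for files that were not registered with a size are ignored.
    mutating func update(file: String, fractionCompleted: Double) {
        guard sizes[file] != nil else { return }
        fractions[file] = min(max(fractionCompleted, 0), 1)
    }

    /// Completed bytes across all files divided by total bytes.
    var overall: Double {
        let total = sizes.values.reduce(0, +)
        guard total > 0 else { return 0 }
        let done = sizes.reduce(0.0) { sum, entry in
            sum + Double(entry.value) * (fractions[entry.key] ?? 0)
        }
        return done / Double(total)
    }
}
```

With a 1,000-byte config file and a 999,000-byte weights file, finishing the config file moves overall progress to 0.1% here, whereas counting files would report 50%.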
The getRepoInfo function fetches file sizes via the blobs=true API parameter (see hf_api.py:2617), and ProgressCoordinator uses these sizes as weights for each file's contribution to overall progress.
Benchmark Results
Tested with mlx-community/Qwen3-0.6B-Base-DQ5 (11 MB tokenizer.json).
Testing
Tests for getRepoInfo, snapshot caching, and offline mode
HubBenchmarks.swift with reproducible performance tests. To see the difference, check out commit 49f7e1b to run the benchmarks without the changes in this PR, then run them again on the latest commit in this PR. These benchmarks can be deleted before merging or kept for testing future improvements.