Xet integration #2958
* first draft
* remove comment
* hf_xet instead of xet
* update docstring
* fix
* update docstring
* simplify typing
* quality
* add logging
* fix tests
* add unit tests for xet utilities
* first draft of download testing
* more tests
* address some comments
* fix tests
* check if hf_xet is available or not
* remove unnecessary dest dir creation
* keep comment Co-authored-by: Lucain <[email protected]>
* post-review improvements
* Update tests/test_xet_download.py
---------
Co-authored-by: Lucain <[email protected]>
* add ability to enable/disable xet storage
* add test
* better way to check if all settings are none
…hub into xet-integration
* add upload workflow
* fixes and tests
* use helper for progress bar
* use tmp repo in tests
* some fixes
* update tests
* mock HF_XET_CACHE
* fix tests
* fix utils tests
* debug CI
* fix
* check if xet is enabled
* debug CI
* debug CI again
* revert
* debugging
* don't rerun xet tests
* revert
* remove pytest timeout
* don't run tests in parallel
* add comment
* revert and rename variable
* don't skip tests
* remove warning
* fix tests
* Apply suggestions from code review
* fixes
* fix syntax error with python 3.8
* catch Invalid credentials
* fix
* record Space API VCR test
* use raise instead of raise e Co-authored-by: Lucain <[email protected]>
* disable xet storage for the other tests
* reverting
* isolate xet tests for windows
* fix windows
* install hf_xet for xet testing
---------
Co-authored-by: Lucain <[email protected]>
Co-authored-by: Lucain Pouget <[email protected]>
* Xet docs
* PR feedback, added waitlist links
* Added HF_XET_CACHE env variable docs
* PR feedback
* Doc feedback
* Added two lines about flow of upload/download
* Updating links to Hub doc location
* Reformat headings, less levels in TOC
---------
Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Pierric Cistac <[email protected]>
Co-authored-by: Célina <[email protected]>
Co-authored-by: Lucain <[email protected]>
Directly calling hfxet.download_files() with token_refresher callback to ensure that hfxet calls the token refresher as expected.
---------
Co-authored-by: Celina Hanouti <[email protected]>
* Adding request header on resolve endpoint indicating that we can receive xet info.
* Adding test to ensure that the header is always sent on metadata request
* Using a two stage download path for xet files.
* Using the GET call's JSON
* Using xet_backed for whether the file is a xet file or not to disambiguate from whether xet is enabled
* Adding and fixing tests
* Testing fix WIP
* Rewriting xet download to use the refresh route to resolve the xetmetadata
* Parameter type check
* Docs
* Removing extraneous constant
* Fixing file_download tests
* Readding the refresh route into the file metadata
* Refactoring the XetMetadata object into two objects to reflect the Hub changes.
* Fixing broken tests
* Code cleanup from self review
* Fixing types
* Quality & Lint
* Handling when hub returns the entire refresh route in its headers.
* Update tests/test_xet_utils.py
* Fixing merge conflicts in the new tests
* Extracting the refresh route from the link header (#2953)
* Getting the refresh route from the links header
* refactor xet_file_data func signature & tests
Co-authored-by: Lucain <[email protected]>
Co-authored-by: Rajat Arya <[email protected]>
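For readers unfamiliar with the "refresh route from the link header" commits above, here is a minimal sketch of the general idea: pull one URL out of an HTTP `Link` response header by its `rel` name. This is not the code from this PR. The function names, the `xet-auth` rel value, and the example URLs are all assumptions made up for illustration.

```python
import re
from typing import Dict, Optional

# Matches one entry of an HTTP Link header, e.g. <https://host/path>; rel="name"
_LINK_ENTRY = re.compile(r'<([^>]*)>\s*;[^,]*?\brel="?([^";,]+)"?')


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each rel name found in a Link header value to its URL."""
    return {rel.strip(): url for url, rel in _LINK_ENTRY.findall(value)}


def refresh_route_from_headers(
    headers: Dict[str, str],
    rel: str = "xet-auth",  # hypothetical rel name, assumed for this sketch
) -> Optional[str]:
    """Return the URL advertised under `rel`, or None if it is absent."""
    link = headers.get("Link") or headers.get("link")
    if not link:
        return None
    return parse_link_header(link).get(rel)


# Hypothetical response headers, for illustration only.
example_headers = {
    "Link": '<https://example.com/api/refresh>; rel="xet-auth", '
            '<https://example.com/next>; rel="next"'
}
refresh_url = refresh_route_from_headers(example_headers)
```

Returning `None` when the header or the rel is missing lets a caller fall back to a regular download path instead of raising.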
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
hanouticelina left a comment
I think we can get this merged very soon. The test coverage looks good enough, and hopefully we have covered most of the scenarios for a first release of huggingface_hub with xet 🔥
Co-authored-by: Célina <[email protected]>
Wauplin left a comment
I did a last review and it looks good to me for a release :) Awesome job everyone!
Creating a PR thanks to the work of everyone involved in this branch, @bpronan @hanouticelina @rajatarya @assafvayner @coyotte508 to name a few. Let's have a final review on this PR before shipping! 🚀
Atomic PRs:
(private slack thread)