Contributor

@Wauplin Wauplin commented Mar 26, 2025

hanouticelina and others added 19 commits February 20, 2025 17:37
* first draft

* remove comment

* hf_xet instead of xet

* update docstring

* fix

* update docstring

* simplify typing

* quality

* add logging

* fix tests

* add unit tests for xet utilities

* first draft of download testing

* more tests

* address some comments

* fix tests

* check if hf_xet is available or not

* remove unnecessary dest dir creation

* keep comment

Co-authored-by: Lucain <[email protected]>

* post-review improvements

* Update tests/test_xet_download.py

---------

Co-authored-by: Lucain <[email protected]>
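
For readers skimming the log: the "check if hf_xet is available or not" commit above describes an optional-dependency probe. The following is only a minimal sketch of that general pattern using the standard library. The helper name `is_package_available` is hypothetical and is not taken from this PR's diff.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be imported, without importing it.

    Hypothetical helper for illustration; the PR's actual check may differ.
    """
    # find_spec returns None when no top-level module with this name is found.
    return importlib.util.find_spec(name) is not None


# Assumed usage: pick the xet code path only when the optional package exists.
use_xet = is_package_available("hf_xet")
```

Probing with `find_spec` avoids importing the package just to learn whether it is installed, which keeps the fallback path cheap when `hf_xet` is missing.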
* add ability to enable/disable xet storage

* add test

* better way to check if all settings are none
* add upload workflow

* fixes and tests

* use helper for progress bar

* use tmp repo in tests

* some fixes

* update tests

* mock HF_XET_CACHE

* fix tests

* fix utils tests

* debug CI

* fix

* check if xet is enabled

* debug CI

* debug CI again

* revert

* debugging

* don't rerun xet tests

* revert

* remove pytest timeout

* don't run tests in parallel

* add comment

* revert and rename variable

* don't skip tests

* remove warning

* fix tests

* Apply suggestions from code review

* fixes

* fix syntax error with python 3.8

* catch Invalid credentials

* fix

* record Space API VCR test

* use raise instead of raise e

Co-authored-by: Lucain <[email protected]>

* disable xet storage for the other tests

* reverting

* isolate xet tests for windows

* fix windows

* install hf_xet for xet testing

---------

Co-authored-by: Lucain <[email protected]>
Co-authored-by: Lucain Pouget <[email protected]>
* Xet docs
* PR feedback, added waitlist links
* Added HF_XET_CACHE env variable docs
* PR feedback
* Doc feedback
* Added two lines about flow of upload/download
* Updating links to Hub doc location
* Reformat headings, less levels in TOC

---------

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Pierric Cistac <[email protected]>
Co-authored-by: Célina <[email protected]>
Co-authored-by: Lucain <[email protected]>
Directly calling hfxet.download_files() with a token_refresher callback to ensure that hfxet calls the token refresher as expected.

---------

Co-authored-by: Celina Hanouti <[email protected]>
* Adding request header on resolve endpoint indicating that we can receive xet info.
* Adding test to ensure that the header is always sent on metadata request
* Using a two stage download path for xet files.
* Using the GET call's JSON
* Using xet_backed to indicate whether the file is a xet file, to disambiguate from whether xet is enabled
* Adding and fixing tests
* Testing fix WIP
* Rewriting xet download to use the refresh route to resolve the xet metadata
* Parameter type check
* Docs
* Removing extraneous constant
* Fixing file_download tests
* Readding the refresh route into the file metadata
* Refactoring the XetMetadata object into two objects to reflect the Hub changes.
* Fixing broken tests
* Code cleanup from self review
* Fixing types
* Quality & Lint
* Handling when hub returns the entire refresh route in its headers.
* Update tests/test_xet_utils.py
* Fixing merge conflicts in the new tests
* Extracting the refresh route from the link header (#2953)
* Getting the refresh route from the links header
* refactor xet_file_data func signature & tests

Co-authored-by: Lucain <[email protected]>
Co-authored-by: Rajat Arya <[email protected]>
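
The last commits in this block move the refresh route into the HTTP `Link` response header. As a rough sketch of what extracting a route from such a header can look like, assuming a standard `Link` header of the form `<url>; rel="name"`: the function names and the `rel` value `"xet-auth"` used here are illustrative assumptions, not confirmed by this page.

```python
import re
from typing import Dict, Mapping, Optional


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each rel name in a Link header to its target URL."""
    links: Dict[str, str] = {}
    # Entries are comma-separated; split only where a new "<url>" begins.
    for part in re.split(r",\s*(?=<)", value):
        match = re.match(r"\s*<([^>]*)>(.*)", part)
        if not match:
            continue
        url, params = match.groups()
        rel = re.search(r';\s*rel\s*=\s*"?([^";]+)"?', params)
        if rel:
            links[rel.group(1).strip()] = url
    return links


def refresh_route_from_headers(
    headers: Mapping[str, str], rel: str = "xet-auth"  # rel name is an assumption
) -> Optional[str]:
    """Return the URL advertised under `rel`, or None if it is absent."""
    return parse_link_header(headers.get("Link", "")).get(rel)
```

With `requests`, the `Response.links` property already provides a similar rel-to-link mapping, so a hand-rolled parser like this is mainly useful for tests or for clients without that helper.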
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@hanouticelina hanouticelina left a comment

I think we can get this merged very soon. The test coverage looks good enough, and hopefully we covered most of the scenarios for a first release of huggingface_hub with xet 🔥

Contributor Author

@Wauplin Wauplin left a comment

I did a last review and it looks good to me for a release :) Awesome job everyone!

@hanouticelina hanouticelina merged commit aef0d8d into main Mar 27, 2025
22 checks passed
@hanouticelina hanouticelina deleted the xet-integration branch March 27, 2025 15:12
6 participants