Merged
20 commits
26bf2d3
add hf_xet as an optional dependency
hanouticelina Feb 20, 2025
ea7e34d
update installed packages at runtime
hanouticelina Feb 20, 2025
ec20fd9
Merge branch 'main' of github.com:huggingface/huggingface_hub into xe…
hanouticelina Feb 27, 2025
2be1a9d
split xet testing in CI
hanouticelina Feb 27, 2025
6f69dc4
Merge branch 'main' of github.com:huggingface/huggingface_hub into xe…
hanouticelina Feb 27, 2025
467483d
fix workflow
hanouticelina Feb 28, 2025
60543d8
fix windows
hanouticelina Feb 28, 2025
e5716f5
Xet download workflow (#2875)
hanouticelina Mar 4, 2025
d5f45c7
Merge branch 'main' of github.com:huggingface/huggingface_hub into xe…
hanouticelina Mar 5, 2025
52608e7
Add ability to enable/disable xet storage on a repo (#2893)
hanouticelina Mar 5, 2025
9389405
Merge branch 'main' of github.com:huggingface/huggingface_hub into xe…
hanouticelina Mar 7, 2025
552b663
don't strip authorization header with downloading with xet
hanouticelina Mar 12, 2025
2e645bc
Merge branch 'xet-integration' of github.com:huggingface/huggingface_…
hanouticelina Mar 12, 2025
28eaa45
update comment
hanouticelina Mar 12, 2025
86de575
Xet upload workflow (#2887)
hanouticelina Mar 12, 2025
e4e2a04
Xet Docs for huggingface_hub (#2899)
rajatarya Mar 17, 2025
f3efdd3
Adding Token Refresh Xet Test (#2932)
rajatarya Mar 24, 2025
494a6bf
Using a two stage download path for xet files. (#2920)
bpronan Mar 26, 2025
a2ec5ee
Merge branch 'main' into xet-integration
Wauplin Mar 26, 2025
439520e
Update src/huggingface_hub/constants.py
Wauplin Mar 27, 2025
45 changes: 39 additions & 6 deletions .github/workflows/python-tests.yml
@@ -26,8 +26,8 @@ jobs:
[
"Repository only",
"Everything else",
"Inference only"

"Inference only",
"Xet only"
]
include:
- python-version: "3.13" # LFS not run on 3.8
@@ -95,6 +95,10 @@ jobs:
uv pip install "huggingface_hub[tensorflow-testing] @ ."
;;

"Xet only")
uv pip install "huggingface_hub[hf_xet] @ ."
;;

esac

# Run tests
Expand All @@ -121,7 +125,7 @@ jobs:
;;

"Everything else")
PYTEST="$PYTEST ../tests -k 'not TestRepository and not test_inference' -n 4"
PYTEST="$PYTEST ../tests -k 'not TestRepository and not test_inference and not test_xet' -n 4"
echo $PYTEST
eval $PYTEST
;;
@@ -130,7 +134,6 @@ jobs:
eval "RUN_GIT_LFS_TESTS=1 $PYTEST ../tests -k 'HfLargefilesTest'"
;;


fastai)
eval "$PYTEST ../tests/test_fastai*"
;;
@@ -147,6 +150,12 @@ jobs:
eval "$PYTEST ../tests/test_serialization.py"
;;

"Xet only")
PYTEST="$PYTEST ../tests -k 'test_xet' -n 4"
echo $PYTEST
eval $PYTEST
;;

esac

# Upload code coverage
Expand All @@ -169,6 +178,7 @@ jobs:
fail-fast: false
matrix:
python-version: ["3.8"]
test_name: ["Everything else", "Xet only"]

steps:
- uses: actions/checkout@v2
@@ -186,14 +196,37 @@ jobs:

# Install dependencies
- name: Install dependencies
run: uv pip install "huggingface_hub[testing] @ ."
run: |
uv pip install "huggingface_hub[testing] @ ."
if ("${{ matrix.test_name }}" -eq "Xet only") {
uv pip install "huggingface_hub[hf_xet] @ ."
}

# Run tests
- name: Run tests
working-directory: ./src # For code coverage to work
run: |
..\.venv\Scripts\activate
python -m pytest -n 4 --cov=./huggingface_hub --cov-report=xml:../coverage.xml --vcr-record=none --reruns 8 --reruns-delay 2 --only-rerun '(OSError|Timeout|HTTPError.*502|HTTPError.*504|not less than or equal to 0.01)' ../tests
$PYTEST_ARGS = @(
"-m", "pytest",
"-n", "4",
"--cov=./huggingface_hub",
"--cov-report=xml:../coverage.xml",
"--vcr-record=none",
"--reruns", "8",
"--reruns-delay", "2",
"--only-rerun", "(OSError|Timeout|HTTPError.*502|HTTPError.*504|not less than or equal to 0.01)",
"../tests"
)

switch ("${{ matrix.test_name }}") {
"Xet only" {
python $PYTEST_ARGS -k "test_xet"
}
"Everything else" {
python $PYTEST_ARGS -k "not test_xet"
}
}

# Upload code coverage
- name: Upload coverage reports to Codecov with GitHub Action
24 changes: 24 additions & 0 deletions docs/source/en/guides/download.md
@@ -166,6 +166,30 @@ For more details about the CLI download command, please refer to the [CLI guide]

## Faster downloads

There are two options to speed up downloads, both of which involve installing a Python package written in Rust.

* `hf_xet` is newer and uses the Xet storage backend for upload/download. It is available in production, but is in the process of being rolled out to all users, so join the [waitlist](https://huggingface.co/join/xet) to get onboarded soon!
* `hf_transfer` is a power tool for downloading from and uploading to our LFS storage backend (note: this is less future-proof than Xet). It is thoroughly tested and has been in production for a long time, but it has some limitations.

### hf_xet

Take advantage of faster downloads through `hf_xet`, the Python binding to the [`xet-core`](https://github.com/huggingface/xet-core) library, which enables
chunk-based deduplication for faster downloads and uploads. `hf_xet` integrates seamlessly with `huggingface_hub`, but uses the Rust `xet-core` library and Xet storage instead of LFS.

`hf_xet` uses the Xet storage system, which breaks files down into immutable chunks, storing collections of these chunks (called blocks or xorbs) remotely and retrieving them to reassemble the file when requested. When downloading, after confirming the user is authorized to access the files, `hf_xet` queries the Xet content-addressable service (CAS) with the LFS SHA256 hash of each file to receive the reconstruction metadata (ranges within xorbs) needed to assemble it, along with presigned URLs to download the xorbs directly. `hf_xet` then efficiently downloads the necessary xorb ranges and writes the files to disk. `hf_xet` uses a local disk cache so each chunk is downloaded only once; learn more in the [Chunk-based caching (Xet)](./manage-cache.md#chunk-based-caching-xet) section.

To enable it, specify the `hf_xet` package when installing `huggingface_hub`:

```bash
pip install -U huggingface_hub[hf_xet]
```

Note: `hf_xet` is only used when the files being downloaded are stored with Xet Storage.

All other `huggingface_hub` APIs will continue to work without any modification. To learn more about the benefits of Xet storage and `hf_xet`, refer to this [section](https://huggingface.co/docs/hub/storage-backends).
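
As an illustration, here is a minimal sketch of a download; the repo id and filename are placeholders, not a real Xet-enabled repo:

```python
from huggingface_hub import hf_hub_download

# With hf_xet installed, this exact same call transparently uses the Xet
# backend for repos stored with Xet Storage; otherwise it falls back to
# the regular HTTP/LFS download path.
local_path = hf_hub_download(repo_id="username/my-xet-model", filename="model.safetensors")
print(local_path)
```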

### hf_transfer

If you are running on a machine with high bandwidth,
you can increase your download speed with [`hf_transfer`](https://github.com/huggingface/hf_transfer),
a Rust-based library developed to speed up file transfers with the Hub.
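
To sketch the usual setup (assuming `hf_transfer` is installed; the `HF_HUB_ENABLE_HF_TRANSFER` environment variable must be set before `huggingface_hub` is imported):

```python
import os

# Must be set before importing huggingface_hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import hf_hub_download

# Downloads now go through the Rust hf_transfer code path.
local_path = hf_hub_download(repo_id="gpt2", filename="model.safetensors")
```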
117 changes: 105 additions & 12 deletions docs/source/en/guides/manage-cache.md
Expand Up @@ -2,9 +2,11 @@
rendered properly in your Markdown viewer.
-->

# Manage `huggingface_hub` cache-system
# Understand caching

## Understand caching
`huggingface_hub` uses the local disk as two caches that avoid re-downloading items. The first is a file-based cache, which stores individual files downloaded from the Hub and ensures that the same file is not downloaded again when a repo gets updated. The second is a chunk cache, where each chunk represents a byte range from a file; it ensures that chunks shared across files are only downloaded once.

## File-based caching

The Hugging Face Hub cache-system is designed to be the central cache shared across libraries
that depend on the Hub. It has been updated in v0.8.0 to prevent re-downloading the same files
@@ -170,6 +172,95 @@
When symlinks are not supported, a warning message is displayed to the user to alert
them they are using a degraded version of the cache-system. This warning can be disabled
by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable to true.

## Chunk-based caching (Xet)

To provide more efficient file transfers, `hf_xet` adds a `xet` directory to the existing `huggingface_hub` cache, creating an additional caching layer to enable chunk-based deduplication. This cache holds chunks, which are immutable byte ranges (up to 64KB) from files, created using content-defined chunking. For more information on the Xet Storage system, see this [section](https://huggingface.co/docs/hub/storage-backends).

The `xet` directory, located at `~/.cache/huggingface/xet` by default, contains two caches, used for uploads and downloads, with the following structure:

```bash
<CACHE_DIR>
├─ chunk_cache
├─ shard_cache
```

The `xet` cache, like the rest of `hf_xet`, is fully integrated with `huggingface_hub`. If you use the existing APIs for interacting with cached assets, there is no need to update your workflow. The `xet` cache is built as an optimization layer on top of the existing `hf_xet` chunk-based deduplication and the `huggingface_hub` cache system.

The `chunk_cache` directory contains cached data chunks that are used to speed up downloads, while the `shard_cache` directory contains cached shards that are used on the upload path.

### `chunk_cache`

This cache is used on the download path. The cache directory structure is based on a base64-encoded hash from the content-addressable store (CAS) that backs each Xet-enabled repository. A CAS hash serves as the key to look up the offsets of where the data is stored.

At the topmost level, the first two letters of the base64-encoded CAS hash are used to create a subdirectory in the `chunk_cache` (keys that share these first two letters are grouped here). The inner levels consist of subdirectories with the full key as the directory name. At the base are the cache items, which are ranges of blocks that contain the cached chunks.

```bash
<CACHE_DIR>
├─ xet
│ ├─ chunk_cache
│ │ ├─ A1
│ │ │ ├─ A1GerURLUcISVivdseeoY1PnYifYkOaCCJ7V5Q9fjgxkZWZhdWx0
│ │ │ │ ├─ AAAAAAEAAAA5DQAAAAAAAIhRLjDI3SS5jYs4ysNKZiJy9XFI8CN7Ww0UyEA9KPD9
│ │ │ │ ├─ AQAAAAIAAABzngAAAAAAAPNqPjd5Zby5aBvabF7Z1itCx0ryMwoCnuQcDwq79jlB

```
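
The path construction can be pictured with a short sketch. This is illustrative only, not `hf_xet`'s actual code; the hash value is taken from the example tree above:

```python
from pathlib import Path

def chunk_cache_path(cache_dir: str, cas_hash_b64: str) -> Path:
    # The first two characters of the base64-encoded CAS hash select the
    # top-level bucket; the full hash names the inner directory that holds
    # the cached block ranges.
    return Path(cache_dir).expanduser() / "xet" / "chunk_cache" / cas_hash_b64[:2] / cas_hash_b64

print(chunk_cache_path("~/.cache/huggingface", "A1GerURLUcISVivdseeoY1PnYifYkOaCCJ7V5Q9fjgxkZWZhdWx0"))
# -> /home/<user>/.cache/huggingface/xet/chunk_cache/A1/A1GerURLUcISVivdseeoY1PnYifYkOaCCJ7V5Q9fjgxkZWZhdWx0
```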

When requesting a file, the first thing `hf_xet` does is communicate with the Xet storage content-addressable store (CAS) for reconstruction information. The reconstruction information lists the CAS keys required to download the file in its entirety.

Before executing the requests for the CAS keys, the `chunk_cache` is consulted. If a key in the cache matches a CAS key, then there is no reason to issue a request for that content. `hf_xet` uses the chunks stored in the directory instead.

As the `chunk_cache` is purely an optimization, not a guarantee, `hf_xet` uses a computationally efficient eviction policy. When the `chunk_cache` is full (see `Limits and Limitations` below), `hf_xet` selects an eviction candidate at random, as sketched below. This significantly reduces the overhead of managing a robust caching system (e.g., LRU) while still providing most of the benefits of caching chunks.
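
A toy sketch of this random-eviction idea (illustrative only, not the actual `xet-core` implementation):

```python
import random

def insert_with_random_eviction(cache: dict, key: str, chunk: bytes, max_items: int) -> None:
    # Random eviction needs no access-order bookkeeping (unlike LRU),
    # keeping cache management cheap while still retaining most of the
    # benefit of reusing already-downloaded chunks.
    while len(cache) >= max_items:
        victim = random.choice(list(cache))
        del cache[victim]
    cache[key] = chunk
```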

### `shard_cache`

This cache is used when uploading content to the Hub. The directory is flat, consisting only of shard files, each named by its ID.

```sh
<CACHE_DIR>
├─ xet
│ ├─ shard_cache
│ │ ├─ 1fe4ffd5cf0c3375f1ef9aec5016cf773ccc5ca294293d3f92d92771dacfc15d.mdb
│ │ ├─ 906ee184dc1cd0615164a89ed64e8147b3fdccd1163d80d794c66814b3b09992.mdb
│ │ ├─ ceeeb7ea4cf6c0a8d395a2cf9c08871211fbbd17b9b5dc1005811845307e6b8f.mdb
│ │ ├─ e8535155b1b11ebd894c908e91a1e14e3461dddd1392695ddc90ae54a548d8b2.mdb
```

The `shard_cache` contains shards that are:

- Locally generated and successfully uploaded to the CAS
- Downloaded from CAS as part of the global deduplication algorithm

Shards provide a mapping between files and chunks. During uploads, each file is chunked and each chunk's hash is saved. Every shard in the cache is then consulted. If a shard contains a chunk hash that is present in the local file being uploaded, then that chunk can be discarded, as it is already stored in CAS (see the sketch below).
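
A simplified sketch of this consultation step; the hash scheme and the flat set of known hashes are stand-ins for the real shard format in `xet-core`:

```python
import hashlib

def chunks_to_upload(file_chunks: list[bytes], shard_chunk_hashes: set[bytes]) -> list[bytes]:
    # Any chunk whose hash already appears in a cached shard is known to
    # be stored in CAS, so only genuinely new chunks are uploaded.
    new_chunks = []
    for chunk in file_chunks:
        digest = hashlib.sha256(chunk).digest()
        if digest not in shard_chunk_hashes:
            new_chunks.append(chunk)
            shard_chunk_hashes.add(digest)
    return new_chunks
```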

Each shard expires 3-4 weeks after it is downloaded. Expired shards are not loaded during upload and are deleted one week after expiration.

### Limits and Limitations

The `chunk_cache` is limited to 10GB in size, while the `shard_cache` is technically unbounded (in practice, the size and use of shards are such that limiting the cache is unnecessary).

By design, both caches are without high-level APIs. These caches are used primarily to facilitate the reconstruction (download) or upload of a file. To interact with the assets themselves, it’s recommended that you use the [`huggingface_hub` cache system APIs](https://huggingface.co/docs/huggingface_hub/guides/manage-cache).

If you need to reclaim the space used by either cache or need to debug any potential cache-related issues, simply remove the `xet` cache entirely by running `rm -rf <cache_dir>/xet`, where `<cache_dir>` is the location of your Hugging Face cache, typically `~/.cache/huggingface`.

Example of a full `xet` cache directory tree:

```sh
<CACHE_DIR>
├─ xet
│ ├─ chunk_cache
│ │ ├─ L1
│ │ │ ├─ L1GerURLUcISVivdseeoY1PnYifYkOaCCJ7V5Q9fjgxkZWZhdWx0
│ │ │ │ ├─ AAAAAAEAAAA5DQAAAAAAAIhRLjDI3SS5jYs4ysNKZiJy9XFI8CN7Ww0UyEA9KPD9
│ │ │ │ ├─ AQAAAAIAAABzngAAAAAAAPNqPjd5Zby5aBvabF7Z1itCx0ryMwoCnuQcDwq79jlB
│ ├─ shard_cache
│ │ ├─ 1fe4ffd5cf0c3375f1ef9aec5016cf773ccc5ca294293d3f92d92771dacfc15d.mdb
│ │ ├─ 906ee184dc1cd0615164a89ed64e8147b3fdccd1163d80d794c66814b3b09992.mdb
│ │ ├─ ceeeb7ea4cf6c0a8d395a2cf9c08871211fbbd17b9b5dc1005811845307e6b8f.mdb
│ │ ├─ e8535155b1b11ebd894c908e91a1e14e3461dddd1392695ddc90ae54a548d8b2.mdb
```

To learn more about Xet Storage, see this [section](https://huggingface.co/docs/hub/storage-backends).

## Caching assets

In addition to caching files from the Hub, downstream libraries often require caching
@@ -232,15 +323,17 @@ In practice, your assets cache should look like the following tree:
└── (...)
```

## Scan your cache
## Manage your file-based cache

### Scan your cache

At the moment, cached files are never deleted from your local directory: when you download
a new revision of a branch, previous files are kept in case you need them again.
Therefore it can be useful to scan your cache directory in order to know which repos
and revisions are taking the most disk space. `huggingface_hub` provides a helper to
do so that can be used via `huggingface-cli` or in a Python script.

### Scan cache from the terminal
**Scan cache from the terminal**

The easiest way to scan your HF cache-system is to use the `scan-cache` command from
the `huggingface-cli` tool. This command scans the cache and prints a report with information
@@ -291,7 +384,7 @@ Done in 0.0s. Scanned 6 repo(s) for a total of 3.4G.
Got 1 warning(s) while scanning. Use -vvv to print details.
```

#### Grep example
**Grep example**

Since the output is in tabular format, you can combine it with any `grep`-like tools to
filter the entries. Here is an example to filter only revisions from the "t5-small"
@@ -304,7 +397,7 @@ t5-small model d0a119eedb3718e34c648e594394474cf95e0617
t5-small model d78aea13fa7ecd06c29e3e46195d6341255065d5 970.7M 9 1 week ago main /home/wauplin/.cache/huggingface/hub/models--t5-small/snapshots/d78aea13fa7ecd06c29e3e46195d6341255065d5
```

### Scan cache from Python
**Scan cache from Python**

For more advanced usage, use [`scan_cache_dir`], which is the Python utility called by
the CLI tool.
@@ -368,15 +461,15 @@ HFCacheInfo(
)
```

## Clean your cache
### Clean your cache

Scanning your cache is interesting but what you really want to do next is usually to
delete some portions to free up space on your drive. This is possible using the
`delete-cache` CLI command. You can also programmatically use the
[`~HFCacheInfo.delete_revisions`] helper from the [`HFCacheInfo`] object returned when
scanning the cache.

### Delete strategy
**Delete strategy**

To delete some cache, you need to pass a list of revisions to delete. The tool will
define a strategy to free up the space based on this list. It returns a
@@ -408,7 +501,7 @@ error is thrown. The deletion continues for other paths contained in the

</Tip>

### Clean cache from the terminal
**Clean cache from the terminal**

The easiest way to delete some revisions from your HF cache-system is to use the
`delete-cache` command from the `huggingface-cli` tool. The command has two modes. By
default, a TUI (Terminal User Interface) is displayed, letting you select which
revisions to delete. This TUI is currently in beta as it has not been tested on all
platforms. If the TUI doesn't work on your machine, you can disable it using the
`--disable-tui` flag.

#### Using the TUI
**Using the TUI**

This is the default mode. To use it, you first need to install extra dependencies by
running the following command:
@@ -461,7 +554,7 @@ Start deletion.
Done. Deleted 1 repo(s) and 0 revision(s) for a total of 3.1G.
```

#### Without TUI
**Without TUI**

As mentioned above, the TUI mode is currently in beta and is optional. It may be the
case that it doesn't work on your machine or that you don't find it convenient.
@@ -522,7 +615,7 @@ Example of command file:
# 9cfa5647b32c0a30d0adfca06bf198d82192a0d1 # Refs: main # modified 5 days ago
```

### Clean cache from Python
**Clean cache from Python**

For more flexibility, you can also use the [`~HFCacheInfo.delete_revisions`] method
programmatically. Here is a simple example; see the reference for details.