Commit 165a9ec

Merge branch 'main' into v0.12-release
2 parents 1f0f925 + d1bf4c0 commit 165a9ec

7 files changed: +267 -253 lines changed

docs/source/how-to-cache.mdx

Lines changed: 40 additions & 0 deletions
@@ -87,6 +87,46 @@ That `README.md` file is actually a symlink linking to the blob that has the has
 By creating the skeleton this way we open the mechanism to file sharing: if the same file was fetched in
 revision `bbbbbb`, it would have the same hash and the file would not need to be re-downloaded.
 
+### .no_exist (advanced)
+
+In addition to the `blobs`, `refs` and `snapshots` folders, you might also find a `.no_exist` folder
+in your cache. This folder keeps track of files that you've tried to download once but don't exist
+on the Hub. Its structure is the same as the `snapshots` folder with 1 subfolder per known revision:
+
+```
+<CACHE_DIR>/<REPO_NAME>/.no_exist/aaaaaa/config_that_does_not_exist.json
+```
+
+Unlike the `snapshots` folder, files are simple empty files (no symlinks). In this example,
+the file `"config_that_does_not_exist.json"` does not exist on the Hub for the revision `"aaaaaa"`.
+As it only stores empty files, this folder is negligible in terms of disk usage.
+
+So now you might wonder, why is this information even relevant?
+In some cases, a framework tries to load optional files for a model. Saving the non-existence
+of optional files makes it faster to load a model as it saves 1 HTTP call per possible optional file.
+This is for example the case in `transformers` where each tokenizer can support additional files.
+The first time you load the tokenizer on your machine, it will cache which optional files exist (and
+which don't) to make the loading time faster for the next initializations.
+
+To test if a file is cached locally (without making any HTTP request), you can use the [`try_to_load_from_cache`]
+helper. It will either return the filepath (if exists and cached), the object `_CACHED_NO_EXIST` (if non-existence
+is cached) or `None` (if we don't know).
+
+```python
+from huggingface_hub import try_to_load_from_cache, _CACHED_NO_EXIST
+
+filepath = try_to_load_from_cache()
+if isinstance(filepath, str):
+    # file exists and is cached
+    ...
+elif filepath is _CACHED_NO_EXIST:
+    # non-existence of file is cached
+    ...
+else:
+    # file is not cached
+    ...
+```
+
 ### In practice
 
 In practice, your cache should look like the following tree:

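The three outcomes described in the `.no_exist` section of this hunk (a cached path, a cached non-existence, or no information) can be illustrated with the standard library alone. The sketch below is not the library's implementation: `lookup_in_cache`, the `CACHED_NO_EXIST` sentinel, and the `models--user--demo` folder name are hypothetical names chosen for illustration, and the snapshot entry is a plain file rather than a symlink to a blob.

```python
import tempfile
from pathlib import Path
from typing import Union

# Hypothetical sentinel standing in for the library's `_CACHED_NO_EXIST` object.
CACHED_NO_EXIST = object()


def lookup_in_cache(repo_dir: Path, revision: str, filename: str) -> Union[str, object, None]:
    """Return the cached path, CACHED_NO_EXIST, or None when nothing is known."""
    # An empty marker file under .no_exist/<revision>/ records a known-missing file.
    if (repo_dir / ".no_exist" / revision / filename).is_file():
        return CACHED_NO_EXIST
    # An entry under snapshots/<revision>/ means the file was downloaded.
    cached_file = repo_dir / "snapshots" / revision / filename
    if cached_file.is_file():
        return str(cached_file)
    # Nothing recorded either way.
    return None


# Build a throwaway cache layout to exercise the three outcomes.
with tempfile.TemporaryDirectory() as tmp:
    repo_dir = Path(tmp) / "models--user--demo"  # hypothetical <REPO_NAME> folder
    (repo_dir / "snapshots" / "aaaaaa").mkdir(parents=True)
    (repo_dir / "snapshots" / "aaaaaa" / "README.md").write_text("hello")
    (repo_dir / ".no_exist" / "aaaaaa").mkdir(parents=True)
    (repo_dir / ".no_exist" / "aaaaaa" / "config_that_does_not_exist.json").touch()

    found = lookup_in_cache(repo_dir, "aaaaaa", "README.md")
    missing = lookup_in_cache(repo_dir, "aaaaaa", "config_that_does_not_exist.json")
    unknown = lookup_in_cache(repo_dir, "aaaaaa", "never_requested.json")
```

Because the marker files are empty, checking for non-existence costs one filesystem lookup instead of one HTTP call per optional file, which is the speed-up the added documentation describes.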
docs/source/package_reference/cache.mdx

Lines changed: 5 additions & 1 deletion
@@ -6,7 +6,11 @@ for a detailed presentation of caching at HF.
 
 ## Helpers
 
-## cached_assets_path
+### try_to_load_from_cache
+
+[[autodoc]] huggingface_hub.try_to_load_from_cache
+
+### cached_assets_path
 
 [[autodoc]] huggingface_hub.cached_assets_path

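The how-to snippet in this commit calls `try_to_load_from_cache()` with its arguments elided. As a minimal sketch, assuming `huggingface_hub` is installed and that the helper takes `repo_id`, `filename`, and an optional `cache_dir`, a call against an empty temporary cache directory might look like the following. The repo id and filename are placeholders, not values taken from the commit.

```python
import tempfile

from huggingface_hub import _CACHED_NO_EXIST, try_to_load_from_cache

# Point the helper at an empty, throwaway cache so the outcome does not
# depend on what is already cached on this machine. No HTTP request is made.
with tempfile.TemporaryDirectory() as empty_cache:
    result = try_to_load_from_cache(
        repo_id="gpt2",          # placeholder repo id
        filename="config.json",  # placeholder filename
        cache_dir=empty_cache,
    )

if isinstance(result, str):
    status = "cached"
elif result is _CACHED_NO_EXIST:
    status = "known to be missing"
else:
    status = "unknown"
```

With an empty cache the helper has no information about the file, so the last branch is taken. Against your default cache, drop the `cache_dir` argument.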
docs/source/package_reference/hf_api.mdx

Lines changed: 32 additions & 45 deletions
@@ -1,92 +1,79 @@
-# Hugging Face Hub API
+# HfApi Client
 
-Below is the documentation for the `HfApi` class, which serves as a Python wrapper for the Hugging Face
-Hub's API.
+Below is the documentation for the `HfApi` class, which serves as a Python wrapper for the Hugging Face Hub's API.
 
-All methods from the `HfApi` are also accessible from the package's root directly, both approaches are detailed
-below.
+All methods from the `HfApi` are also accessible from the package's root directly. Both approaches are detailed below.
 
-The following approach uses the method from the root of the package:
+Using the root method is more straightforward but the [`HfApi`] class gives you more flexibility.
+In particular, you can pass a token that will be reused in all HTTP calls. This is different
+than `huggingface-cli login` or [`login`] as the token is not persisted on the machine.
+It is also possible to provide a different endpoint or configure a custom user-agent.
 
 ```python
-from huggingface_hub import list_models
+from huggingface_hub import HfApi, list_models
 
+# Use root method
 models = list_models()
-```
-
-The following approach uses the `HfApi` class:
-
-```python
-from huggingface_hub import HfApi
-
-hf_api = HfApi()
-models = hf_api.list_models()
-```
-
-Using the [`HfApi`] class directly enables you to configure the client. In particular, a
-token can be passed to be authenticated in all API calls. This is different than
-`huggingface-cli login` or [`login`] as the token is not persisted on the machine. One
-can also specify a different endpoint than the Hugging Face's Hub (for example to interact
-with a Private Hub).
-
-```py
-from huggingface_hub import HfApi
 
+# Or configure a HfApi client
 hf_api = HfApi(
     endpoint="https://huggingface.co", # Can be a Private Hub endpoint.
     token="hf_xxx", # Token is not persisted on the machine.
 )
+models = hf_api.list_models()
 ```
 
-### HfApi
+## HfApi
 
 [[autodoc]] HfApi
 
-### RepoUrl
+## API Dataclasses
 
-[[autodoc]] huggingface_hub.hf_api.RepoUrl
-
-### ModelInfo
+### CommitInfo
 
-[[autodoc]] huggingface_hub.hf_api.ModelInfo
+[[autodoc]] huggingface_hub.hf_api.CommitInfo
 
 ### DatasetInfo
 
 [[autodoc]] huggingface_hub.hf_api.DatasetInfo
 
-### SpaceInfo
-
-[[autodoc]] huggingface_hub.hf_api.SpaceInfo
-
-### RepoFile
+### GitRefInfo
 
-[[autodoc]] huggingface_hub.hf_api.RepoFile
+[[autodoc]] huggingface_hub.hf_api.GitRefInfo
 
 ### GitRefs
 
 [[autodoc]] huggingface_hub.hf_api.GitRefs
 
-### GitRefInfo
+### ModelInfo
 
-[[autodoc]] huggingface_hub.hf_api.GitRefInfo
+[[autodoc]] huggingface_hub.hf_api.ModelInfo
 
-### CommitInfo
+### RepoFile
 
-[[autodoc]] huggingface_hub.hf_api.CommitInfo
+[[autodoc]] huggingface_hub.hf_api.RepoFile
+
+### RepoUrl
+
+[[autodoc]] huggingface_hub.hf_api.RepoUrl
+
+### SpaceInfo
+
+[[autodoc]] huggingface_hub.hf_api.SpaceInfo
 
 ### UserLikes
 
 [[autodoc]] huggingface_hub.hf_api.UserLikes
 
-## `create_commit` API
+## CommitOperation
 
 Below are the supported values for [`CommitOperation`]:
 
 [[autodoc]] CommitOperationAdd
 
 [[autodoc]] CommitOperationDelete
 
-## Hugging Face local storage
+## Token helper
 
 `huggingface_hub` stores the authentication information locally so that it may be re-used in subsequent
 methods.
@@ -95,7 +82,7 @@ It does this using the [`HfFolder`] utility, which saves data at the root of the
 
 [[autodoc]] HfFolder
 
-## Filtering helpers
+## Search helpers
 
 Some helpers to filter repositories on the Hub are available in the `huggingface_hub` package.
 