@@ -87,6 +87,46 @@ That `README.md` file is actually a symlink linking to the blob that has the has
8787By creating the skeleton this way we open the mechanism to file sharing: if the same file was fetched in
8888revision ` bbbbbb ` , it would have the same hash and the file would not need to be re-downloaded.
8989
90+ ### .no_exist (advanced)
91+
92+ In addition to the ` blobs ` , ` refs ` and ` snapshots ` folders, you might also find a ` .no_exist ` folder
93+ in your cache. This folder keeps track of files that you've tried to download once but don't exist
94+ on the Hub. Its structure is the same as the ` snapshots ` folder with 1 subfolder per known revision:
95+
96+ ```
97+ <CACHE_DIR>/<REPO_NAME>/.no_exist/aaaaaa/config_that_does_not_exist.json
98+ ```
99+
100+ Unlike the ` snapshots ` folder, files are simple empty files (no symlinks). In this example,
101+ the file ` "config_that_does_not_exist.json" ` does not exist on the Hub for the revision ` "aaaaaa" ` .
102+ As it only stores empty files, this folder is negligible in terms of disk usage.
103+
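To see which files have been recorded as missing, you can walk the `.no_exist` folder directly. The following is a minimal sketch based only on the layout described above; `list_cached_missing_files` is a hypothetical helper name, not part of `huggingface_hub`, and `repo_cache_dir` is assumed to point at `<CACHE_DIR>/<REPO_NAME>`.

```python
from pathlib import Path


def list_cached_missing_files(repo_cache_dir):
    """Return (revision, relative_filename) pairs recorded under `.no_exist`.

    Hypothetical helper for illustration: it only reads the folder layout
    described above and never contacts the Hub.
    """
    no_exist_dir = Path(repo_cache_dir) / ".no_exist"
    if not no_exist_dir.is_dir():
        # Nothing has been recorded as missing for this repo yet.
        return []

    entries = []
    for revision_dir in sorted(no_exist_dir.iterdir()):
        if not revision_dir.is_dir():
            continue
        # Each empty file below a revision folder marks one missing file.
        for marker in sorted(revision_dir.rglob("*")):
            if marker.is_file():
                relative_name = marker.relative_to(revision_dir).as_posix()
                entries.append((revision_dir.name, relative_name))
    return entries
```

For the example path above, this would yield the pair `("aaaaaa", "config_that_does_not_exist.json")`.
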
104+ So now you might wonder, why is this information even relevant?
105+ In some cases, a framework tries to load optional files for a model. Saving the non-existence
106+ of optional files makes it faster to load a model as it saves 1 HTTP call per possible optional file.
107+ This is for example the case in ` transformers ` where each tokenizer can support additional files.
108+ The first time you load the tokenizer on your machine, it will cache which optional files exist (and
109+ which don't) to make loading faster for subsequent initializations.
110+
111+ To test if a file is cached locally (without making any HTTP request), you can use the [ ` try_to_load_from_cache ` ]
112+ helper. It will either return the filepath (if the file exists and is cached), the object ` _CACHED_NO_EXIST ` (if its non-existence
113+ is cached) or ` None ` (if we don't know).
114+
115+ ``` python
116+ from huggingface_hub import try_to_load_from_cache, _CACHED_NO_EXIST
117+
118+ filepath = try_to_load_from_cache(repo_id="<repo_id>", filename="<filename>")
119+ if isinstance(filepath, str):
120+     # file exists and is cached
121+     ...
122+ elif filepath is _CACHED_NO_EXIST:
123+     # non-existence of file is cached
124+     ...
125+ else:
126+     # file is not cached
127+     ...
128+ ```
129+
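To make the three possible outcomes concrete, here is a simplified, standard-library-only sketch of the same lookup against the folder layout described above. It is not the library's implementation: `lookup_in_cache` and `CACHED_NO_EXIST` are hypothetical stand-ins, and the sketch assumes `revision` is already the name of a folder under `snapshots` (the real helper may do more, such as resolving a branch name first).

```python
from pathlib import Path

# Stand-in sentinel for this sketch only; the library exposes its own
# `_CACHED_NO_EXIST` object for the same purpose.
CACHED_NO_EXIST = object()


def lookup_in_cache(repo_cache_dir, revision, filename):
    """Return a path, CACHED_NO_EXIST, or None, without any HTTP request."""
    repo_cache_dir = Path(repo_cache_dir)

    # Case 1: the file was downloaded and is reachable through the snapshot.
    snapshot_file = repo_cache_dir / "snapshots" / revision / filename
    if snapshot_file.exists():
        return str(snapshot_file)

    # Case 2: an empty marker records that the file does not exist on the Hub.
    if (repo_cache_dir / ".no_exist" / revision / filename).is_file():
        return CACHED_NO_EXIST

    # Case 3: nothing is known locally about this file.
    return None
```

Returning a dedicated sentinel object rather than `None` for the second case is what lets callers distinguish "known to be missing" from "never checked" with a simple identity test.
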
90130### In practice
91131
92132In practice, your cache should look like the following tree: