feat: deprecate TRANSFORMERS_CACHE, use HF_HUB_CACHE everywhere (IBM#89)
#### Motivation
`TRANSFORMERS_CACHE` is deprecated (slated for removal with Transformers
v5) and `HUGGINGFACE_HUB_CACHE` is legacy. This PR standardizes on
`HF_HUB_CACHE` to configure the cache. In addition, not all operations
and CLI commands were correctly reading `TRANSFORMERS_CACHE`, so we have
been setting both env vars anyway. After this change, everything should
work with only `HF_HUB_CACHE`.
#### Modifications
- Launcher inspects `HF_HUB_CACHE` to determine the model cache path
- `TRANSFORMERS_CACHE` and `HUGGINGFACE_HUB_CACHE` are still checked as
  well, but a deprecation warning is printed
- If multiple values are present and do not match, raise an error
- Launcher can resolve the default `HF_HUB_CACHE` so it does not need to
  be set (`HF_HOME` or its default can be used instead)
- Server CLI checks `TRANSFORMERS_CACHE` and prints a warning if it is set
- Server CLI returns an error if both `TRANSFORMERS_CACHE` and
  `HF_HUB_CACHE` are set with different values
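The resolution order described above can be illustrated with a minimal Python sketch. This is not the code from this PR: `resolve_hub_cache` is a hypothetical helper, and the exact-string comparison of paths and the choice of `ValueError` are assumptions made for illustration. The fallback to `$HF_HOME/hub`, with `HF_HOME` defaulting to `~/.cache/huggingface` (or `$XDG_CACHE_HOME/huggingface`), follows the documented `huggingface_hub` defaults.

```python
import os
import warnings

# Deprecated/legacy variable names listed in this PR.
LEGACY_VARS = ("TRANSFORMERS_CACHE", "HUGGINGFACE_HUB_CACHE")


def resolve_hub_cache(env=None):
    """Return the model cache path, preferring HF_HUB_CACHE (hypothetical helper)."""
    env = os.environ if env is None else env

    # Collect every cache variable that is set, warning on the legacy ones.
    found = {}
    for name in ("HF_HUB_CACHE", *LEGACY_VARS):
        value = env.get(name)
        if value:
            found[name] = value
            if name in LEGACY_VARS:
                warnings.warn(
                    f"{name} is deprecated; set HF_HUB_CACHE instead",
                    DeprecationWarning,
                    stacklevel=2,
                )

    # Assumption: values are compared as plain strings, and any mismatch
    # is an error rather than a silent pick.
    if len(set(found.values())) > 1:
        raise ValueError(f"conflicting cache paths: {found}")
    if found:
        return next(iter(found.values()))

    # Nothing set: fall back to $HF_HOME/hub, where HF_HOME itself
    # defaults to $XDG_CACHE_HOME/huggingface or ~/.cache/huggingface.
    cache_root = env.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    hf_home = env.get("HF_HOME") or os.path.join(cache_root, "huggingface")
    return os.path.join(hf_home, "hub")
```

For example, `resolve_hub_cache({"HF_HOME": "/data/hf"})` returns `/data/hf/hub` on a POSIX system, while passing `HF_HUB_CACHE` and `TRANSFORMERS_CACHE` with different values raises an error.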
---------
Signed-off-by: Travis Johnson <[email protected]>
README.md (2 additions & 2 deletions)
@@ -71,15 +71,15 @@ cd deployment
### Model configuration
-When deploying TGIS, the `MODEL_NAME` environment variable can contain either the full name of a model on the Hugging Face hub (such as `google/flan-ul2`) or an absolute path to a (mounted) model directory inside the container. In the former case, the `TRANSFORMERS_CACHE` and `HUGGINGFACE_HUB_CACHE` environment variables should be set to the path of a mounted directory containing a local HF hub model cache, see [this](deployment/base/patches/pvcs/pvc.yaml) kustomize patch as an example.
+When deploying TGIS, the `MODEL_NAME` environment variable can contain either the full name of a model on the Hugging Face hub (such as `google/flan-ul2`) or an absolute path to a (mounted) model directory inside the container. In the former case, the `HF_HUB_CACHE` environment variable should be set to the path of a mounted directory containing a local HF hub model cache, see [this](deployment/base/patches/pvcs/pvc.yaml) kustomize patch as an example.
### Downloading model weights
TGIS will not download model data at runtime. To populate the local HF hub cache with models so that it can be used per above, the image can be run with the following command:
-where `model_name` is the name of the model on the HF hub. Ensure that it's run with the same mounted directory and `TRANSFORMERS_CACHE` and `HUGGINGFACE_HUB_CACHE` environment variables, and that it has write access to this mounted filesystem.
+where `model_name` is the name of the model on the HF hub. Ensure that it's run with the same mounted directory and the `HF_HUB_CACHE` environment variable, and that it has write access to this mounted filesystem.
This will attempt to download weights in `.safetensors` format, and if those aren't in the HF hub will download pytorch `.bin` weights and then convert them to `.safetensors`.