* Workaround timm dependencies for readme examples.
The current Synapse torch version is 2.10.0a0, which is lower than the 2.10.0 declared as required by the shipped torchvision version. As a result, when the pip dependency resolver is triggered while installing timm (which depends on torchvision), it reinstalls torch in order to satisfy the torchvision requirement.
Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
* Remove deprecated text-generation README tests.
Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
---------
Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
> For multi-card usage, the number of cards loaded and used must match the number of cards used when saving.
### Loading FP8 Checkpoints from Hugging Face
You can load pre-quantized FP8 models using the `--load_quantized_model_with_inc` argument. The `model_name_or_path` should be a model name from [Neural Magic](https://huggingface.co/collections/neuralmagic/fp8-llms-for-vllm-666742ed2b78b7ac8df13127) or a path to FP8 Checkpoints saved in Hugging Face format.
Below is an example of how to load `neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8` on two cards.
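A minimal sketch of such a command, assuming the Optimum Habana text-generation example layout (the `gaudi_spawn.py` launcher and `run_generation.py` script; exact script paths and additional performance flags may differ in your checkout):

```shell
# Launch across 2 HPU cards via the multi-card launcher (assumed layout);
# --load_quantized_model_with_inc loads the pre-quantized FP8 checkpoint.
python ../gaudi_spawn.py --world_size 2 run_generation.py \
  --model_name_or_path neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 \
  --load_quantized_model_with_inc \
  --use_kv_cache \
  --bf16
```

The `--world_size 2` value here must match the number of cards the checkpoint expects, per the note above.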
By default, the script runs the sample outlined in the [BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 notebook](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/blob/main/biomed_clip_example.ipynb). You can also run other OpenCLIP models by specifying the model, classifier labels, and image URL(s) like so:
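As a hedged illustration of that invocation, a command of the following shape, where the script name `run_clip.py` and the flag names `--model_name_or_path`, `--labels`, and `--image_urls` are placeholders, not confirmed options of the actual script:

```shell
# Hypothetical invocation: script and flag names are assumptions
python run_clip.py \
  --model_name_or_path laion/CLIP-ViT-B-32-laion2B-s34B-b79K \
  --labels "a photo of a dog" "a photo of a cat" \
  --image_urls "https://example.com/image1.jpg"
```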