This repository was archived by the owner on Oct 25, 2024. It is now read-only.
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
outputs = model.generate(inputs)
```
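To illustrate roughly what 4-bit weight-only quantization does to the model weights, here is a pure-NumPy sketch of a simplified symmetric group-quantization scheme. This is an illustration only, not ITREX's actual kernel or the exact Q4_0 layout; the function names and the group size are our assumptions.

```python
import numpy as np

def quantize_q4_sym(weights, group_size=32):
    """Simplified symmetric 4-bit group quantization (illustration only,
    not the exact Q4_0 layout used by GGUF/NeuralSpeed).
    Each group of `group_size` values shares one fp scale; values are
    rounded to signed 4-bit integers in [-8, 7]."""
    w = weights.reshape(-1, group_size)
    # One scale per group so that the largest magnitude maps to +/-7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    """Reconstruct approximate fp32 weights from int4 codes and scales."""
    return (q * scales).reshape(-1).astype(np.float32)

w = np.random.randn(64).astype(np.float32)
q, s = quantize_q4_sym(w)
w_hat = dequantize(q, s)
```

The reconstruction error per element is bounded by half the group's scale, which is the trade-off `load_in_4bit=True` accepts in exchange for roughly 4x smaller weight storage.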
- You can also load GGUF format model from Huggingface, and we will use [NeuralSpeed](https://github.com/intel/neural-speed) to accelerate the inference on CPUs.
+ You can also load GGUF format models from Huggingface; only the Q4_0 GGUF format is supported for now.
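Since only Q4_0 GGUF files are supported, it can be worth sanity-checking that a downloaded file really is GGUF before handing it to the loader. A minimal sketch (the helper name is ours, not part of the library) that checks the four magic bytes every GGUF file starts with:

```python
GGUF_MAGIC = b"GGUF"  # all GGUF files begin with these four ASCII bytes

def is_gguf(path):
    """Return True if the file at `path` starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC
```

Checking the magic bytes does not tell you the quantization type (that lives deeper in the GGUF metadata), but it catches truncated or mislabeled downloads early.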
```python
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM