Closed
Description
What happened?
I am trying to quantize the bf16 GGUF using the imatrix file from here: https://huggingface.co/bartowski/nvidia_Llama-3_3-Nemotron-Super-49B-v1_5-GGUF/tree/main
quantizing with the following command:
./llama-quantize --imatrix ../models/nvidia_Llama-3_3-Nemotron-Super-49B-v1_5-imatrix.gguf --allow-requantize --output-tensor-type q8_0 --token-embedding-type q8_0 ../models/nvidia_Llama-3_3-Nemotron-Super-49B-v1_5-bf16-00001-of-00003.gguf model-q4_k_l.gguf q4_k_m
fails with the error message:
load_imatrix: failed reading data for entry 1
Quantization works when the --imatrix flag is omitted.
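One thing that may be worth checking (an assumption, not a confirmed diagnosis): the imatrix file used above has a .gguf extension, and a build whose load_imatrix only understands the older binary imatrix layout could plausibly fail while reading the first entries of such a file. The sketch below only inspects the first four bytes of a file to see whether it is a GGUF container, which always starts with the ASCII magic "GGUF". The function name sniff_imatrix_format is hypothetical and is not part of llama-quantize.

```python
GGUF_MAGIC = b"GGUF"  # every GGUF container begins with these four bytes


def sniff_imatrix_format(path):
    """Classify an imatrix file by its first four bytes.

    Returns "gguf" when the file starts with the GGUF magic,
    otherwise "legacy-or-unknown". This does not validate the
    rest of the file; it is only a quick diagnostic.
    """
    with open(path, "rb") as f:
        head = f.read(4)
    return "gguf" if head == GGUF_MAGIC else "legacy-or-unknown"


# Example call (path taken from the command above):
# print(sniff_imatrix_format(
#     "../models/nvidia_Llama-3_3-Nemotron-Super-49B-v1_5-imatrix.gguf"))
```

If this reports "gguf" and the build in use predates GGUF imatrix support, trying an imatrix in the older format or a newer build would be a reasonable next step.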
Name and Version
version: 1 (bb4c917)
built with MSVC 19.44.35213.0 for
What operating system are you seeing the problem on?
Windows
Relevant log output
load_imatrix: failed reading data for entry 1