Commit fe625c6

force patch_embd weights to f32
1 parent de56279 commit fe625c6

File tree

1 file changed: +1 −1

convert_hf_to_gguf.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -302,7 +302,7 @@ def prepare_tensors(self):
         data_qtype: gguf.GGMLQuantizationType | bool = self.tensor_force_quant(name, new_name, bid, n_dims)

         # Most of the codebase that takes in 1D tensors or norms only handles F32 tensors
-        if n_dims <= 1 or new_name.endswith("_norm.weight"):
+        if n_dims <= 1 or new_name.endswith("_norm.weight") or ".patch_embd.weight" in new_name:
             data_qtype = gguf.GGMLQuantizationType.F32

         # Conditions should closely match those in llama_model_quantize_internal in llama.cpp
```
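The change extends the forced-F32 condition with a substring match, so any tensor whose name contains `.patch_embd.weight` keeps full precision instead of being quantized. A minimal sketch of the selection predicate in isolation (`should_force_f32` is a hypothetical helper for illustration, not a function in the converter):

```python
def should_force_f32(name: str, n_dims: int) -> bool:
    # Mirrors the converter's condition after this commit: force F32 for
    # 1D tensors, norm weights, and vision patch-embedding weights.
    return (
        n_dims <= 1
        or name.endswith("_norm.weight")
        or ".patch_embd.weight" in name
    )

# A patch-embedding tensor is now forced to F32 even though it is
# multi-dimensional; an ordinary attention weight is left eligible
# for normal quantization.
print(should_force_f32("v.patch_embd.weight", 4))   # True
print(should_force_f32("blk.0.attn_q.weight", 2))   # False
```

The substring check (rather than an `endswith` match) also catches related names such as a second patch-embedding tensor with a suffix, which is why `in` is used here.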
