Commit 7097567

Force F32 compute in GLM4 ffn down
1 parent db52579 commit 7097567

1 file changed (+4, -0)

src/llama-graph.cpp

Lines changed: 4 additions & 0 deletions
@@ -803,6 +803,10 @@ ggml_tensor * llm_graph_context::build_ffn(
 
     if (down) {
         cur = build_lora_mm(down, cur);
+        if (arch == LLM_ARCH_GLM4) {
+            // GLM4 seems to have precision issues in F16
+            ggml_mul_mat_set_prec(cur, GGML_PREC_F32);
+        }
     }
 
     if (down_b) {
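
Note on the F16 comment: the commit does not say which failure mode GLM4 hits, so the following is only a minimal sketch of why an F16 accumulator can misbehave in a long dot product such as the FFN down projection. It does not use ggml or reproduce its kernels. The helpers `round_to_f16`, `dot_f16_acc`, and `dot_f32_acc` are hypothetical names introduced here, and binary16 is simulated by rounding ordinary floats, which is an approximation of real half-precision hardware.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Round x to the nearest IEEE binary16 value (result held in a float).
// Simulation only: assumes the default round-to-nearest-even mode.
static float round_to_f16(float x) {
    if (std::isnan(x) || std::isinf(x) || x == 0.0f) return x;
    int exp = 0;
    std::frexp(x, &exp);                      // x = m * 2^exp, 0.5 <= |m| < 1
    const int e = exp < -13 ? -13 : exp;      // clamp to the subnormal spacing
    const float scale = std::ldexp(1.0f, 11 - e);  // 11 significant bits
    const float r = std::nearbyint(x * scale) / scale;
    if (std::fabs(r) > 65504.0f) {            // largest finite binary16 value
        return std::copysign(std::numeric_limits<float>::infinity(), x);
    }
    return r;
}

// Dot product where products and the running sum are rounded to binary16.
static float dot_f16_acc(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float prod = round_to_f16(round_to_f16(a[i]) * round_to_f16(b[i]));
        acc = round_to_f16(acc + prod);
    }
    return acc;
}

// Same binary16 inputs, but the running sum is kept in F32.
static float dot_f32_acc(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += round_to_f16(a[i]) * round_to_f16(b[i]);
    }
    return acc;
}
```

With 4096 inputs of 1.0, the simulated F16 accumulator stalls at 2048 because 2049 is not representable in binary16, while the F32 accumulator reaches 4096. With 4096 inputs of 8.0, the F16 sum passes 65504 and becomes infinity, while the F32 sum is 262144. Either effect would be avoided by requesting F32 precision for the matmul, which is what the added `ggml_mul_mat_set_prec(cur, GGML_PREC_F32)` call does for the GLM4 down projection.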
