CUDA: Hide latency of bias and gate-loading for fused `mul_mat_vec_q` #16847

Merged
am17an merged 1 commit into ggml-org:master from ORippler:osimons/prefetch_gate_bias_in_fused_mmvq
Oct 30, 2025

Commits

Commits on Oct 29, 2025