ggml : block interleaving support for Q4_K quantization on AArch64
* new quantization type: block_q4_kx4 with offline repack implementation
* new quantize path: add NEON impl for ggml_quantize_mat_q8_K_4x8
* new gemv kernel: ggml_gemv_q4_K_4x8_q8_K NEON kernel for GGML_OP_MUL_MAT_ID/GGML_OP_MUL_MAT
* new gemm kernel: ggml_gemm_q4_K_4x8_q8_K NEON kernel for GGML_OP_MUL_MAT_ID/GGML_OP_MUL_MAT
* performance boost for both prompt processing (S_PP) and token generation (S_TG) throughput
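
For readers unfamiliar with block interleaving, the repack idea can be sketched roughly as follows. This is an illustration only, not the code from this commit: it assumes the `4x8` suffix means four rows interleaved in 8-byte groups, and the helper name `interleave_4x8` is hypothetical. The real block_q4_kx4 layout must also pack the Q4_K scales and mins, which this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ROWS = 4, CHUNK = 8 };

/* Hypothetical helper (not part of ggml): interleave ROWS equally sized
 * byte rows into dst in CHUNK-byte pieces, producing
 *   row0[0..7], row1[0..7], row2[0..7], row3[0..7], row0[8..15], ...
 * row_bytes must be a multiple of CHUNK; dst must hold ROWS * row_bytes. */
static void interleave_4x8(uint8_t *dst,
                           const uint8_t *const src[ROWS],
                           size_t row_bytes) {
    size_t chunks = row_bytes / CHUNK;
    for (size_t c = 0; c < chunks; ++c) {
        for (size_t r = 0; r < ROWS; ++r) {
            /* Chunk c of row r lands at slot (c * ROWS + r). */
            memcpy(dst + (c * ROWS + r) * CHUNK, src[r] + c * CHUNK, CHUNK);
        }
    }
}
```

With the data laid out this way, a kernel can load the matching 8-byte groups of four weight rows from contiguous memory and accumulate four output rows per pass, which is the usual motivation for doing the repack offline.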
---------
Co-authored-by: yuanjia111 <[email protected]>