How to integrate custom matrix-vector/matmul kernels? #15231

neel04 · 2025-08-11T09:08:38Z

I've been reading through a few papers and would love to integrate new CPU kernels, especially for extreme-quantization scenarios.

Looking at some forks, the common way to integrate your own kernel seems to be writing a vector dot-product routine that hooks into llama.cpp, which then calls it row by row to build the corresponding matrix-vector product.

However, my algorithm needs to compute the full matrix-vector or matrix-matrix product itself to be efficient. Any pointers on how to integrate my own kernels at that level? I'm not familiar enough with C++ or llama.cpp's huge codebase to answer this myself.
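To make the distinction concrete, here is a minimal sketch of the two integration styles in plain C++. Every name in it (`vec_dot_fn`, `matvec_fn`, `matvec_via_vec_dot`, `matvec_blocked`) is hypothetical and chosen for illustration; none of it is llama.cpp or ggml API, and it uses unquantized `float` data to stay short. The first style hands the kernel one row at a time, while the second gives it the whole matrix so it can pick its own traversal order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// NOTE: all names below are hypothetical; this is not llama.cpp/ggml API.

// Style 1: the framework owns the row loop and only asks for a dot product.
using vec_dot_fn = float (*)(const float * a, const float * b, std::size_t n);

float dot_f32(const float * a, const float * b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// y = W * x, with W stored row-major as rows x cols.
// The custom kernel only ever sees a single row of W per call.
void matvec_via_vec_dot(vec_dot_fn dot, const float * W, const float * x,
                        float * y, std::size_t rows, std::size_t cols) {
    assert(dot != nullptr);
    for (std::size_t r = 0; r < rows; ++r) {
        y[r] = dot(W + r * cols, x, cols);
    }
}

// Style 2: the kernel receives the whole matrix and chooses its own loop order.
using matvec_fn = void (*)(const float * W, const float * x, float * y,
                           std::size_t rows, std::size_t cols);

// Example whole-matrix kernel: walks W in column blocks so that each slice
// of x is reused across every row before moving on to the next slice.
void matvec_blocked(const float * W, const float * x, float * y,
                    std::size_t rows, std::size_t cols) {
    constexpr std::size_t BLOCK = 4; // arbitrary block width for the sketch
    std::fill(y, y + rows, 0.0f);
    for (std::size_t c0 = 0; c0 < cols; c0 += BLOCK) {
        const std::size_t c1 = std::min(c0 + BLOCK, cols);
        for (std::size_t r = 0; r < rows; ++r) {
            float acc = 0.0f;
            for (std::size_t c = c0; c < c1; ++c) {
                acc += W[r * cols + c] * x[c];
            }
            y[r] += acc;
        }
    }
}
```

Both functions compute the same `y`; the difference is only in who controls the loop over the matrix, which is the hook I'm asking about.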

Replies: 0 comments
