I'm looking to mirror this change ggml-org/llama.cpp#5780 for Candle.
I have some experience with Rust and have a repo that uses the same assembly instructions (see here) as the above PR, but I need some help/guidance integrating it with Candle.
- The license seems to be separate for the GEMV- and GEMM-specific code in llama.cpp, and I'm not sure what the best option is here. In llama.cpp they kept it in a separate file with its own license, see here.
- The GgmlType trait uses vec_dot and loops over it to calculate the matmul. For these interleaved types, llama.cpp runs the whole matmul directly in assembly, so there is no associated vec_dot function to place into the GgmlType trait. Should I modify this trait or create another one for these interleaved types? What is the best way to handle this?
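
To make the second question concrete, here is a minimal sketch of the "create another trait" option. Everything in it is hypothetical and is not Candle's actual API: `VecDot` stands in for the vec_dot part of GgmlType, `MatVec` is the new trait, and plain `f32` slices stand in for quantized blocks. The interleaved layout (4 rows stored column by column) is only an illustration, not the layout used in the llama.cpp PR, and the inner loop marks where an assembly kernel would go.

```rust
/// Hypothetical stand-in for the per-row dot product that GgmlType exposes.
trait VecDot {
    fn vec_dot(lhs: &[f32], rhs: &[f32]) -> f32;
}

/// Hypothetical new trait: computes `dst = W * x` for a matrix with `cols` columns.
/// Each implementor decides how `weights` is laid out in memory.
trait MatVec {
    fn matvec(weights: &[f32], x: &[f32], dst: &mut [f32], cols: usize);
}

/// Shared helper for types that do have a vec_dot: one dot product per row
/// of a row-major weight matrix.
fn matvec_via_vec_dot<T: VecDot>(weights: &[f32], x: &[f32], dst: &mut [f32], cols: usize) {
    for (row, out) in weights.chunks_exact(cols).zip(dst.iter_mut()) {
        *out = T::vec_dot(row, x);
    }
}

/// A regular (row-major) type: implements VecDot and forwards MatVec to the helper.
struct Plain;

impl VecDot for Plain {
    fn vec_dot(lhs: &[f32], rhs: &[f32]) -> f32 {
        lhs.iter().zip(rhs).map(|(a, b)| a * b).sum()
    }
}

impl MatVec for Plain {
    fn matvec(weights: &[f32], x: &[f32], dst: &mut [f32], cols: usize) {
        matvec_via_vec_dot::<Plain>(weights, x, dst, cols);
    }
}

/// An interleaved type: groups of 4 rows are stored column by column, i.e.
/// [r0c0, r1c0, r2c0, r3c0, r0c1, r1c1, ...]. It has no vec_dot at all and
/// implements only MatVec.
struct Interleaved4;

impl MatVec for Interleaved4 {
    fn matvec(weights: &[f32], x: &[f32], dst: &mut [f32], cols: usize) {
        // Each group of 4 rows occupies 4 * cols contiguous values.
        for (group, out) in weights.chunks_exact(4 * cols).zip(dst.chunks_exact_mut(4)) {
            out.fill(0.0);
            // This loop is where a hand-written GEMV kernel would be called instead.
            for (c, lanes) in group.chunks_exact(4).enumerate() {
                for (o, w) in out.iter_mut().zip(lanes) {
                    *o += w * x[c];
                }
            }
        }
    }
}
```

With this split, the matmul entry point would only require `MatVec`, so existing types keep their vec_dot path through the helper while interleaved types skip it entirely. The cost is a second trait to maintain next to GgmlType; the alternative of adding a defaulted matmul method to GgmlType would still force interleaved types to provide some vec_dot.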