* Merge with fork
Co-authored-by: Guoqing Bao <[email protected]>
* Update sdpa
* Fix flash attn bf16 case
* Metal fixes
* Add metal methods
* Add new_private_buffer
* Fix metal tests
* Format
* Apply review comments
* Update CI (#3194)
* Update CI
* I have no clue what was going on with this maturin file, but I don't like it
* update cuda container options
* Add compute cap to cuda wf
* Fix rust toolchain call
* update cuda ci runner and bindgen_cuda
* Add initial support for imatrix quantization (#3193)
* add clear kv cache to quantized qwen3 weights (#3189)
* Fix metal bug
* Apply review comments
* Fix merge
* Add lld installation and test steps for Linux (#3213)
---------
Co-authored-by: ivarflakstad <[email protected]>
Co-authored-by: anonenity <[email protected]>
Co-authored-by: Nicolas PASCAL <[email protected]>