
Releases: peter277/llama.cpp

b6121 (09 Aug 04:24, commit e54d41b)


gguf-py : add Numpy MXFP4 de/quantization support (#15111)

* gguf-py : add MXFP4 de/quantization support

* ggml-quants : handle zero amax for MXFP4
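For context, here is a minimal NumPy sketch of what MXFP4 dequantization involves, assuming the OCP MX block layout used by ggml: 32 elements per block, one shared E8M0 scale byte followed by 16 bytes of packed E2M1 (FP4) nibbles, with low nibbles holding the first 16 elements. The function name `dequantize_mxfp4_blocks` and the input array layout are illustrative assumptions, not the actual gguf-py API.

```python
import numpy as np

# E2M1 magnitudes indexed by the low 3 bits of each 4-bit code
# (the high bit of the nibble is the sign).
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def dequantize_mxfp4_blocks(blocks: np.ndarray) -> np.ndarray:
    """blocks: uint8 array of shape (n_blocks, 17): [E8M0 scale, 16 packed bytes]."""
    scales_e8m0 = blocks[:, 0].astype(np.int32)   # shared per-block exponent
    packed = blocks[:, 1:]                        # (n_blocks, 16) packed nibbles

    # Unpack to (n_blocks, 32) 4-bit codes: low nibbles are elements 0..15,
    # high nibbles are elements 16..31 (illustrative ordering).
    codes = np.concatenate([packed & 0x0F, packed >> 4], axis=1)

    # Decode E2M1: bit 3 is the sign, bits 0..2 index the magnitude table.
    sign = np.where(codes & 0x8, np.float32(-1.0), np.float32(1.0))
    mags = E2M1_VALUES[codes & 0x7]

    # E8M0 is a pure power-of-two scale with bias 127.
    scale = np.ldexp(np.float32(1.0), scales_e8m0 - 127)
    return sign * mags * scale[:, None]
```

The second bullet concerns the quantization direction: when a block is all zeros, its amax is zero and the shared exponent cannot be derived from it (e.g. via a logarithm), so the quantizer has to fall back to a safe scale for that block instead.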