Feature request
Since PR ggml-org/llama.cpp#9322 is still pending, could you provide a Docker image for the current version of https://github.com/OpenBMB/llama.cpp/tree/minicpm3?
I want to convert the model to .gguf format and quantize it, and it would be great if there were no need to compile llama.cpp myself.
Alternatively, could you upload a quantized version in .gguf format to Hugging Face?
Thanks in advance!