Finally, something that loads GGUF models on an old GPU.

Thank you very much for this!
Every llama.cpp build I installed to run LLMs in ComfyUI just loaded the GGUF into system RAM.
Other people's custom llama.cpp builds, made to be compatible with CUDA 12.8 and Python 13.2, always gave errors.
This works great, and with the latest version too. Hopefully newer versions will not break it. 🙏