Home

Thireus ☠ edited this page Mar 16, 2026 · 7 revisions

Welcome to the GGUF-Tool-Suite wiki!

This documentation covers the essential steps to start using this Tool Suite, which consists of 4 main aspects:

Produce recipe files - Recipes specify the mix of quantized tensors that make up the GGUF model.
Download GGUF model shards - Shards are the building blocks from which the GGUF model is assembled.
Quantize your own shards - For advanced users who wish to DIY.
Benchmark models - For experts who wish to contribute by improving calibration data accuracy or adding support for new models.

tl;dr: If you are in a hurry and want to obtain and test the GGUFs produced by this tool suite within minutes, follow these 3 steps:

Step 1 - 🍳 Automatically make a GGUF recipe sized perfectly for your hardware with quant_assign.html
Step 2 - ☁️ Download your GGUF using your recipe with quant_downloader.html
Step 3 - 🚀 Run anywhere - Use llama.cpp, ik_llama.cpp, or any GGUF-compatible framework.