Thireus ☠ edited this page Mar 16, 2026 · 7 revisions

Welcome to the GGUF-Tool-Suite wiki!

This documentation covers the essential steps for getting started with the Tool Suite, which comprises 4 components:

  1. Produce recipe files - These specify the mix of quantized tensors that make up the GGUF model.
  2. Download GGUF model shards - These are the building blocks described in the GGUF model.
  3. Quantize your own shards - For advanced users who prefer to do it themselves.
  4. Benchmark models - For experts who wish to contribute by improving calibration data accuracy or adding support for new models.

tl;dr: If you are in a hurry and want to obtain and test the GGUFs produced by this tool suite within minutes, follow these 3 steps:

  • Step 1 - 🍳 Automatically make a GGUF recipe sized perfectly for your hardware with quant_assign.html
  • Step 2 - ☁️ Download your GGUF using your recipe with quant_downloader.html
  • Step 3 - 🚀 Run anywhere - Use llama.cpp, ik_llama.cpp, or any GGUF-compatible framework.
