Feature Request: Standalone Notebook for Unsloth Dynamic Quant 2.0 #4033
Replies: 4 comments
-
We don't have a notebook that performs dynamic quantization. You can, however, perform quantization-aware training (QAT): https://unsloth.ai/docs/blog/quantization-aware-training-qat
-
Thank you for the clarification!
-
Strong +1 for a standalone Dynamic Quant 2.0 notebook! Why this matters:
What I'd love to see included:
# 1. Model loading with quant config
model = FastModel.from_pretrained(
    model_name,
    dynamic_quant_config={
        "bits": 4,
        "strategy": "dynamic",
        "calibration_samples": 128,
    },
)
# 2. Calibration step (visualized)
calibration_stats = model.calibrate(dataset)
plot_quantization_ranges(calibration_stats)
# 3. Evaluation before/after
print(f"Original perplexity: {original_ppl}")
print(f"Quantized perplexity: {quant_ppl}")
print(f"Size reduction: {size_reduction}%")
We do a lot of quantization work at RevolutionAI for edge deployments. A well-documented notebook would be incredibly useful for client demos! Happy to help test or contribute examples if this moves forward.
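The before/after numbers in step 3 can be computed with a few lines of plain Python. This is a minimal sketch, not part of Unsloth: `perplexity` and `size_reduction_percent` are hypothetical helper names, and the log-probabilities and byte counts below are made-up illustrative inputs rather than measurements.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def size_reduction_percent(original_bytes, quantized_bytes):
    """Percentage by which the quantized checkpoint is smaller than the original."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100.0 * (1.0 - quantized_bytes / original_bytes)


# Illustrative inputs only (hypothetical, not from any real model).
original_logprobs = [math.log(0.5), math.log(0.25), math.log(0.5)]
quantized_logprobs = [math.log(0.5), math.log(0.2), math.log(0.4)]

original_ppl = perplexity(original_logprobs)
quant_ppl = perplexity(quantized_logprobs)
size_reduction = size_reduction_percent(16_000_000_000, 5_000_000_000)

print(f"Original perplexity: {original_ppl:.3f}")
print(f"Quantized perplexity: {quant_ppl:.3f}")
print(f"Size reduction: {size_reduction:.1f}%")
```

In a real notebook the per-token log-probabilities would come from running the same held-out text through the model before and after quantization, so that the two perplexities are directly comparable.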
-
+1 for standalone Dynamic Quant 2.0 notebook! Use cases this would enable:
Proposed notebook structure:
# Cell 1: Setup
!pip install unsloth
# Cell 2: Load model (any HF model)
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    "meta-llama/Llama-3-8B-Instruct"
)
# Cell 3: Apply Dynamic Quant 2.0
model = FastLanguageModel.dynamic_quant_2(
    model,
    bits=4,
    calibration_data=calibration_dataset,
)
# Cell 4: Save/Upload
model.save_pretrained("llama-3-8b-dq2")
model.push_to_hub("username/llama-3-8b-dq2")
# Cell 5: Test inference
output = model.generate(...)
Colab considerations:
We quantize models regularly at Revolution AI; a standalone notebook would streamline our pipeline significantly.
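For readers new to the idea behind Cell 3, the following toy example shows one generic way a calibration-driven quantizer can choose a bit width per layer: symmetric absmax round-to-nearest quantization, with a fallback from 4 to 8 bits when the layer's output error on calibration inputs is too high. It is a sketch under stated assumptions, not Unsloth's Dynamic Quant 2.0 algorithm; every function name, the error threshold, and the example weights are hypothetical.

```python
def quantize_dequantize(weights, bits):
    """Symmetric absmax quantization: round to signed integers, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    absmax = max(abs(w) for w in weights)
    if absmax == 0.0:
        return list(weights)
    scale = absmax / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def output_error(weights, approx, calibration_inputs):
    """Mean absolute difference in the layer output (a dot product) over calibration inputs."""
    total = 0.0
    for x in calibration_inputs:
        exact = sum(w * xi for w, xi in zip(weights, x))
        rough = sum(a * xi for a, xi in zip(approx, x))
        total += abs(exact - rough)
    return total / len(calibration_inputs)


def choose_bits(layers, calibration_inputs, max_error, candidates=(4, 8)):
    """Pick, per layer, the first candidate bit width (sorted ascending) within max_error."""
    plan = {}
    for name, weights in layers.items():
        plan[name] = candidates[-1]  # fall back to the widest option
        for bits in candidates:
            approx = quantize_dequantize(weights, bits)
            if output_error(weights, approx, calibration_inputs) <= max_error:
                plan[name] = bits
                break
    return plan


# Hypothetical toy "model": two layers, each a single weight vector.
layers = {
    "smooth": [1.75, -1.75, 0.75, -0.5],  # values lie on the 4-bit grid (scale 0.25)
    "outlier": [8.0, 0.11, -0.13, 0.09],  # one large weight stretches the 4-bit grid
}
calibration_inputs = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0]]

plan = choose_bits(layers, calibration_inputs, max_error=0.05)
print(plan)
```

The layer with the outlier weight loses its small weights entirely at 4 bits, so the toy selector keeps it at 8 bits, while the well-behaved layer stays at 4 bits. That is the general intuition behind mixing precisions per layer instead of applying one bit width everywhere.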
-
Hello Unsloth team,
I would like to request a standalone notebook specifically for Unsloth Dynamic Quant 2.0.
Currently, Dynamic Quant 2.0 is sometimes included within fine-tuning notebooks, but there does not appear to be a dedicated notebook focused solely on performing Dynamic Quant 2.0 quantization.
Having a standalone notebook would be very helpful for:
- Users who only want to quantize existing Hugging Face models
- Re-quantizing merged or fine-tuned models
- Running Dynamic Quant 2.0 independently, without going through the full fine-tuning pipeline
- Providing a clearer educational reference for how Dynamic Quant 2.0 works in isolation
A Google Colab–ready version would be especially valuable for accessibility.
Dynamic Quant 2.0 is a very powerful feature, and I believe a dedicated notebook would improve usability and adoption.
Thank you for your great work on Unsloth.