This is the repository for the LinkedIn Learning course Advanced Quantization Techniques for Large Language Models. The full course is available from LinkedIn Learning.
Discover cutting-edge quantization techniques for large language models, focusing on the algorithms and optimization strategies that deliver the best performance. Instructor Nayan Saxena begins with the mathematical foundations before progressing through advanced methods, including GPTQ, AWQ, and SmoothQuant, with hands-on examples in Google Colab. Along the way, gather quick tips to master critical concepts such as precision formats, calibration strategies, and evaluation methodologies. Combining theoretical principles with practical applications, this course equips you with in-demand skills to significantly reduce model size and accelerate inference while maintaining performance quality.
Learning objectives:
- Analyze the mathematical foundations of quantization and their impact on transformer architectures.
- Apply state-of-the-art quantization techniques including GPTQ, AWQ, and SmoothQuant to LLMs.
- Evaluate the trade-offs between different quantization approaches using appropriate metrics.
- Optimize quantization results through advanced calibration strategies.
- Compare and select quantization methods based on model architecture and use case requirements.
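To give a flavor of the mathematical foundations the course builds on, here is a minimal, illustrative sketch of symmetric absmax int8 quantization in NumPy. This is a generic example for orientation, not code from the course materials; the function names are hypothetical.

```python
import numpy as np

def quantize_absmax(weights: np.ndarray):
    """Symmetric int8 quantization: scale by the absolute maximum value."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale

# Toy weight tensor: each value is reconstructed to within half a
# quantization step (scale / 2) of the original.
w = np.array([0.5, -1.2, 0.03, 2.4], dtype=np.float32)
q, s = quantize_absmax(w)
w_hat = dequantize(q, s)
```

Methods like GPTQ and AWQ covered in the course go well beyond this naive rounding, using calibration data to decide how each weight should be quantized.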
Nayan Saxena
Deep Learning Expert
Check out my other courses on LinkedIn Learning.