Course Title: Model Quantization with LLM Compressor
Description: This course provides a comprehensive, hands-on guide to model quantization, one of the most effective techniques for reducing the cost and memory footprint of Large Language Models. You will learn to use Red Hat's LLM Compressor toolkit to apply quantization algorithms such as SmoothQuant and GPTQ and schemes such as W4A16. The course covers both manual quantization in a Jupyter Notebook, for deep understanding, and automation of the entire workflow with Kubeflow Pipelines, for production-ready, repeatable results on OpenShift AI.
Duration: 2.5 hours
On completing this course, you should be able to:
- Understand the fundamentals of model quantization and its impact on cost, memory, and performance.
- Manually quantize an LLM using LLM Compressor, SmoothQuant, and GPTQ in a Jupyter Notebook (a code sketch follows this list).
- Build and execute an automated, end-to-end quantization workflow using Kubeflow Pipelines on OpenShift AI (a pipeline skeleton also follows this list).
- Articulate the business value of quantization and how it solves common enterprise challenges related to model size and deployment cost.
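To give a flavor of the notebook exercise, here is a minimal sketch of a one-shot SmoothQuant + GPTQ run with LLM Compressor. The model ID, calibration dataset, and parameter values are illustrative assumptions rather than the course's exact configuration, and module paths may differ across llmcompressor releases:

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot

# Recipe: smooth activation outliers first, then apply GPTQ weight
# quantization with a 4-bit weight / 16-bit activation (W4A16) scheme.
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]

# One-shot, post-training quantization using a small calibration set.
oneshot(
    model="meta-llama/Llama-3.1-8B-Instruct",  # illustrative model ID
    dataset="open_platypus",                   # illustrative calibration data
    recipe=recipe,
    output_dir="llama-3.1-8b-w4a16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```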
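Likewise, a minimal Kubeflow Pipelines (kfp v2) skeleton shows how a quantization step might be wrapped as a pipeline component. The component name, base image, and paths here are hypothetical placeholders, not the course's actual pipeline:

```python
from kfp import compiler, dsl

@dsl.component(base_image="python:3.11")  # illustrative base image
def quantize_model(model_id: str, output_dir: str) -> str:
    # In a real pipeline this step would install llmcompressor and
    # call oneshot() as in the notebook sketch above.
    print(f"Quantizing {model_id} into {output_dir}")
    return output_dir

@dsl.pipeline(name="llm-quantization-pipeline")
def quantization_pipeline(model_id: str = "meta-llama/Llama-3.1-8B-Instruct"):
    quantize_model(model_id=model_id, output_dir="/models/quantized")

# Compile to YAML, which can be uploaded to the pipeline server on OpenShift AI.
compiler.Compiler().compile(quantization_pipeline, "quantization_pipeline.yaml")
```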
This course assumes that you have the following prior experience:
- Foundational knowledge of Large Language Models and model serving concepts.
- Familiarity with using the OpenShift command line (oc) and navigating the OpenShift AI dashboard.
- Access to a Red Hat OpenShift AI cluster with an available GPU node and a configured pipeline server.
- Basic experience working within a Jupyter Notebook environment.