
Optimizing vLLM Performance

Introduction

Course Title: Optimizing vLLM Performance

Description: This hands-on course is a practical guide to tuning the vLLM engine for maximum efficiency on Red Hat OpenShift AI. You will learn to move beyond default settings and systematically optimize a deployed Large Language Model for a real-world chat application scenario. The course covers establishing a performance baseline with GuideLLM, iteratively tuning core engine parameters, and quantifying the performance gains achievable through model quantization.
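As a preview of the baseline step, the sketch below runs a GuideLLM benchmark against an OpenAI-compatible vLLM endpoint. The target route and workload shape are placeholder assumptions, and GuideLLM options vary between releases, so confirm the flags with guidellm benchmark --help before relying on them.

```
# Hedged sketch: establish a baseline against a deployed vLLM endpoint.
# The target URL and token counts are placeholders; GuideLLM flags
# differ between releases, so verify them for your installed version.
guidellm benchmark \
  --target "https://llama-chat.apps.example.com" \
  --rate-type sweep \
  --max-seconds 120 \
  --data "prompt_tokens=512,output_tokens=256"
```

Keep the workload shape fixed across runs; a baseline is only meaningful if later tuning experiments are measured against the same request mix.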

Duration: 2 hours


Objectives

On completing this course, you should be able to:

  • Establish a performance baseline for a deployed LLM using the GuideLLM benchmarking pipeline.
  • Systematically tune key vLLM parameters, such as max-model-len and max-num-seqs, and measure their impact on performance (see the serving sketch after this list).
  • Deploy a quantized model and quantitatively compare its latency and throughput against a full-precision model.
  • Apply an iterative optimization methodology (measure, tune, validate) to improve resource utilization and reduce serving costs.
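To make the tuning objective concrete, here is a minimal sketch of those two engine parameters passed as flags to vLLM's OpenAI-compatible server. The model names and values are illustrative assumptions, not recommendations; on OpenShift AI the same flags normally go into the vLLM ServingRuntime's container args rather than a terminal.

```
# Hedged sketch: iterative tuning of vLLM engine parameters.
# Model names and values are assumptions for a chat workload, not
# recommendations; on OpenShift AI these flags usually appear as
# container args in the vLLM ServingRuntime.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --max-model-len 4096 \
  --max-num-seqs 64 \
  --gpu-memory-utilization 0.90

# For the quantization comparison, redeploy with a pre-quantized
# checkpoint (placeholder name) and rerun the identical benchmark:
vllm serve RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16 \
  --max-model-len 4096 \
  --max-num-seqs 64
```

The measure-tune-validate loop in the last objective then amounts to changing one parameter at a time, rerunning the same GuideLLM benchmark, and keeping a change only if latency or throughput actually improves.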

Prerequisites

This course assumes that you have the following prior experience:

  • Foundational knowledge of Large Language Models and vLLM serving concepts.
  • Completion of the "Model Performance Benchmarking with GuideLLM" course or equivalent experience.
  • Familiarity with the OpenShift command-line interface (oc) and with deploying applications using Helm.
  • Access to a Red Hat OpenShift AI cluster with an available GPU node (a quick check follows this list).
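If you are unsure whether the last prerequisite is met, one quick check, assuming the NVIDIA GPU Operator has labeled the cluster's GPU nodes, is:

```
# Assumes NVIDIA GPU Feature Discovery has applied this node label.
oc get nodes -l nvidia.com/gpu.present=true
```

An empty result means no labeled GPU node is available (or your cluster uses a different label).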
