Release GuideLLM v0.4.0 · vllm-project/guidellm

Overview

This release marks a significant milestone with a full architectural refactor of the GuideLLM codebase to improve extensibility, performance, and maintainability. Key highlights include multimodal benchmarking support (vision and audio), a new mock server for testing, and comprehensive updates to output generation and statistics gathering. Additionally, the minimum supported Python version has been bumped to 3.10 to leverage modern language features.

To get started, install with:

pip install guidellm==0.4.0

Or from source with:

pip install git+https://github.com/vllm-project/guidellm.git@v0.4.0

What's New

Multimodal Support: Added comprehensive support for vision and audio workloads, including audio transcription and translation benchmarking.
Full Refactor: Complete restructuring of core packages (backends, scheduler, benchmark, data) to support high-rate load generation and easier extensibility.
Mock Server: Introduced a built-in mock server package to facilitate testing and development without requiring a live LLM backend.
E2E Testing: Added a new End-to-End (E2E) testing workflow with a dedicated vLLM simulator.

What's Changed

Python Requirement: Minimum supported Python version bumped to 3.10 (previously 3.9).
CLI Arguments:
- Renamed --rate-type to --profile for clarity.
- Added support for dashed arguments (e.g., --max-seconds alongside --max_seconds).
- The --rates argument now accepts a comma-separated list of values, which makes sweeps easier to specify.
Container: Updated Docker container to include ffmpeg-free and other utilities for multimodal support.
Data Pipelines: Reworked data pipelines to support complex multimodal datasets and better error propagation for HuggingFace loading.
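
The dashed-argument and comma-separated list behavior described under CLI Arguments can be illustrated with a small standalone sketch. This is not GuideLLM's actual CLI code: it uses the standard library's argparse, and the names build_parser, parse_rates, and bench-sketch are hypothetical, chosen only to show how one option can accept both spellings and how a comma-separated value can be split into numbers.

```python
import argparse


def parse_rates(value: str) -> list[float]:
    """Split a comma-separated string such as "1,2.5,10" into floats."""
    # Empty fragments (for example from a trailing comma) are skipped.
    return [float(part) for part in value.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration; not GuideLLM's real CLI definition.
    parser = argparse.ArgumentParser(prog="bench-sketch")
    # Both spellings are registered as aliases of the same destination.
    parser.add_argument(
        "--max-seconds", "--max_seconds", dest="max_seconds", type=float
    )
    # The converter runs on the raw string, so the parsed value is a list.
    parser.add_argument("--rates", type=parse_rates, default=[])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--max-seconds", "30", "--rates", "1,2.5,10"])
    print(args.max_seconds, args.rates)
```

Registering both option strings on one argument keeps a single destination, so downstream code does not need to know which spelling the user typed.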

What's Fixed

Synthetic Data: Fixed an issue where synthetic text datasets would lose randomness across benchmarks in the same session.
CSV Generation: Resolved failures in CSV output generation during benchmarks.
Asyncio Stability: Fixed various asyncio and timezone-related issues in tests and schedulers.
Type Safety: Extensive type fixes and improvements across the codebase, particularly in the scheduler and utils packages.

Compatibility Notes

Python: 3.10 – 3.13
OS: Linux and macOS
Dependencies:
- Added torchcodec
- Removed librosa, pydub, soundfile
- Development workflow now uses pdm and tox-pdm
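
To check an environment against the Python range listed above before installing, a short script along these lines may help. It is only a sketch: the constants mirror the 3.10 to 3.13 range stated in these notes, and the helper name is_supported is hypothetical rather than part of GuideLLM.

```python
import sys

# Range taken from the compatibility notes: Python 3.10 through 3.13.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def is_supported(version: tuple[int, int]) -> bool:
    """Return True when a (major, minor) pair falls inside the stated range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


if __name__ == "__main__":
    current = sys.version_info[:2]
    label = f"Python {current[0]}.{current[1]}"
    if is_supported(current):
        print(f"{label} is within the listed range")
    else:
        print(f"{label} is outside the listed range")
```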

New Contributors

@shijinye made their first contribution in PR #327
@git-jxj made their first contribution in PR #435
@AlonKellner-RedHat made their first contribution in PR #440

Changelog

Refactor & Core Architecture

PR #351: Full refactor of GuideLLM
PR #354: Scheduler package updates, rewrites, and tests expansion
PR #355: Backend package updates, rewrites, and tests expansion
PR #356: Benchmark package updates and rewrites
PR #357: Mock server package creation
PR #364: Core reintroduction of changes from main

Multimodal & Data

PR #384: Data pipelines rework and multimodal support
PR #419: Split multimodal group into vision and audio
PR #411: Replace librosa, pydub, and soundfile with torchcodec
PR #412: Fixes for constant rate and audio flows
PR #463: Ensure synthetic text datasets remain random across benchmarks

Features & Enhancements

PR #378: Complete CSV output
PR #432: Better scenario from-file support
PR #441: Support dashed arguments for benchmark args
PR #433: Switch --rates CLI arg to handle a comma separated list of values
PR #382: Advanced Prefix Cache Controls

Infrastructure & Quality

PR #397: Bump minimum python version to 3.10
PR #440: Basic E2E tests
PR #420: Adapt container for new optional requirements
PR #415: Add tox command to update lock file
PR #442: Updates and Fixes for benchmark outputs, schemas, and stats calculations

Bug Fixes

PR #435: Resolve CSV output generation failure in benchmarks
PR #413: Propagate valid failures from HuggingFace datasets loading
PR #405: Fixes for asyncio and timezone tests
PR #376: Edge case errors
PR #449: Fix failing settings tests
