GuideLLM v0.4.0

@markurtz markurtz released this 21 Nov 22:29

Overview

This release marks a significant milestone with a full architectural refactor of the GuideLLM codebase to improve extensibility, performance, and maintainability. Key highlights include multimodal benchmarking support (vision and audio), a new mock server for testing, and comprehensive updates to output generation and statistics gathering. Additionally, the minimum supported Python version has been bumped to 3.10 to leverage modern language features.

To get started, install with:

pip install guidellm==0.4.0

Or from source with:

pip install git+https://github.com/vllm-project/[email protected]
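
After installing, it can be useful to confirm which version actually landed in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of GuideLLM.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("guidellm")
    if found is None:
        print("guidellm is not installed in this environment")
    else:
        print(f"guidellm {found} is installed")
```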

What's New

  • Multimodal Support: Added comprehensive support for vision and audio workloads, including audio transcription and translation benchmarking.
  • Full Refactor: Complete restructuring of core packages (backends, scheduler, benchmark, data) to support high-rate load generation and easier extensibility.
  • Mock Server: Introduced a built-in mock server package to facilitate testing and development without requiring a live LLM backend.
  • E2E Testing: Added a new End-to-End (E2E) testing workflow with a dedicated vLLM simulator.

What's Changed

  • Python Requirement: Minimum supported Python version bumped to 3.10 (previously 3.9).
  • CLI Arguments:
    • Renamed --rate-type to --profile for clarity.
    • Added support for dashed arguments (e.g., --max-seconds alongside max_seconds).
    • --rates argument now supports comma-separated lists for easier sweeping.
  • Container: Updated Docker container to include ffmpeg-free and other utilities for multimodal support.
  • Data Pipelines: Reworked data pipelines to support complex multimodal datasets and better error propagation for HuggingFace loading.
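
To illustrate the two CLI behaviors described above (accepting both dashed and underscored spellings, and parsing a comma-separated list of rates), here is a rough sketch built on the standard `argparse` module. It is not GuideLLM's implementation: the parser name `bench-sketch` and the helpers `parse_rates` and `build_parser` are hypothetical, and the real CLI may validate its inputs differently.

```python
import argparse
from typing import List


def parse_rates(text: str) -> List[float]:
    """Turn '1,2.5,10' into [1.0, 2.5, 10.0]; reject empty or non-numeric items."""
    items = [item.strip() for item in text.split(",")]
    if any(item == "" for item in items):
        raise argparse.ArgumentTypeError(f"empty value in rate list: {text!r}")
    try:
        return [float(item) for item in items]
    except ValueError as exc:
        raise argparse.ArgumentTypeError(f"invalid rate in {text!r}") from exc


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser, not the GuideLLM CLI itself.
    parser = argparse.ArgumentParser(prog="bench-sketch")
    # Both spellings are registered and write to the same destination.
    parser.add_argument(
        "--max-seconds", "--max_seconds", dest="max_seconds", type=float
    )
    # A single string such as "1,2.5,10" becomes a list of floats.
    parser.add_argument("--rates", type=parse_rates, default=[])
    # Replaces the older --rate-type spelling mentioned in the notes.
    parser.add_argument("--profile", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--max_seconds", "30", "--rates", "1,2.5,10", "--profile", "constant"]
    )
    print(args.max_seconds, args.rates, args.profile)
```

Registering both option strings on one argument keeps a single destination, so downstream code never needs to know which spelling the user typed.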

What's Fixed

  • Synthetic Data: Fixed an issue where synthetic text datasets would lose randomness across benchmarks in the same session.
  • CSV Generation: Resolved failures in CSV output generation during benchmarks.
  • Asyncio Stability: Fixed various asyncio and timezone-related issues in tests and schedulers.
  • Type Safety: Extensive type fixes and improvements across the codebase, particularly in the scheduler and utils packages.

Compatibility Notes

  • Python: 3.10 – 3.13
  • OS: Linux and macOS
  • Dependencies:
    • Added torchcodec
    • Removed librosa, pydub, and soundfile
    • Development workflow now uses pdm and tox-pdm
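
As a quick pre-install check, the Python range listed above can be encoded in a few lines. This is only a sketch of the range stated in these notes; the constants and the `is_supported` helper are hypothetical, and the package's own metadata remains the authoritative source.

```python
import sys
from typing import Tuple

# Range taken from the compatibility notes above (inclusive on both ends).
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) pair falls inside the stated range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    status = "within" if is_supported(current) else "outside"
    print(f"Python {current[0]}.{current[1]} is {status} the 3.10 to 3.13 range")
```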

Changelog

Refactor & Core Architecture

  • PR #351: Full refactor of GuideLLM
  • PR #354: Scheduler package updates, rewrites, and tests expansion
  • PR #355: Backend package updates, rewrites, and tests expansion
  • PR #356: Benchmark package updates and rewrites
  • PR #357: Mock server package creation
  • PR #364: Core reintroduction of changes from main

Multimodal & Data

  • PR #384: Data pipelines rework and multimodal support
  • PR #419: Split multimodal group into vision and audio
  • PR #411: Replace librosa, pydub, and soundfile with torchcodec
  • PR #412: Fixes for constant rate and audio flows
  • PR #463: Ensure synthetic text datasets remain random across benchmarks

Features & Enhancements

  • PR #378: Complete CSV output
  • PR #432: Better scenario from-file support
  • PR #441: Support dashed arguments for benchmark args
  • PR #433: Switch --rates CLI arg to handle a comma separated list of values
  • PR #382: Advanced Prefix Cache Controls

Infrastructure & Quality

  • PR #397: Bump minimum python version to 3.10
  • PR #440: Basic E2E tests
  • PR #420: Adapt container for new optional requirements
  • PR #415: Add tox command to update lock file
  • PR #442: Updates and Fixes for benchmark outputs, schemas, and stats calculations

Bug Fixes

  • PR #435: Resolve CSV output generation failure in benchmarks
  • PR #413: Propagate valid failures from HuggingFace datasets loading
  • PR #405: Fixes for asyncio and timezone tests
  • PR #376: Edge case errors
  • PR #449: Fix failing settings tests