GuideLLM v0.3.0

@markurtz released this 16 Sep 15:20 · commit ad9513f · 18 commits to main since this release

Overview

A major release (in scope, not in the semantic-versioning sense) that introduces the GuideLLM web UI, containerized benchmarking, dataset preprocessing, and significant workflow improvements. It also moves the project from the Neural Magic organization into the vLLM project ecosystem while expanding benchmarking capabilities and improving the developer experience.

To get started, install with:

pip install guidellm==0.3.0

Or from source with:

pip install git+https://github.com/vllm-project/guidellm.git@v0.3.0

What's New

  • GuideLLM Web UI: Complete frontend interface with interactive charts and data visualization for benchmark results
  • Dataset Preprocessing: New preprocess command to filter datasets by token distribution and save to local files or Hugging Face Hub
  • Containerized Benchmarking: Docker support with configurable environment variables for streamlined deployment
  • Benchmark Scenarios: Support for file-based benchmark configuration with Pydantic validation
  • HTML Report Generation: Static HTML reports with embedded visualization data
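
To illustrate the idea behind the Dataset Preprocessing item above, here is a minimal sketch of filtering prompts by token count. It is not the `preprocess` command's implementation: the function names are hypothetical, and a whitespace split stands in for a real tokenizer.

```python
from typing import Callable, Iterable, List


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def filter_by_token_count(
    prompts: Iterable[str],
    min_tokens: int,
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Keep only prompts whose token count lies in [min_tokens, max_tokens]."""
    return [p for p in prompts if min_tokens <= count_tokens(p) <= max_tokens]


# Hypothetical sample data; a real run would read prompts from a dataset.
prompts = ["hi", "summarize this short report please", "a b c"]
kept = filter_by_token_count(prompts, min_tokens=3, max_tokens=5)
```

Passing a different `count_tokens` callable would let the same filter run against a model-specific tokenizer.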

What's Changed

  • Project Migration: Transitioned from neuralmagic to vllm-project GitHub organization with updated links and branding
  • Improved Scheduling: Unified RPS and concurrent scheduler paths for better multi-turn conversation support
  • Enhanced OpenAI Backend: Added support for custom headers, SSL verification control, query parameters, and request body modifications
  • Development Workflow: Streamlined CI/CD with unified test execution, pre-commit improvements, and artifact management
  • Synthetic Data Generator: Added prefix caching controls and unique prompt generation
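
As a rough illustration of the request adjustments listed under Enhanced OpenAI Backend, the sketch below merges extra headers and query parameters into a request and drops selected keys from the body. `prepare_request` and its arguments are hypothetical helpers written for this example, not GuideLLM's API; only the standard library is used.

```python
from typing import Any, Dict, Iterable, Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def prepare_request(
    url: str,
    body: Dict[str, Any],
    extra_headers: Optional[Dict[str, str]] = None,
    extra_query: Optional[Dict[str, str]] = None,
    remove_from_body: Optional[Iterable[str]] = None,
) -> Tuple[str, Dict[str, str], Dict[str, Any]]:
    """Return (url, headers, body) with the requested modifications applied."""
    # Start from a default header and let caller-supplied headers override it.
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})

    # Merge extra query parameters into whatever the URL already carries.
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(extra_query or {})
    new_url = urlunsplit(parts._replace(query=urlencode(query)))

    # Drop body keys the target server should not receive.
    drop = set(remove_from_body or ())
    new_body = {k: v for k, v in body.items() if k not in drop}
    return new_url, headers, new_body


# Hypothetical endpoint and payload, for illustration only.
url, headers, body = prepare_request(
    "http://localhost:8000/v1/chat/completions",
    {"model": "m", "max_tokens": 16, "stream": True},
    extra_headers={"Authorization": "Bearer TOKEN"},
    extra_query={"api-version": "1"},
    remove_from_body=["max_tokens"],
)
```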

What's Fixed

  • Metric Calculation: Fixed double-counting issues in token calculations and concurrency change events
  • Event Loop Errors: Resolved "Event loop Closed" errors in HTTP client connection pooling
  • Token Counting: Fixed max token limits in synthetic data generator and first decode token counting
  • Display Issues: Corrected metric units display and Firefox compatibility for web UI

Compatibility Notes

  • Python: 3.9–3.13
  • OS: Linux and macOS
  • Dependencies: Updated to latest Pydantic, locked Click to support Python 3.9
  • Breaking: Removed several UI workflow components and husky pre-commit hooks
  • Breaking: Updated project URLs from neuralmagic to vllm-project organization
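
A script that depends on this release could guard against unsupported interpreters with a check like the one below. This is a small sketch based only on the Python range stated above; the constant and function names are illustrative.

```python
import sys

# Supported range taken from the compatibility notes: Python 3.9 through 3.13.
MIN_PYTHON = (3, 9)
MAX_PYTHON = (3, 13)


def is_supported_python(version=None) -> bool:
    """Return True if the (major, minor) version falls in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


if not is_supported_python():
    print("Warning: this Python version is outside the 3.9-3.13 range.")
```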

New Contributors

Changelog

Major Features

  • #169: Implement complete GuideLLM UI with interactive charts and Redux state management
  • #162: Add dataset preprocessing command with Hugging Face integration
  • #123: Add containerized benchmarking support with Docker configuration
  • #99: Add support for benchmark scenarios with Pydantic validation
  • #218: Implement HTML output generation with embedded data

Infrastructure & Workflows

  • #233: Unify RPS and concurrent scheduler paths for improved performance
  • #215: Complete UI build pipeline and GitHub Pages workflows
  • #231: Migrate project from neuralmagic to vllm-project organization
  • #190: Add container build jobs to all workflows

Backend Improvements

  • #230: Add CLI options for custom headers and SSL verification
  • #146: Allow extra query parameters for OpenAI server requests
  • #184: Add remove_from_body parameter to OpenAIHTTPBackend
  • #183: Add prefix caching controls to synthetic dataset generator

Bug Fixes & Quality

  • #266: Fix metric accumulation errors at extreme concurrency changes
  • #188: Fix "Event loop Closed" error in HTTP client pooling
  • #173: Fix double counting of tokens and warmup percentage calculation
  • #170: Fix max token limits in synthetic data generator

Developer Experience

  • #240: Add --version flag to guidellm CLI
  • #185: Add option to re-display benchmark files
  • #181: Improve pre-commit usability for local development
  • #239: Various tooling fixes including dependency groups and pylock