Toolkit for automatic tuning and benchmarking of LLM serving configurations.
Warning
Still a work in progress; expect failures. Please check TODO.md for work-in-progress items.
This project requires inference-benchmarker. Install it using:
```shell
cargo install --git https://github.com/juanjucm/inference-benchmarker
```

First, you need to set up your environment with uv:

```shell
uv venv --python 3.12
source .venv/bin/activate
```

Install dependencies with `-e` for editable (dev) mode:

```shell
uv pip install -e .
```

This module provides a way to automatically detect the best LLM serving configuration that maximises throughput while staying compliant with a set of defined goodput criteria.
To run the script, make sure to provide a valid config yaml. Take a look at auto-tune-config.yaml to check the format and expected parameters.
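Once the config is in place, an invocation might look like the following sketch (the config filename and results directory are illustrative; only the documented flags are used):

```shell
# Run the auto-tuner against a local config and keep results in a local directory.
# "auto-tune-config.yaml" and "results/" are illustrative names, not required paths.
uv run auto-tune --config auto-tune-config.yaml --result-dir results/
```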
```
usage: uv run auto-tune [-h] [--config <config.yaml>] [--result-dir <result_dir>] [--dataset-id <dataset_id>] [--hf-token <hf_token>]

Auto-tune tool for finding optimal engine parameters.

options:
  -h, --help    show this help message and exit
  --config      Path to auto-tune configuration file
  --result-dir  (optional) Directory to save tuning results
  --dataset-id  (optional) Hugging Face dataset to dump results to
  --hf-token    (optional) Hugging Face token to use for accessing models and datasets.
```

This tool lets you easily define and launch benchmarking scenarios for a set of defined LLM runtimes with specified parameters.
To run the script, make sure to provide a valid config yaml. Take a look at bench_config.yaml to check the format and expected parameters.
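For example, benchmarking only a subset of the configured scenarios and engines might look like this (scenario and engine names, as well as the save directory, are illustrative placeholders):

```shell
# Run selected scenarios against one engine, streaming container logs,
# and save results locally. "s1", "s2", "e1", and "results/" are placeholders.
uv run multi-benchmarker --config bench_config.yaml \
  --scenarios "s1,s2" \
  --engines "e1" \
  --show-logs \
  --save-dir results/
```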
```
usage: uv run multi-benchmarker [-h] [--config CONFIG] [--scenarios SCENARIOS] [--engines ENGINES] [--show-logs] [--save-dir SAVE_DIR]

Launch benchmarks based on a configuration file

options:
  -h, --help            show this help message and exit
  --config CONFIG       Path to benchmark configuration file
  --scenarios SCENARIOS
                        Specific scenarios to run, comma-separated (e.g. "s1,s2,s3"); if not specified, runs all scenarios
  --engines ENGINES     Specific engines to test, comma-separated (e.g. "e1,e2,e3"); if not specified, tests all engines
  --save-dir SAVE_DIR   Directory to save benchmark results
  --show-logs           Show engine container logs.
```

This tool launches a dashboard for visualizing benchmarking results.
```
Usage: dashboard [OPTIONS]

Options:
  --from-results-dir TEXT  Load inference-benchmarker results from a directory
  --datasource TEXT        Load a Parquet file already generated
  --port INTEGER           Port to run the dashboard
  --help                   Show this message and exit.
```
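For example, pointing the dashboard at the output of a previous benchmark run might look like this (assuming the dashboard is exposed as a uv script like the other tools; the directory and port are illustrative):

```shell
# Load inference-benchmarker results from a local directory and serve
# the dashboard on a chosen port. "results/" and 7860 are placeholders.
uv run dashboard --from-results-dir results/ --port 7860
```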