
Commit 86d0147

Merge branch 'main' into remove-pyhumps

2 parents: cd0d445 + 1261fe8
31 files changed: +1309 −987 lines
Lines changed: 27 additions & 0 deletions

```diff
@@ -0,0 +1,27 @@
+---
+name: Bug report
+about: Create a report to help us improve
+labels: bug
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Environment**
+Include all relevant environment information:
+1. OS [e.g. Ubuntu 20.04]:
+2. Python version [e.g. 3.12.2]:
+
+**To Reproduce**
+Exact steps to reproduce the behavior:
+
+
+**Errors**
+If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
+
+**Additional context**
+Add any other context about the problem here. Also include any relevant files.
```

.github/ISSUE_TEMPLATE/doc-edit.md

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+---
+name: Doc edit
+about: Propose changes to project documentation
+labels: documentation
+
+---
+
+**What is the URL, file, or UI containing the proposed doc change**
+Where does one find the original content or where would this change go?
+
+**What is the current content or situation in question**
+Copy/paste the source content or describe the gap.
+
+**What is the proposed change**
+Add new content.
+
+**Additional context**
+Add any other context about the change here. Also include any relevant files or URLs.
```
Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+labels: enhancement
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
```

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 38 additions & 0 deletions

```diff
@@ -0,0 +1,38 @@
+## Summary
+
+<!--
+Include a short paragraph of the changes introduced in this PR.
+If this PR requires additional context or rationale, explain why
+the changes are necessary.
+-->
+
+## Details
+
+<!--
+Provide a detailed list of all changes introduced in this pull request.
+-->
+- [ ]
+
+## Test Plan
+
+<!--
+List the steps needed to test this PR.
+-->
+-
+
+## Related Issues
+
+<!--
+Link any relevant issues that this PR addresses.
+-->
+- Resolves #
+
+---
+
+- [ ] "I certify that all code in this PR is my own, except as noted below."
+
+## Use of AI
+
+- [ ] Includes AI-assisted code completion
+- [ ] Includes code generated by an AI application
+- [ ] Includes AI-generated tests (NOTE: AI-written tests should have a docstring that includes `## WRITTEN BY AI ##`)
```

.github/workflows/development.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -124,7 +124,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit

   ui-unit-tests:
     permissions:
```
.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -125,7 +125,7 @@ jobs:
       - name: Install dependencies
         run: pip install tox
       - name: Run unit tests
-        run: tox -e test-unit -- -m "smoke or sanity"
+        run: tox -e test-unit

   ui-unit-tests:
     permissions:
```
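Both workflow edits drop the `-m "smoke or sanity"` marker filter so CI runs the full unit-test suite. To reproduce the updated CI step locally, the same commands shown in the workflow apply unchanged:

```bash
# Install the test runner, then run the full unit-test suite,
# exactly as the updated CI step does
pip install tox
tox -e test-unit
```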

README.md

Lines changed: 28 additions & 9 deletions

````diff
@@ -52,6 +52,25 @@ pip install git+https://github.com/vllm-project/guidellm.git

 For detailed installation instructions and requirements, see the [Installation Guide](https://github.com/vllm-project/guidellm/blob/main/docs/install.md).

+### With Podman / Docker
+
+Alternatively, we publish container images at [ghcr.io/vllm-project/guidellm](https://github.com/vllm-project/guidellm/pkgs/container/guidellm). Running a container is (by default) equivalent to `guidellm benchmark run`:
+
+```bash
+podman run \
+  --rm -it \
+  -v "./results:/results:rw" \
+  -e GUIDELLM_TARGET=http://localhost:8000 \
+  -e GUIDELLM_RATE_TYPE=sweep \
+  -e GUIDELLM_MAX_SECONDS=30 \
+  -e GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
+  ghcr.io/vllm-project/guidellm:latest
+```
+
+> [!TIP] CLI options can also be specified as ENV variables (e.g. `--rate-type sweep` -> `GUIDELLM_RATE_TYPE=sweep`). If both are specified, the CLI option overrides the ENV variable.
+
+Replace `latest` with `stable` for the newest tagged release, or pin a specific release if desired.
+
 ### Quick Start

 #### 1. Start an OpenAI Compatible Server (vLLM)
````
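The `GUIDELLM_*` variables in the container example map one-to-one onto CLI flags, so the same benchmark can be expressed as a direct invocation when guidellm is installed locally. A minimal sketch, with the flag values copied from the container example above:

```bash
# Local equivalent of the containerized benchmark above; assumes guidellm
# is installed and a server is reachable at the target URL
guidellm benchmark run \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128"
```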
````diff
@@ -159,11 +178,19 @@ GuideLLM UI is a companion frontend for visualizing the results of a GuideLLM be

 ### 🛠 Generating an HTML report with a benchmark run

+For either pathway below, you'll need to set the output path to `benchmarks.html` for your run:
+
+```bash
+--output-path=benchmarks.html
+```
+
+Alternatively, load a saved run using the `from-file` command and also set the output to `benchmarks.html`.
+
 1. Use the Hosted Build (Recommended for Most Users)

 This is preconfigured. The latest stable version of the hosted UI (https://blog.vllm.ai/guidellm/ui/latest) will be used to build the local html file.

-Open benchmarks.html in your browser and you're done—no setup required.
+Execute your run, then open benchmarks.html in your browser and you're done—no further setup required.

 2. Build and Serve the UI Locally (For Development) This option is useful if:

@@ -187,14 +214,6 @@ export GUIDELLM__ENV=local

 Then you can execute your run.

-Set the output to benchmarks.html for your run:
-
-```bash
---output-path=benchmarks.html
-```
-
-Alternatively load a saved run using the from-file command and also set the output to benchmarks.html
-
 ## Resources

 ### Documentation
````
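As a concrete sketch of the `from-file` pathway the added README text describes (the saved-run path `benchmarks.json` is hypothetical; substitute whatever file your run produced):

```bash
# Re-render a previously saved benchmark run as a standalone HTML report
guidellm benchmark from-file benchmarks.json --output-path=benchmarks.html
```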

deploy/Containerfile

Lines changed: 16 additions & 22 deletions

```diff
@@ -1,26 +1,26 @@
-ARG PYTHON=3.13
+ARG BASE_IMAGE=docker.io/python:3.13-slim

 # Use a multi-stage build to create a lightweight production image
-FROM docker.io/python:${PYTHON}-slim as builder
+FROM $BASE_IMAGE as builder
+
+# Ensure files are installed as root
+USER root

 # Copy repository files
-COPY / /src
+COPY / /opt/app-root/src

 # Create a venv and install guidellm
-RUN python3 -m venv /opt/guidellm \
-    && /opt/guidellm/bin/pip install --no-cache-dir /src
-
-# Copy entrypoint script into the venv bin directory
-RUN install -m0755 /src/deploy/entrypoint.sh /opt/guidellm/bin/entrypoint.sh
+RUN python3 -m venv /opt/app-root/guidellm \
+    && /opt/app-root/guidellm/bin/pip install --no-cache-dir /opt/app-root/src

 # Prod image
-FROM docker.io/python:${PYTHON}-slim
+FROM $BASE_IMAGE

 # Copy the virtual environment from the builder stage
-COPY --from=builder /opt/guidellm /opt/app-root/guidellm
+COPY --from=builder /opt/app-root/guidellm /opt/app-root/guidellm

 # Add guidellm bin to PATH
-ENV PATH="/opt/guidellm/bin:$PATH"
+ENV PATH="/opt/app-root/guidellm/bin:$PATH"

 # Create a non-root user
 RUN useradd -md /results guidellm
@@ -35,14 +35,8 @@ WORKDIR /results
 LABEL org.opencontainers.image.source="https://github.com/vllm-project/guidellm" \
       org.opencontainers.image.description="GuideLLM Performance Benchmarking Container"

-# Set the environment variable for the benchmark script
-# TODO: Replace with scenario environment variables
-ENV GUIDELLM_TARGET="http://localhost:8000" \
-    GUIDELLM_MODEL="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16" \
-    GUIDELLM_RATE_TYPE="sweep" \
-    GUIDELLM_DATA="prompt_tokens=256,output_tokens=128" \
-    GUIDELLM_MAX_REQUESTS="100" \
-    GUIDELLM_MAX_SECONDS="" \
-    GUIDELLM_OUTPUT_PATH="/results/results.json"
-
-ENTRYPOINT [ "/opt/guidellm/bin/entrypoint.sh" ]
+# Argument defaults can be set with GUIDELLM_<ARG>
+ENV GUIDELLM_OUTPUT_PATH="/results/benchmarks.json"
+
+ENTRYPOINT [ "/opt/app-root/guidellm/bin/guidellm" ]
+CMD [ "benchmark", "run" ]
```
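With the entrypoint now the `guidellm` binary itself and `benchmark run` as the default CMD, any subcommand can be invoked by appending arguments after the image name, which replace the CMD. A minimal sketch (the target URL and options are illustrative, not image defaults):

```bash
# Override the default "benchmark run" CMD with explicit options;
# everything after the image name is passed to the guidellm entrypoint
podman run --rm -it \
  -v "./results:/results:rw" \
  ghcr.io/vllm-project/guidellm:latest \
  benchmark run --target "http://localhost:8000" --rate-type sweep --max-seconds 30
```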

deploy/entrypoint.sh

Lines changed: 0 additions & 43 deletions
This file was deleted.

docs/guides/cli.md

Lines changed: 36 additions & 1 deletion

```diff
@@ -1 +1,36 @@
-# Coming Soon
+# CLI Reference
+
+This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).
+
+## `guidellm benchmark run`
+
+This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.
+
+### Scenario Configuration
+
+| Option | Description |
+| --- | --- |
+| `--scenario <PATH or NAME>` | The name of a builtin scenario or path to a scenario configuration file. Options specified on the command line will override the scenario file. |
+
+### Target and Backend Configuration
+
+These options configure how `guidellm` connects to the system under test.
+
+| Option | Description |
+| --- | --- |
+| `--target <URL>` | **Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
+| `--backend-type <TYPE>` | The type of backend to use. Defaults to `openai_http`. |
+| `--backend-args <JSON>` | A JSON string for backend-specific arguments. For example: `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` to pass custom headers and disable certificate verification. |
+| `--model <NAME>` | The ID of the model to benchmark within the backend. |
+
+### Data and Request Configuration
+
+These options define the data to be used for benchmarking and how requests will be generated.
+
+| Option | Description |
+| --- | --- |
+| `--data <SOURCE>` | The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
+| `--rate-type <TYPE>` | The type of request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
+| `--rate <NUMBER>` | The rate of requests per second for `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
+| `--max-requests <NUMBER>` | The maximum number of requests to run for each benchmark. |
+| `--max-seconds <NUMBER>` | The maximum number of seconds to run each benchmark for. |
```
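Taken together, a hypothetical invocation combining the options documented in the new reference might look like this (the target URL, model ID, and limits are illustrative placeholders, not defaults):

```bash
# Sweep benchmark against a local OpenAI-compatible server;
# model name and limit values below are placeholders
guidellm benchmark run \
  --target "http://localhost:8000" \
  --model "my-model" \
  --data "prompt_tokens=256,output_tokens=128" \
  --rate-type sweep \
  --max-seconds 60 \
  --max-requests 100
```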
