2 changes: 2 additions & 0 deletions unsloth_zoo/vllm_utils.py
@@ -1640,6 +1641,7 @@ def load_vllm(
gpu_memory_utilization : float = 0.8,
max_seq_length : int = 8192,
dtype : torch.dtype = None,
revision : str = None,
training : bool = True,
Comment on lines 1641 to 1644

P2: Keep positional-arg compatibility for load_vllm

Inserting revision between dtype and training changes the positional argument order, so any external caller that passes arguments positionally after dtype will now pass its training boolean into revision, shifting every later parameter. That silently alters behavior (training falls back to its default of True, float8_kv_cache stays False, and so on) and can produce wrong runtime settings. To avoid a backward-compatibility regression, add revision at the end of the signature or make the remaining parameters keyword-only.

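The shift the comment describes can be sketched with toy stand-ins for the signature (these functions are hypothetical illustrations, not the real load_vllm; they only mirror its parameter order):

```python
# Old signature: training follows dtype directly.
def load_vllm_old(model_name, gpu_memory_utilization=0.8, max_seq_length=8192,
                  dtype=None, training=True, float8_kv_cache=False):
    return {"training": training, "float8_kv_cache": float8_kv_cache}

# New signature from this PR: revision inserted between dtype and training.
def load_vllm_new(model_name, gpu_memory_utilization=0.8, max_seq_length=8192,
                  dtype=None, revision=None, training=True, float8_kv_cache=False):
    return {"revision": revision, "training": training,
            "float8_kv_cache": float8_kv_cache}

# A caller written against the old signature, passing everything positionally:
old = load_vllm_old("model", 0.8, 8192, None, False, True)
# -> training=False, float8_kv_cache=True, as the caller intended.

new = load_vllm_new("model", 0.8, 8192, None, False, True)
# -> revision=False (!), training=True, float8_kv_cache=False:
#    every argument after dtype silently shifted one slot.

# One safe option: make the trailing parameters keyword-only with a bare *,
# so stale positional calls fail loudly instead of shifting silently.
def load_vllm_safe(model_name, gpu_memory_utilization=0.8, max_seq_length=8192,
                   dtype=None, *, revision=None, training=True,
                   float8_kv_cache=False):
    return {"revision": revision, "training": training}
```

With the keyword-only variant, the stale call `load_vllm_safe("model", 0.8, 8192, None, False, True)` raises a TypeError at the call site rather than misassigning revision, which matches the comment's suggested fix.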

float8_kv_cache : bool = False,
random_state : int = 0,
@@ -2013,6 +2014,7 @@ def load_vllm(

engine_args = dict(
model = model_name,
revision = revision,
gpu_memory_utilization = actual_gpu_memory_utilization,
max_model_len = max_seq_length,
quantization = "bitsandbytes" if use_bitsandbytes else None,