Commit 098b9ff

[NVIDIA#9147][feat] AutoDeploy: Draft Target Speculative Decoding (NVIDIA#9275)
Signed-off-by: Govind Ramnarayan <[email protected]>
1 parent a1964bc commit 098b9ff

9 files changed: +750 -197 lines changed

tensorrt_llm/_torch/auto_deploy/llm_args.py

Lines changed: 5 additions & 0 deletions
@@ -185,6 +185,11 @@ class AutoDeployConfig(DynamicYamlMixInForSettings, BaseSettings):
         ),
     )
 
+    draft_checkpoint_loader: Optional[object] = Field(
+        default=None,
+        description="The checkpoint loader to use for the draft model when using speculative decoding with two models.",
+    )
+
     ### SEQUENCE INTERFACE CONFIG ##################################################################
     max_input_len: int = Field(default=1024, description="The maximum input length.")
     max_num_tokens: Optional[int] = Field(default=None, description="The maximum number of tokens.")
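
The added field is optional and defaults to `None`, so existing configs are unaffected unless a draft loader is supplied. A minimal sketch of that behavior follows. It uses a plain standard-library dataclass as a stand-in for the real pydantic-based `AutoDeployConfig`; `SketchConfig`, `DummyDraftLoader`, and the checkpoint path are hypothetical names used only for illustration and are not part of this commit.

```python
from dataclasses import dataclass
from typing import Optional


class DummyDraftLoader:
    """Hypothetical stand-in for a draft-model checkpoint loader."""

    def __init__(self, checkpoint_dir: str) -> None:
        self.checkpoint_dir = checkpoint_dir


@dataclass
class SketchConfig:
    """Simplified stand-in for AutoDeployConfig (assumption: a dataclass, not pydantic)."""

    max_input_len: int = 1024
    max_num_tokens: Optional[int] = None
    # Mirrors the new field: typed as a plain object and unset by default.
    draft_checkpoint_loader: Optional[object] = None


# Without a draft loader, the field stays None.
default_cfg = SketchConfig()

# With two-model speculative decoding, a loader object is passed in.
spec_cfg = SketchConfig(draft_checkpoint_loader=DummyDraftLoader("/path/to/draft"))

print(default_cfg.draft_checkpoint_loader is None)
print(type(spec_cfg.draft_checkpoint_loader).__name__)
```

Because the field is annotated as `Optional[object]`, any loader instance is accepted as far as the annotation goes. How the loader is consumed downstream is not shown in this hunk.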
