Setting adaptive SamplingParams is not allowed when the model is loaded from the torch backend. #5780

@ccys-a11y

Description

System Info

tensorrt-llm 0.20.0

Who can help?

@bobboli
When I run quickstart_advanced.py for Qwen2 and Qwen3, modifying only the sampling_params as follows:
sampling_params = SamplingParams(
    temperature=1.0,
    top_p=0.7,
    max_tokens=args.max_tokens,
    n=32,
)
the error shown in the figure below appears, and setting enable_trtllm_sampler=True does not help either.
[Image: error message "Setting adaptive SamplingParams is not allowed when the model is loaded from the torch backend."]

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Change the sampling_params in quickstart_advanced.py to:
     sampling_params = SamplingParams(
         temperature=1.0,
         top_p=0.7,
         max_tokens=args.max_tokens,
         n=32,
     )
  2. python3 quickstart_advanced.py --model_dir data_path_to_Qwen3-14B
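For reference, top_p=0.7 in the snippet above requests nucleus sampling: at each step the decoder samples only from the smallest set of highest-probability tokens whose cumulative probability reaches 0.7. A minimal plain-Python sketch of that filtering step (the top_p_filter helper here is hypothetical and has no TRT-LLM dependency; it only illustrates the semantics of the parameter):

```python
def top_p_filter(probs, top_p=0.7):
    """Keep the smallest set of tokens (by descending probability) whose
    cumulative probability reaches top_p; zero out and renormalize the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# With top_p=0.7, only the two most likely tokens survive (0.5 + 0.3 >= 0.7);
# their probabilities are renormalized and the tail is zeroed.
filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.7)
```

With temperature=1.0 the raw model distribution is used before this cutoff, and n=32 asks for 32 sampled sequences per prompt, which is what exercises the sampler on the torch backend.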

Expected behavior

Setting adaptive SamplingParams works when the model is loaded from the torch backend.

Actual behavior

[Image: error traceback]

Additional notes

None

Metadata

Labels

  • Decoding/Sampling&lt;NV&gt;: Token sampling algorithms in TRTLLM for text gen (top-k, top-p, beam)
  • Pytorch&lt;NV&gt;: Pytorch backend related issues
  • bug: Something isn't working
  • waiting for feedback
