Closed
Labels
Decoding/Sampling (token sampling algorithms in TRT-LLM for text gen: top-k, top-p, beam), Pytorch (PyTorch backend related issues), bug (something isn't working), waiting for feedback
Description
System Info
tensorrt-llm 0.20.0
Who can help?
@bobboli
When I run quickstart_advanced.py for Qwen2 and Qwen3, modifying only the sampling_params as follows:

    sampling_params = SamplingParams(
        temperature=1.0,
        top_p=0.7,
        max_tokens=args.max_tokens,
        n=32,
    )
errors occur as shown in the figure below, and setting "enable_trtllm_sampler=True" does not fix them.
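For context on what this configuration asks the sampler to do: top_p=0.7 with n=32 requests 32 completions drawn by nucleus (top-p) sampling. Below is a minimal illustrative sketch of nucleus sampling in plain NumPy; it is not TRT-LLM's actual implementation, just the algorithm the failing SamplingParams refer to.

```python
import numpy as np

def top_p_sample(probs, top_p=0.7, n=32, rng=None):
    """Nucleus (top-p) sampling: keep the smallest set of tokens whose
    cumulative probability reaches top_p, renormalize, then draw n samples."""
    rng = rng or np.random.default_rng(0)
    order = np.argsort(probs)[::-1]           # token ids, most probable first
    cumulative = np.cumsum(probs[order])
    # first position where the cumulative mass reaches top_p (inclusive)
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()
    return rng.choice(kept, size=n, p=kept_probs)

probs = np.array([0.5, 0.3, 0.15, 0.05])      # toy next-token distribution
samples = top_p_sample(probs, top_p=0.7, n=32)
# Only tokens 0 and 1 fall inside the 0.7 nucleus (0.5 + 0.3 >= 0.7)
assert set(samples) <= {0, 1}
```

With a working backend sampler, each of the n=32 completions would be generated this way token by token; the bug report is that the PyTorch backend errors out instead.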

Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
- Change the sampling_params in quickstart_advanced.py to:

      sampling_params = SamplingParams(
          temperature=1.0,
          top_p=0.7,
          max_tokens=args.max_tokens,
          n=32,
      )

- Run: python3 quickstart_advanced.py --model_dir data_path_to_Qwen3-14B
Expected behavior
Setting custom SamplingParams should work when the model is loaded with the PyTorch backend.
Actual behavior
Additional notes
None