[TRTLLM-11385][chore] Mark TRTLLMSampler as deprecated and update documentation by Funatiq · Pull Request #11938 · NVIDIA/TensorRT-LLM

Funatiq · 2026-03-05T08:25:22Z

Summary by CodeRabbit

Documentation
- Updated sampling documentation, removing backend-specific content and guidance on sampler selection.
Deprecations
- TRTLLMSampler is deprecated and will be removed in release 1.4. TorchSampler is now the default and exclusive sampling backend for all configurations.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

ixlmar

LGTM

Funatiq · 2026-03-06T11:00:29Z

/bot run

tensorrt-cicd · 2026-03-06T11:05:59Z

PR_Github #38014 [ run ] triggered by Bot. Commit: b04bf0d Link to invocation

tensorrt-cicd · 2026-03-06T12:30:18Z

PR_Github #38014 [ run ] completed with state SUCCESS. Commit: b04bf0d
/LLM/main/L0_MergeRequest_PR pipeline #29445 completed with status: 'FAILURE'

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-06T13:46:41Z

/bot run --stage-list "H100_PCIe-PyTorch-1, H100_PCIe-PyTorch-2, DGX_B200-4_GPUs-PyTorch-Ray-1"

tensorrt-cicd · 2026-03-06T13:52:21Z

PR_Github #38029 [ run ] triggered by Bot. Commit: 5a2cd58 Link to invocation

tensorrt-cicd · 2026-03-06T14:20:52Z

PR_Github #38029 [ run ] completed with state FAILURE. Commit: 5a2cd58
/LLM/main/L0_MergeRequest_PR pipeline #29456 (Partly Tested) completed with status: 'FAILURE'

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-06T15:55:39Z

/bot run --stage-list "DGX_H100-PyTorch-1, DGX_H100-PyTorch-3, DGX_B200-4_GPUs-PyTorch-Ray-1"

tensorrt-cicd · 2026-03-06T16:01:59Z

PR_Github #38051 [ run ] triggered by Bot. Commit: 5a2cd58 Link to invocation

tensorrt-cicd · 2026-03-06T19:44:00Z

PR_Github #38051 [ run ] completed with state SUCCESS. Commit: 5a2cd58
/LLM/main/L0_MergeRequest_PR pipeline #29478 (Partly Tested) completed with status: 'SUCCESS'

Link to invocation

Funatiq · 2026-03-06T20:13:32Z

/bot run

coderabbitai · 2026-03-07T07:44:32Z

📝 Walkthrough

Walkthrough

These changes deprecate TRTLLMSampler functionality by removing automatic beam-search activation while retaining explicit invocation with deprecation warnings. Documentation is generalized to reduce sampler-specific references, and test code is updated to remove reliance on the deprecated sampler type.

Changes

Cohort / File(s)	Summary
Documentation `docs/source/features/sampling.md`	Removed backend-specific sampler references (Torch Sampler and TRTLLM Sampler) and examples. Generalized section title from "LLM API sampling behavior when using Torch Sampler" to "LLM API sampling behavior" and standardized "sampler" capitalization.
Core Library Changes `tensorrt_llm/llmapi/llm_args.py`, `tensorrt_llm/_torch/pyexecutor/_util.py`	Updated sampler_type field status from "beta" to "deprecated" with revised description. Removed conditional path that activated TRTLLMSampler for beam search; now only uses TRTLLMSampler when explicitly set, with deprecation warning logged.
Test Updates `tests/unittest/_torch/modeling/test_modeling_nemotron_h.py`, `tests/unittest/_torch/ray_orchestrator/multi_gpu/test_accuracy_with_allreduce_strategy.py`, `tests/unittest/api_stability/references/llm.yaml`	Removed explicit sampler_type argument from LLM construction and test parametrization. Updated test function signature to remove sampler_type parameter. Updated API stability reference metadata to reflect deprecated status.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	⚠️ Warning	The PR description is empty except for the repository template structure; no actual explanation, test coverage details, or checklist completion are provided by the author.	Add a clear description explaining why TRTLLMSampler is being deprecated, what the migration path is, and list relevant tests that validate the changes (e.g., tests that verify deprecation warnings and sampler behavior).

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title clearly and specifically describes the main change: marking TRTLLMSampler as deprecated and updating documentation, with proper JIRA ticket reference and chore type indicator.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/features/sampling.md`:
- Around line 84-96: The paragraph about FlashInfer should be restricted to the
default/TorchSampler path: update the text so it explicitly states that
FlashInfer optimizations and sorting-free implementations apply to TorchSampler
(or the sampler returned by _util.instantiate_sampler() when it yields a
TorchSampler) and not to other backends like TRTLLMSampler; mention that
TRTLLMSampler remains available until 1.4 and may not use FlashInfer, and keep
the note about disable_flashinfer_sampling scoped to TorchSampler behavior.
Locate references to _util.instantiate_sampler, TorchSampler, and TRTLLMSampler
and adjust wording so the FlashInfer performance notes only apply to
TorchSampler/default sampler.

In `@tensorrt_llm/llmapi/llm_args.py`:
- Around line 3014-3019: The Field declaration for sampler_type currently marks
the entire parameter deprecated via status="deprecated", which incorrectly flags
values like auto and TorchSampler as deprecated; remove the status="deprecated"
argument from the sampler_type Field so the parameter itself is not deprecated,
keep/update the description to explicitly state only SamplerType.TRTLLMSampler
(TRTLLMSampler) is deprecated and ensure any runtime warning continues to
trigger only when sampler_type == SamplerType.TRTLLMSampler; this involves
editing the sampler_type Field call and the description text around
SamplerType/TRTLLMSampler and leaving SamplerType.auto/TorchSampler behavior
unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8aeee868-37bc-4f11-a957-1208be1f3ebf

📥 Commits

Reviewing files that changed from the base of the PR and between 5b0c956 and 5a2cd58.

📒 Files selected for processing (6)

docs/source/features/sampling.md
tensorrt_llm/_torch/pyexecutor/_util.py
tensorrt_llm/llmapi/llm_args.py
tests/unittest/_torch/modeling/test_modeling_nemotron_h.py
tests/unittest/_torch/ray_orchestrator/multi_gpu/test_accuracy_with_allreduce_strategy.py
tests/unittest/api_stability/references/llm.yaml

💤 Files with no reviewable changes (1)

tests/unittest/_torch/modeling/test_modeling_nemotron_h.py

docs/source/features/sampling.md

tensorrt_llm/llmapi/llm_args.py

Funatiq · 2026-03-23T15:09:12Z

/bot run

tensorrt-cicd · 2026-03-23T15:15:25Z

PR_Github #39951 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-23T18:18:14Z

PR_Github #39951 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31116 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-23T20:16:14Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-23T20:22:36Z

PR_Github #39974 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-24T03:15:44Z

PR_Github #39974 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31136 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-24T07:51:24Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-24T07:57:03Z

PR_Github #40086 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-24T10:30:15Z

PR_Github #40086 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31238 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-24T16:15:33Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-24T16:21:35Z

PR_Github #40148 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-24T18:18:24Z

PR_Github #40148 [ run ] completed with state FAILURE. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31294 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-24T19:08:46Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-24T19:15:43Z

PR_Github #40161 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-24T22:19:22Z

PR_Github #40161 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31305 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-25T09:11:44Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-25T09:18:02Z

PR_Github #40308 [ run ] triggered by Bot. Commit: b720871 Link to invocation

tensorrt-cicd · 2026-03-25T16:41:51Z

PR_Github #40308 [ run ] completed with state SUCCESS. Commit: b720871
/LLM/main/L0_MergeRequest_PR pipeline #31419 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

Funatiq · 2026-03-27T15:46:32Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-27T15:52:22Z

PR_Github #40509 [ run ] triggered by Bot. Commit: 391d683 Link to invocation

tensorrt-cicd · 2026-03-27T23:03:05Z

PR_Github #40509 [ run ] completed with state SUCCESS. Commit: 391d683
/LLM/main/L0_MergeRequest_PR pipeline #31597 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-30T06:50:25Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-30T06:56:04Z

PR_Github #40685 [ run ] triggered by Bot. Commit: 391d683 Link to invocation

tensorrt-cicd · 2026-03-30T12:12:12Z

PR_Github #40685 [ run ] completed with state SUCCESS. Commit: 391d683
/LLM/main/L0_MergeRequest_PR pipeline #31714 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq · 2026-03-30T13:11:07Z

/bot run --disable-fail-fast

github-actions bot assigned Funatiq Mar 5, 2026

ixlmar approved these changes Mar 6, 2026

View reviewed changes

Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from b04bf0d to 5a2cd58 Compare March 6, 2026 13:45

Funatiq marked this pull request as ready for review March 7, 2026 07:38

Funatiq requested review from a team as code owners March 7, 2026 07:38

Funatiq requested review from Superjomn, Wanli-Jiang, chang-l, danielafrimi, joyang-nv, kaiyux and lucaslie March 7, 2026 07:38

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

docs/source/features/sampling.md Outdated Show resolved Hide resolved

tensorrt_llm/llmapi/llm_args.py Outdated Show resolved Hide resolved

Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from 5a2cd58 to 06c1c8f Compare March 9, 2026 14:27

Funatiq marked this pull request as ready for review March 25, 2026 09:07

achartier approved these changes Mar 26, 2026

View reviewed changes

Funatiq added 2 commits March 27, 2026 16:46

[chore] Mark TRTLLMSampler as deprecated

d530c85

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

[docs] Update documentation for TRTLLMSampler deprecation

391d683

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

Funatiq force-pushed the dev/deprecate_trtllm_sampler branch from b720871 to 391d683 Compare March 27, 2026 15:46

Conversation

Funatiq commented Mar 5, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Description

Test Coverage

PR Checklist

GitHub Bot Help

Uh oh!

ixlmar left a comment

Choose a reason for hiding this comment

Uh oh!

Funatiq commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

Funatiq commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

Funatiq commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

tensorrt-cicd commented Mar 6, 2026

Uh oh!

Funatiq commented Mar 6, 2026

Uh oh!

coderabbitai bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Funatiq commented Mar 23, 2026

Uh oh!

tensorrt-cicd commented Mar 23, 2026

Uh oh!

tensorrt-cicd commented Mar 23, 2026

Uh oh!

Funatiq commented Mar 23, 2026

Uh oh!

tensorrt-cicd commented Mar 23, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

Funatiq commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

Funatiq commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

Funatiq commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

tensorrt-cicd commented Mar 24, 2026

Uh oh!

Funatiq commented Mar 25, 2026

Uh oh!

tensorrt-cicd commented Mar 25, 2026

Uh oh!

tensorrt-cicd commented Mar 25, 2026

Uh oh!

Funatiq commented Mar 5, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 7, 2026 •

edited

Loading