[None][chore] Fix failing KV Cache Transceiver Tests from #11574 by ekou24 · Pull Request #12554 · NVIDIA/TensorRT-LLM

ekou24 · 2026-03-26T07:23:15Z

Fixes the failing KV Cache Transceiver tests (Python and C++) that were added in PR #11574.

Summary by CodeRabbit

Tests
- Enhanced test coverage for distributed execution and resource management features
- Updated test validation to align with improved internal component structures
Chores
- Restructured internal test infrastructure for better maintainability and consistency

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Ethan Kou <ekou@eos0147.eos.clusters.nvidia.com>

ekou24 · 2026-03-26T07:23:45Z

/bot run --disable-fail-fast

coderabbitai · 2026-03-26T07:30:26Z

📝 Walkthrough

This PR updates test files across the disaggregated module to reflect API changes: imports are adjusted to new module structure, slot allocation expectations shift from raw integers to AuxSlot objects, pool descriptor tests migrate to resource/page abstraction validation, and RankInfo tests simplify with reduced constructor parameters.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test List Configuration `tests/integration/test_lists/test-db/l0_h100.yml` | Added seven new test targets to the `l0_h100` PyTorch/MPI test list under `unittest/disaggregated/region` and `unittest/disaggregated` modules. |
| Disaggregated Module Tests `tests/unittest/disaggregated/region/test_aux.py`, `tests/unittest/disaggregated/region/test_page.py`, `tests/unittest/disaggregated/test_rank_info.py` | Updated imports to match new module structure; modified slot allocation tests to use `AuxSlot` objects instead of raw integers; transitioned `test_page.py` from `PoolDescriptor` validation to resource/page abstraction serialization roundtrip checks; simplified `RankInfo` constructor expectations and removed deprecated fields like `kv_ptrs`, `aux_ptrs`, and device parameters. |
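To illustrate the kind of assertion change described above (slot allocation returning `AuxSlot` objects rather than raw integers), here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: `ToyAuxBuffer`, its `alloc_slot`/`free_slot` methods, and the `id` field on this stand-in `AuxSlot` are hypothetical and are not taken from the TensorRT-LLM sources or from this PR's diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuxSlot:
    """Hypothetical stand-in for a slot object; not the TensorRT-LLM class."""
    id: int


class ToyAuxBuffer:
    """Hypothetical allocator that hands out AuxSlot objects instead of ints."""

    def __init__(self, max_slots: int) -> None:
        # Free slot ids, handed out in ascending order.
        self._free = list(range(max_slots))

    def alloc_slot(self) -> AuxSlot:
        if not self._free:
            raise ValueError("no free slots")
        return AuxSlot(id=self._free.pop(0))

    def free_slot(self, slot: AuxSlot) -> None:
        # Return the slot id to the pool so it can be reused.
        self._free.append(slot.id)


def test_alloc_returns_slot_objects() -> None:
    buf = ToyAuxBuffer(max_slots=2)
    first = buf.alloc_slot()
    # A test written against an integer-returning allocator would have
    # asserted `first == 0`; with slot objects the id is checked instead.
    assert isinstance(first, AuxSlot)
    assert first.id == 0
    assert buf.alloc_slot().id == 1
```

A test written this way keeps passing if the slot object later gains extra fields, whereas a comparison against a bare integer breaks as soon as the return type changes.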

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The pull request description lacks substantive content and consists primarily of the template with empty sections. | Complete the Description section explaining what tests are being fixed and why; fill Test Coverage section with specific test references; provide concrete details beyond the PR title. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main change as fixing failing KV Cache Transceiver Tests referenced in `#11574`, directly matching the changeset which updates test files and test lists. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment `@coderabbitai help` to get the list of available commands and usage tips.

ekou24 · 2026-03-26T07:53:06Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-26T07:58:43Z

PR_Github #40434 [ run ] triggered by Bot. Commit: 2542d70 Link to invocation

tensorrt-cicd · 2026-03-26T16:37:58Z

PR_Github #40434 [ run ] completed with state SUCCESS. Commit: 2542d70
/LLM/main/L0_MergeRequest_PR pipeline #31526 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

ekou24 · 2026-03-26T17:57:19Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-26T18:03:39Z

PR_Github #40463 [ run ] triggered by Bot. Commit: 2542d70 Link to invocation

tensorrt-cicd · 2026-03-26T20:17:36Z

PR_Github #40463 [ run ] completed with state FAILURE. Commit: 2542d70
/LLM/main/L0_MergeRequest_PR pipeline #31552 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

ekou24 · 2026-03-26T23:30:31Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-26T23:39:55Z

PR_Github #40467 [ run ] triggered by Bot. Commit: 2542d70 Link to invocation

tensorrt-cicd · 2026-03-27T01:21:59Z

PR_Github #40467 [ run ] completed with state SUCCESS. Commit: 2542d70
/LLM/main/L0_MergeRequest_PR pipeline #31556 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

ekou24 · 2026-03-27T19:54:18Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-03-27T20:00:48Z

PR_Github #40514 [ run ] triggered by Bot. Commit: 2542d70 Link to invocation

kv transceiver fix failing bugs

2542d70

Signed-off-by: Ethan Kou <ekou@eos0147.eos.clusters.nvidia.com>

github-actions bot assigned ekou24 Mar 26, 2026

ekou24 changed the title ~~[None][Chore] Fix failing KV Cache Transceiver Tests from #11574~~ [None][chore] Fix failing KV Cache Transceiver Tests from #11574 Mar 26, 2026

ekou24 requested a review from Shixiaowei02 March 27, 2026 19:54
