
Conversation

@yechank-nvidia (Collaborator) commented Nov 4, 2025

Summary by CodeRabbit

Release Notes

  • Refactor
    • Updated input processor initialization patterns to support more flexible configuration handling across multiple model variants.
    • Enhanced constructor interfaces to better align with parent class requirements while maintaining backward compatibility.

Signed-off-by: yechank <[email protected]>
@yechank-nvidia yechank-nvidia requested review from a team as code owners November 4, 2025 16:48
@yechank-nvidia yechank-nvidia self-assigned this Nov 4, 2025
@yechank-nvidia yechank-nvidia added the Multimodal Label for issues & PRs regarding Multimodal related objects label Nov 4, 2025
@yechank-nvidia (Collaborator, Author) commented:

/bot run

@tensorrt-cicd (Collaborator) commented:

PR_Github #23526 [ run ] triggered by Bot. Commit: 7945cc5

coderabbitai bot (Contributor) commented Nov 4, 2025

📝 Walkthrough

This PR updates input processor classes across multiple model implementations to accept and forward arbitrary keyword arguments (**kwargs) to their parent classes during initialization, enabling more flexible constructor configuration at the protocol and implementation levels.
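
For orientation, the per-file change reduces to the following shape. This is a minimal sketch assuming a cooperative base class; BaseInputProcessor, ExampleInputProcessor, and the constructor parameters are illustrative names, not the actual TRT-LLM definitions:

    class BaseInputProcessor:
        def __init__(self, **kwargs):
            # Terminates the chain and tolerates forwarded options.
            self.extra_options = kwargs

    class ExampleInputProcessor(BaseInputProcessor):
        def __init__(self, model_path, config, tokenizer, **kwargs):
            super().__init__(**kwargs)  # previously: super().__init__()
            self.model_path = model_path
            self.config = config
            self.tokenizer = tokenizer

    # New options now reach the base class without touching every subclass:
    proc = ExampleInputProcessor("dummy/path", None, None, trust_remote_code=True)
    assert proc.extra_options == {"trust_remote_code": True}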

Changes

Cohort / File(s) — Summary

  • Model-specific input processors (tensorrt_llm/_torch/models/modeling_gemma3vl.py, modeling_hyperclovax.py, modeling_llama.py, modeling_llava_next.py, modeling_mistral.py, modeling_nanov2vlm.py, modeling_phi4mm.py, modeling_qwen2vl.py) — Updated the __init__ signatures of all input processor classes to accept **kwargs and forward them to the superclass via super().__init__(**kwargs) instead of calling super().__init__() with no arguments, so additional initialization parameters propagate to parent classes.
  • Protocol definition (tensorrt_llm/inputs/registry.py) — Added an __init__ method to the InputProcessor protocol that takes **kwargs and forwards to the superclass, aligning protocol expectations with concrete implementation signatures.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

  • Rationale: Changes follow an identical repetitive pattern across all files (adding **kwargs to signatures and updating super() calls). While 9 files are affected, the homogeneity of modifications reduces cognitive load per file.
  • Areas requiring attention:
    • Verify that parent/superclasses of all InputProcessor implementations accept **kwargs in their __init__ methods (a quick inspection sketch follows this list)
    • Confirm the protocol change in registry.py is compatible with all concrete implementations throughout the codebase
    • Check for any external code or tests that directly instantiate these input processors and may be affected by the signature changes
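
A hypothetical way to automate the first check above (parents_accept_kwargs is an illustrative helper, not an existing TRT-LLM utility): walk each concrete processor's MRO and inspect the inherited __init__ signatures.

    import inspect

    def parents_accept_kwargs(cls) -> bool:
        """Return True if every inherited __init__ between cls and object
        declares a **kwargs parameter."""
        for base in cls.__mro__[1:-1]:  # skip cls itself and object
            init = base.__dict__.get("__init__")
            if init is None:
                continue  # base inherits __init__ from further up; skip
            params = inspect.signature(init).parameters.values()
            if not any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
                return False
        return True

Running this against each concrete input processor class in a quick test would surface any base with a fixed __init__ signature before kwargs reach it at runtime.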

Pre-merge checks

❌ Failed checks (1 warning)
  • Description check — ⚠️ Warning. The pull request lacks the description section required by the template; no explanation of the issue, the solution, or test coverage is provided. Resolution: add a detailed description explaining the problem being solved, why the **kwargs changes are needed for the dummy builder, and the relevant test coverage.

✅ Passed checks (1 passed)
  • Title check — ✅ Passed. The title clearly identifies a fix for the multimodal InputProcessor dummy builder, directly corresponding to the changes made across multiple input processor implementations.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (16)
tensorrt_llm/inputs/registry.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).

Per guidelines, prepend the NVIDIA Apache-2.0 copyright header to this source file.


350-357: Pass a real SamplingParams to __call__ (None will crash).

get_dummy_prompt() calls self(..., None), but implementations access fields on sampling_params. Pass a minimal instance.

Apply this diff:

-            prompt_token_ids_single_img, _ = self(test_mm_prompt, None)
+            # Use minimal sampling params; callers can override as needed.
+            prompt_token_ids_single_img, _ = self(
+                test_mm_prompt,
+                SamplingParams(),
+            )
tensorrt_llm/_torch/models/modeling_llama.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


1-20: Python 3.8 annotations compatibility.

This file contains PEP 585 generics (e.g., tuple[...], list[...]) in many signatures. Add from __future__ import annotations at the top, or switch to typing.* equivalents, to meet the Python 3.8+ requirement.
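
A minimal sketch of the suggested fix; pad_batches is an illustrative function, not code from this file:

    from __future__ import annotations  # annotations become strings, never evaluated

    def pad_batches(batches: list[tuple[int, ...]]) -> list[int]:
        # Without the future import, `list[tuple[int, ...]]` raises
        # `TypeError: 'type' object is not subscriptable` on Python 3.8
        # the moment the module is imported.
        return [len(batch) for batch in batches]

The alternative is typing.List[typing.Tuple[int, ...]], which works on 3.8 without the import.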

tensorrt_llm/_torch/models/modeling_mistral.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


1-40: Python 3.8 annotations compatibility.

Consider adding from __future__ import annotations to support list[...], tuple[...] usages.

tensorrt_llm/_torch/models/modeling_gemma3vl.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


1-30: Python 3.8 annotations compatibility.

Add from __future__ import annotations to handle built-in generics.

tensorrt_llm/_torch/models/modeling_hyperclovax.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


1-40: Python 3.8 annotations compatibility.

Add from __future__ import annotations to cover list[...] etc.

tensorrt_llm/_torch/models/modeling_llava_next.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


416-422: Python 3.8 type annotation fix for _pad_for_batching.

list[torch.Tensor] requires 3.9+. For 3.8, add from __future__ import annotations at file top or switch to List[torch.Tensor].

tensorrt_llm/_torch/models/modeling_nanov2vlm.py (2)

1-3: Replace header with NVIDIA Apache-2.0 (2025).

Current line 1 is a different notice. Per guidelines, use the standard NVIDIA Apache-2.0 header.


1-50: Python 3.8 annotations compatibility.

Add from __future__ import annotations or use typing.* forms to support 3.8.

tensorrt_llm/_torch/models/modeling_qwen2vl.py (2)

1-1: Add NVIDIA Apache-2.0 header (2025).


1-60: Python 3.8 annotations compatibility.

This file heavily uses built-in generics in annotations. Add from __future__ import annotations or switch to typing.*.

🧹 Nitpick comments (3)
tensorrt_llm/inputs/registry.py (3)

41-44: Use typing.Any, not bare any, in annotations.

any is a function, not a type. Replace with typing.Any for clarity and type-checking.

Apply this diff:

-    model_path: any
-    config: any
-    tokenizer: any
+    model_path: Any
+    config: Any
+    tokenizer: Any

95-105: Avoid bare except; at least catch Exception.

Catching everything hides real bugs. Narrow if possible; otherwise use except Exception as e.

Apply this diff:

-            except:
+            except Exception:
@@
-                except:
+                except Exception:

Also applies to: 109-117


609-617: Docstring param name mismatch (nit).

Docstring refers to original_processor, but the real parameter is input_processor. Align to avoid confusion.

-    Args:
-        original_processor: The original input processor to wrap.
+    Args:
+        input_processor: The original input processor to wrap.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2b2675 and 7945cc5.

📒 Files selected for processing (9)
  • tensorrt_llm/_torch/models/modeling_gemma3vl.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_hyperclovax.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_llama.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_llava_next.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_mistral.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_nanov2vlm.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_phi4mm.py (1 hunks)
  • tensorrt_llm/_torch/models/modeling_qwen2vl.py (1 hunks)
  • tensorrt_llm/inputs/registry.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

  • tensorrt_llm/inputs/registry.py
  • tensorrt_llm/_torch/models/modeling_qwen2vl.py
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tensorrt_llm/_torch/models/modeling_llava_next.py
  • tensorrt_llm/_torch/models/modeling_llama.py
  • tensorrt_llm/_torch/models/modeling_nanov2vlm.py
  • tensorrt_llm/_torch/models/modeling_hyperclovax.py
  • tensorrt_llm/_torch/models/modeling_phi4mm.py
  • tensorrt_llm/_torch/models/modeling_gemma3vl.py
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.

Files:

  • tensorrt_llm/inputs/registry.py
  • tensorrt_llm/_torch/models/modeling_qwen2vl.py
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tensorrt_llm/_torch/models/modeling_llava_next.py
  • tensorrt_llm/_torch/models/modeling_llama.py
  • tensorrt_llm/_torch/models/modeling_nanov2vlm.py
  • tensorrt_llm/_torch/models/modeling_hyperclovax.py
  • tensorrt_llm/_torch/models/modeling_phi4mm.py
  • tensorrt_llm/_torch/models/modeling_gemma3vl.py
**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

  • tensorrt_llm/inputs/registry.py
  • tensorrt_llm/_torch/models/modeling_qwen2vl.py
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tensorrt_llm/_torch/models/modeling_llava_next.py
  • tensorrt_llm/_torch/models/modeling_llama.py
  • tensorrt_llm/_torch/models/modeling_nanov2vlm.py
  • tensorrt_llm/_torch/models/modeling_hyperclovax.py
  • tensorrt_llm/_torch/models/modeling_phi4mm.py
  • tensorrt_llm/_torch/models/modeling_gemma3vl.py
🧠 Learnings (1)
📚 Learning: 2025-08-26T09:37:10.463Z
Learnt from: jiaganc
Repo: NVIDIA/TensorRT-LLM PR: 7031
File: tensorrt_llm/bench/dataclasses/configuration.py:90-104
Timestamp: 2025-08-26T09:37:10.463Z
Learning: In TensorRT-LLM, the `get_pytorch_perf_config()` method returns `self.pytorch_config` which can contain default `cuda_graph_config` values, so `llm_args` may already have this config before the extra options processing.

Applied to files:

  • tensorrt_llm/_torch/models/modeling_llava_next.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (10)
tensorrt_llm/inputs/registry.py (2)

133-137: Constructor chaining is fine once Protocol fix lands.

The super().__init__(**kwargs) pattern across mixins will work as intended after the Protocol swallows kwargs. No further changes needed here.

Please confirm no other base classes in the MRO implement __init__ with a fixed signature that would still receive kwargs.

Also applies to: 308-313
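
A minimal sketch of the cooperative chain this relies on, assuming the Protocol fix lands; all class names are illustrative, and use_fast echoes the hypothetical option mentioned later in this review:

    class ProcessorProtocolStub:  # stand-in for the fixed InputProcessor Protocol
        def __init__(self, **kwargs):
            pass  # swallows kwargs; the chain ends here safely

    class TokenizerMixin:
        def __init__(self, use_fast: bool = True, **kwargs):
            self.use_fast = use_fast
            super().__init__(**kwargs)  # forwards the rest down the MRO

    class ConcreteProcessor(TokenizerMixin, ProcessorProtocolStub):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)

    # MRO: ConcreteProcessor -> TokenizerMixin -> ProcessorProtocolStub -> object
    proc = ConcreteProcessor(use_fast=False)
    assert proc.use_fast is False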


1-12: File registry.py is already Python 3.8 compatible; no changes needed.

The file uses typing.List, typing.Dict, and typing.Tuple throughout, which are fully compatible with Python 3.8+. The review comment incorrectly claims the module uses PEP 585 generics (e.g., list[...], dict[...]), but a scan of the file confirms it does not. While other modules in the codebase use PEP 585 syntax (171 occurrences across the project), registry.py is not among them and requires no modifications.

Likely an incorrect or invalid review comment.

tensorrt_llm/_torch/models/modeling_llama.py (1)

1056-1059: Kwargs plumbing looks good; depends on Protocol fix.

Forwarding **kwargs and using super().__init__(**kwargs) enables future options (e.g., use_fast). Safe once InputProcessor.__init__ swallows kwargs.

Please confirm that call sites (e.g., create_input_processor) pass no unexpected kwargs today.

tensorrt_llm/_torch/models/modeling_mistral.py (1)

227-230: Kwargs passthrough approved.

super().__init__(**kwargs) aligns with the new protocol. No further changes needed here.

tensorrt_llm/_torch/models/modeling_gemma3vl.py (1)

45-48: Kwargs passthrough approved.

super().__init__(**kwargs) is correct and consistent with the registry.

tensorrt_llm/_torch/models/modeling_hyperclovax.py (1)

575-578: Kwargs passthrough approved.

Matches the cross-file pattern; OK after Protocol fix.

tensorrt_llm/_torch/models/modeling_llava_next.py (1)

46-49: Kwargs passthrough approved.

This unblocks future options via kwargs; relies on Protocol fix.

tensorrt_llm/_torch/models/modeling_nanov2vlm.py (1)

265-268: Kwargs passthrough approved.

OK, consistent with the new protocol.

tensorrt_llm/_torch/models/modeling_qwen2vl.py (1)

99-101: Kwargs passthrough approved.

super().__init__(**kwargs) is correct given the Protocol fix.

tensorrt_llm/_torch/models/modeling_phi4mm.py (1)

766-768: LGTM! Constructor signature updated to support flexible base class initialization.

The addition of **kwargs and forwarding to the base class via super().__init__(**kwargs) enables more flexible constructor configuration while maintaining backward compatibility with the existing trust_remote_code parameter. The implementation is clean and aligns with the protocol changes across the PR.
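
A minimal sketch of how the explicit parameter and **kwargs coexist; the class and base here are illustrative stand-ins, not the actual Phi4MM implementation:

    class _Base:
        def __init__(self, **kwargs):
            self.extra_options = kwargs

    class Phi4MMLikeProcessor(_Base):
        def __init__(self, trust_remote_code: bool = False, **kwargs):
            super().__init__(**kwargs)  # new base-class options ride along
            if not trust_remote_code:
                raise ValueError("trust_remote_code must be True for Phi4MM")
            self.trust_remote_code = trust_remote_code

    # Existing call sites keep working unchanged:
    proc = Phi4MMLikeProcessor(trust_remote_code=True)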

Comment on lines 269 to 270
raise ValueError("trust_remote_code must be True for Phi4MM")


⚠️ Potential issue | 🟡 Minor

Error message references wrong model (nit).

Message says “Phi4MM” in NanoV2VL input processor. Update for accuracy.

Apply this diff:

-            raise ValueError("trust_remote_code must be True for Phi4MM")
+            raise ValueError("trust_remote_code must be True for NanoV2VL")
📝 A committable suggestion with this change is attached; review it carefully before committing.
🧰 Tools
🪛 Ruff (0.14.3)

269-269: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In tensorrt_llm/_torch/models/modeling_nanov2vlm.py around lines 269 to 270, the
ValueError message incorrectly references "Phi4MM"; update the exception text to
accurately name the NanoV2VL model/input processor (for example: raise
ValueError("trust_remote_code must be True for NanoV2VL")). Replace the
erroneous string only, keeping the raise statement and logic unchanged.

Comment on lines +45 to +47
    def __init__(self, **kwargs):
        super().__init__(**kwargs)


⚠️ Potential issue | 🔴 Critical

Fix Protocol __init__ to avoid super() in MRO (TypeError risk).

InputProcessor is in the MRO of concrete processors; calling super().__init__(**kwargs) here can bubble kwargs up to ABC/object and crash. Swallow kwargs instead.

Apply this diff:

 class InputProcessor(Protocol):
@@
-    def __init__(self, **kwargs):
-        super().__init__(**kwargs)
+    def __init__(self, **kwargs) -> None:
+        # Protocol initializer intentionally swallows kwargs to terminate
+        # cooperative __init__ chains safely.
+        pass
📝 A committable suggestion matching the diff above is attached; review it carefully before committing.
🤖 Prompt for AI Agents
In tensorrt_llm/inputs/registry.py around lines 45 to 47, the
InputProcessor.__init__ currently calls super().__init__(**kwargs) which can
bubble unexpected kwargs up the MRO and raise a TypeError; change the
initializer to accept **kwargs but not call super — simply swallow/ignore the
kwargs (or explicitly pop any known args) so no unexpected keyword arguments are
passed to parent classes. Ensure the method signature remains def __init__(self,
**kwargs): and its body does nothing with kwargs (or documents ignored params).
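
A small self-contained repro of the failure mode described above, under the reviewer's assumption that object is next in the MRO after the Protocol; class names are illustrative:

    class BadTerminator:
        def __init__(self, **kwargs):
            super().__init__(**kwargs)  # next in the MRO is object

    class UsesBad(BadTerminator):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)

    try:
        UsesBad(trust_remote_code=True)
    except TypeError as exc:
        print(exc)  # object.__init__() takes exactly one argument (...)

    class GoodTerminator:
        def __init__(self, **kwargs):
            pass  # swallow kwargs; the chain ends safely

    class UsesGood(GoodTerminator):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)

    UsesGood(trust_remote_code=True)  # no error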

@tensorrt-cicd (Collaborator) commented:

PR_Github #23526 [ run ] completed with state SUCCESS. Commit: 7945cc5
/LLM/main/L0_MergeRequest_PR pipeline #17706 completed with status: 'FAILURE'

