@cjluo-nv cjluo-nv commented Sep 29, 2025

What does this PR do?

Type of change: Minor change

Overview:

For hf_ptq.py to work on phi4mm, the user must modify the phi4mm modeling file (modeling_phi4mm.py in the traceback below) and set the default input mode to language. Otherwise, the following error occurs:

  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/utils/dataset_utils.py", line 284, in get_max_batch_size
    infer_method(sample_input_single_batch)
  File "/root/.cache/huggingface/modules/transformers_modules/Phi-4-multimodal-instruct/modeling_phi4mm.py", line 2101, in forward
    input_mode = InputMode(input_mode)
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/enum.py", line 757, in __call__
    return cls.__new__(cls, value)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/enum.py", line 1171, in __new__
    raise ve_exc
ValueError: None is not a valid InputMode
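
The traceback above comes from calling an Enum class with None. The following is a minimal sketch of that failure and of a language-mode default, not the actual model code: the `InputMode` class here is a stand-in whose member values are assumptions, and `resolve_input_mode` and `describe_failure` are hypothetical helpers introduced only for illustration.

```python
from enum import Enum


class InputMode(Enum):
    # Stand-in for the enum in modeling_phi4mm.py.
    # Member names other than LANGUAGE and all values are assumptions.
    LANGUAGE = 0
    VISION = 1
    SPEECH = 2
    VISION_SPEECH = 3


def resolve_input_mode(input_mode=None, default=InputMode.LANGUAGE):
    """Hypothetical helper: fall back to a default when no mode is given."""
    if input_mode is None:
        return default
    # Enum lookup by value; raises ValueError for unknown values.
    return InputMode(input_mode)


def describe_failure():
    """Reproduce the error from the traceback and return its message."""
    try:
        InputMode(None)  # None is not a member value, so lookup fails
    except ValueError as exc:
        return str(exc)
    return None
```

With a default in place, `resolve_input_mode()` returns `InputMode.LANGUAGE` instead of raising, which is the effect the PR description asks users to achieve by editing the model file.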

Summary by CodeRabbit

  • Bug Fixes
    • Added a clearer runtime warning for phi4mm models advising users to set InputMode.LANGUAGE before quantizing.
    • Warning now appears earlier (immediately after model-type detection) and the redundant later warning was removed to reduce noise.
    • No changes to quantization outputs or performance; behavior unchanged.

Signed-off-by: Chenjie Luo <[email protected]>
@cjluo-nv cjluo-nv requested a review from a team as a code owner September 29, 2025 15:38
@cjluo-nv cjluo-nv requested a review from meenchen September 29, 2025 15:38

coderabbitai bot commented Sep 29, 2025

Walkthrough

A runtime warning for the phi4mm model type was relocated within examples/llm_ptq/hf_ptq.py. The warning now appears immediately after determining language-model extraction, advising to set InputMode.LANGUAGE before quantization. No functional control flow or behavior beyond warning timing was changed.
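
Why the timing matters can be sketched as follows. This is a hypothetical outline, not the code in hf_ptq.py: `run_ptq`, `failing_calibration`, and `demo` are illustrative names, and the calibration step is simulated by a function that raises the same ValueError as in the PR description. The point is that a warning emitted right after model-type detection is already recorded when calibration fails.

```python
import warnings


def run_ptq(model_type, calibrate):
    """Hypothetical flow: warn right after model-type detection, then calibrate."""
    if model_type == "phi4mm":
        warnings.warn(
            "Please set the default input_mode to InputMode.LANGUAGE before quantizing."
        )
    # Calibration runs after the warning, so the hint is visible even if it fails.
    calibrate()


def failing_calibration():
    """Simulates the calibration failure reported in the PR description."""
    raise ValueError("None is not a valid InputMode")


def demo():
    """Return (recorded warning messages, error message) for a phi4mm run."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            run_ptq("phi4mm", failing_calibration)
        except ValueError as exc:
            return [str(w.message) for w in caught], str(exc)
    return [], None
```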

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PTQ warning timing update: examples/llm_ptq/hf_ptq.py | Moved the phi4mm-specific runtime warning to immediately follow language-model detection and removed its previous occurrence to avoid duplication. No logic or control-flow modifications beyond warning timing. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Poem

A whisk of code, a gentle nudge,
I hop and shift a warning’s judge—
Not sooner, late, but just in place,
So quantizers keep a steady pace. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Title Check | ❓ Inconclusive | The title correctly identifies that a phi4_mm warning is being moved, but the phrase “to above” lacks context about where the warning is being relocated, which may leave reviewers uncertain about the intended change location. | Please clarify the target location and file in the title, for example: “Move phi4_mm InputMode warning above language-model extraction in hf_ptq.py”, to make the change immediately understandable. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d98b37 and b25fdf1.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/llm_ptq/hf_ptq.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
examples/llm_ptq/hf_ptq.py (1)

331-335: Make phi4mm warning actionable with explicit error and stacklevel
Tighten the message to specify the calibration error and add stacklevel for clearer tracebacks.

 if model_type == "phi4mm":
-    warnings.warn(
-        "Please set the default input_mode to InputMode.LANGUAGE before quantizing."
-    )
+    warnings.warn(
+        "Phi4 multimodal model requires input_mode=InputMode.LANGUAGE (or calibration will raise ValueError: 'None is not a valid InputMode').",
+        stacklevel=2,
+    )
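
The effect of `stacklevel=2` in the suggestion above can be checked with the standard `warnings` module. This is a small sketch under stated assumptions: `warn_if_phi4mm` and `capture` are hypothetical helpers, not functions from hf_ptq.py, and the message text is copied from the suggested diff.

```python
import warnings

MESSAGE = (
    "Phi4 multimodal model requires input_mode=InputMode.LANGUAGE "
    "(or calibration will raise ValueError: 'None is not a valid InputMode')."
)


def warn_if_phi4mm(model_type, stacklevel=2):
    """Hypothetical helper wrapping the check from the suggested diff."""
    if model_type == "phi4mm":
        # stacklevel=1 attributes the warning to this line;
        # stacklevel=2 attributes it to the caller of this function.
        warnings.warn(MESSAGE, stacklevel=stacklevel)


def capture(model_type, stacklevel=2):
    """Return the warnings recorded while running the check."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warn_if_phi4mm(model_type, stacklevel=stacklevel)  # the "caller" line
    return caught
```

Each recorded entry exposes `filename` and `lineno`. With `stacklevel=2` they refer to the line inside `capture` that called the helper, while `stacklevel=1` would report the `warnings.warn` call itself. Note that `stacklevel=2` only changes the attribution when the warning is raised inside a helper function; for a call at module level there is no caller frame to point at.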
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 615f3c0 and 2d98b37.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: code-quality
  • GitHub Check: build-docs

codecov bot commented Sep 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.79%. Comparing base (70abfb4) to head (b25fdf1).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #389   +/-   ##
=======================================
  Coverage   73.79%   73.79%           
=======================================
  Files         171      171           
  Lines       17583    17583           
=======================================
  Hits        12975    12975           
  Misses       4608     4608           
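
The figures in the report are internally consistent, which can be verified with a short calculation. The sketch below assumes coverage is simply hits divided by total lines, rounded to two decimals; `coverage_percent` is an illustrative helper, not part of Codecov.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Line coverage as a percentage, rounded to two decimal places."""
    return round(100.0 * hits / lines, 2)


# Numbers taken from the coverage diff above.
HITS, MISSES, LINES = 12975, 4608, 17583

# Hits and misses should add up to the total number of tracked lines.
TOTALS_MATCH = (HITS + MISSES == LINES)
COVERAGE = coverage_percent(HITS, LINES)
```

Since the PR only moves a warning in an example script, the identical base and head numbers are what one would expect.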

@cjluo-nv cjluo-nv enabled auto-merge (squash) September 30, 2025 17:22
@cjluo-nv cjluo-nv merged commit 17439e6 into main Sep 30, 2025
27 checks passed
@cjluo-nv cjluo-nv deleted the cjluo-nv-patch-1 branch September 30, 2025 18:24
kevalmorabia97 pushed a commit that referenced this pull request Oct 1, 2025