[Bug fix 5528642] [Bug fix 5528695] VLM NVBug fix #355
Conversation
…vila Signed-off-by: Yue <[email protected]>
Walkthrough

The PR updates model type handling in examples/llm_ptq/hf_ptq.py by recalculating model_type from the language submodel after the swap to model.language_model. It also modifies examples/vlm_ptq/scripts/huggingface_example.sh to detect "vila" models via a case-insensitive substring match on MODEL_NAME, adjusting dependency installation and repo cloning conditions.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Script as hf_ptq.py
    participant Model
    participant SubModel as model.language_model
    User->>Script: Run PTQ example
    Script->>Model: Load model
    Script->>SubModel: Access language_model
    Note right of Script: Recompute model_type = get_model_type(SubModel)
    Script->>Script: Branch on updated model_type
    Script-->>User: Proceed with PTQ steps based on submodel type
```
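In code, the first diagram boils down to an attribute swap plus a re-lookup; a minimal, self-contained sketch (FakeVLM and its attributes are hypothetical stand-ins, not the real hf_ptq.py objects):

```python
# Minimal sketch of the swap described above; class and attribute names are hypothetical.
class FakeLanguageModel:
    model_type = "llama"              # what get_model_type should report for the submodel

class FakeVLM:
    model_type = "unknown"            # the container itself is not a recognized decoder type
    language_model = FakeLanguageModel()

def get_model_type(model):
    # Toy stand-in for modelopt's get_model_type lookup.
    return getattr(model, "model_type", "unknown")

model = FakeVLM()
model = model.language_model          # swap to the language submodel
model_type = get_model_type(model)    # recompute so later branches see "llama", not "unknown"
assert model_type == "llama"
```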
```mermaid
sequenceDiagram
    autonumber
    actor User
    participant SH as huggingface_example.sh
    participant Env as MODEL_NAME/MODEL_TYPE
    User->>SH: Execute script with model params
    SH->>Env: Read MODEL_NAME
    Note right of SH: Check if lower(MODEL_NAME) contains "vila"
    alt MODEL_NAME contains "vila"
        SH->>SH: Install Vila deps & clone repo
    else Not Vila
        SH->>SH: Skip Vila-specific steps
    end
    SH-->>User: Continue with remaining setup
```
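The detection step in the second diagram relies on Bash's lowercase parameter expansion; a minimal standalone sketch (the sample model name is arbitrary):

```bash
#!/usr/bin/env bash
MODEL_NAME="VILA1.5-3b"                       # example value; any casing works
if [[ "${MODEL_NAME,,}" == *"vila"* ]]; then  # ${VAR,,} lowercases the value (Bash 4+)
    echo "vila-specific setup would run"
else
    echo "vila-specific setup skipped"
fi
```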
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/vlm_ptq/scripts/huggingface_example.sh (1)
Lines 76-87: Broader VILA detection is good; fix clone location for HF repo IDs.

Substring check on MODEL_NAME works, but cloning to "$(dirname "$MODEL_PATH")/VILA" fails when MODEL_PATH is an HF repo id (no local parent dir). Prefer a safe base (ROOT_SAVE_PATH) when dirname doesn't exist, avoid cwd side-effects, and export PYTHONPATH so imports resolve.
Apply this diff:
-if [[ "${MODEL_NAME,,}" == *"vila"* ]]; then - # Install required dependency for VILA - pip install -r ../vlm_ptq/requirements-vila.txt - # Clone original VILA repo - if [ ! -d "$(dirname "$MODEL_PATH")/VILA" ]; then - echo "VILA repository is needed until it is added to HF model zoo. Cloning the repository parallel to $MODEL_PATH..." - git clone https://github.com/Efficient-Large-Model/VILA.git "$(dirname "$MODEL_PATH")/VILA" && \ - cd "$(dirname "$MODEL_PATH")/VILA" && \ - git checkout ec7fb2c264920bf004fd9fa37f1ec36ea0942db5 && \ - cd "$script_dir/.." - fi -fi +if [[ "${MODEL_NAME,,}" == *"vila"* ]]; then + # Install required dependency for VILA + python3 -m pip install -r ../vlm_ptq/requirements-vila.txt + # Clone original VILA repo next to local model path if it exists, otherwise under ROOT_SAVE_PATH. + if [ -d "$(dirname "$MODEL_PATH")" ]; then + CLONE_BASE="$(dirname "$MODEL_PATH")" + else + CLONE_BASE="$ROOT_SAVE_PATH" + fi + VILA_DIR="$CLONE_BASE/VILA" + if [ ! -d "$VILA_DIR/.git" ]; then + echo "VILA repository is needed until it is added to HF model zoo. Cloning into $VILA_DIR ..." + git clone https://github.com/Efficient-Large-Model/VILA.git "$VILA_DIR" + ( cd "$VILA_DIR" && git checkout ec7fb2c264920bf004fd9fa37f1ec36ea0942db5 ) + fi + export PYTHONPATH="$VILA_DIR:${PYTHONPATH:-}" +fi
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- examples/llm_ptq/hf_ptq.py (1 hunks)
- examples/vlm_ptq/scripts/huggingface_example.sh (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
examples/llm_ptq/hf_ptq.py (2)
- modelopt/torch/export/model_utils.py (1): get_model_type (66-71)
- modelopt/onnx/llm_export_utils/export_utils.py (1): get_model_type (47-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (1)
examples/vlm_ptq/scripts/huggingface_example.sh (1)
Lines 76-79: LGTM — case-insensitive substring on MODEL_NAME verified. A test script matches VILA1.5-3b, vila, and fooVILAbar, and does not match names without "vila".
```python
        model = model.language_model
        model_type = get_model_type(model)

    if args.sparsity_fmt != "dense":
```
Recompute quantized-state (and device) after swapping to submodel.
Great call to refresh model_type on the language submodel. However, gating later on model_is_already_quantized (computed before the swap) can now be wrong for VLMs whose container is quantized but language_model is not. Also refresh device in case the submodule is on a different device.
Apply this diff:
```diff
         model = model.language_model
-        model_type = get_model_type(model)
+        model_type = get_model_type(model)
+        # Keep subsequent logic consistent with the sub-model we actually operate on.
+        model_is_already_quantized = is_quantized(model)
+        device = getattr(model, "device", device)
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        model = model.language_model
        model_type = get_model_type(model)
        # Keep subsequent logic consistent with the sub-model we actually operate on.
        model_is_already_quantized = is_quantized(model)
        device = getattr(model, "device", device)

    if args.sparsity_fmt != "dense":
```
🤖 Prompt for AI Agents
In examples/llm_ptq/hf_ptq.py around lines 318 to 321, after swapping to the
language submodel (model = model.language_model) and refreshing model_type, also
recompute model_is_already_quantized and device there so they reflect the
submodel state (the container may be quantized while language_model is not, or
the submodule may be on a different device). Update the code to move or
duplicate the logic that sets model_is_already_quantized and device to
immediately follow the swap and model_type refresh, ensuring subsequent gates
(e.g., the args.sparsity_fmt != "dense" branch) use the corrected values.
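To illustrate why the comment above asks to recompute the quantized flag after the swap, here is a toy sketch; is_quantized below is a stand-in, not the modelopt helper:

```python
# Toy containers; attribute names are illustrative, not the real model classes.
class Sub:
    quantized = False                    # language_model not yet quantized

class Container:
    quantized = True                     # e.g. vision tower already quantized
    language_model = Sub()

def is_quantized(m):
    # Stand-in for the real check, just to show the before/after mismatch.
    return m.quantized

model = Container()
flag_before_swap = is_quantized(model)   # True, computed on the container
model = model.language_model
flag_after_swap = is_quantized(model)    # False, the value later gates should actually use
assert flag_before_swap != flag_after_swap
```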
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #355      +/-   ##
==========================================
- Coverage   73.84%   73.83%   -0.01%
==========================================
  Files         172      172
  Lines       17453    17453
==========================================
- Hits        12888    12887       -1
- Misses       4565     4566       +1
```

☔ View full report in Codecov by Sentry.
```python
                mtq.quantize(child, disabled_quant_cfg, forward_loop=None)

        model = model.language_model
        model_type = get_model_type(model)
```
do we still need this model_type?
Only for the int8_sq format, since the model is later exported to a tensorrt_llm checkpoint. Without this line, model_type would be reported as unknown, as the NVBug shows.
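As a rough illustration of that failure mode (a toy stand-in, not the actual modelopt export path; the set of known types below is made up):

```python
def pick_decoder_type(model_type: str) -> str:
    # Toy stand-in for the export-time branch that maps model_type to a tensorrt_llm decoder type.
    known = {"llama", "llava", "gptnext"}
    if model_type not in known:
        raise ValueError(f"Cannot export int8_sq checkpoint: unknown model_type {model_type!r}")
    return model_type

pick_decoder_type("llama")      # fine once model_type is recomputed on model.language_model
# pick_decoder_type("unknown")  # raises, which is the symptom the NVBug reports
```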
Signed-off-by: Yue <[email protected]>
Signed-off-by: Yue <[email protected]>
Signed-off-by: Yue <[email protected]>
Signed-off-by: Ye Yu <[email protected]>
Fix bugs for unknown model_type of llava and fix path dependency for vila
What does this PR do?
Type of change: Bug fix

Overview: This PR fixes two NVBugs: https://nvbugs/5528642 and https://nvbugs/5528695
Usage
# Add a code snippet demonstrating how to use this
Testing
Before your PR is "Ready for review"
Additional Information