Address security concerns in code #626
base: main
Conversation
Force-pushed 1c03c81 to 45b288f, then 45b288f to 8a6c466.

Signed-off-by: Keval Morabia <[email protected]>
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main     #626   +/-   ##
=======================================
  Coverage   74.64%   74.64%
=======================================
  Files         183      183
  Lines       18542    18542
=======================================
  Hits        13840    13840
  Misses       4702     4702
=======================================
```

☔ View full report in Codecov by Sentry.
@coderabbitai review

✅ Actions performed: review triggered.
Walkthrough: This pull request adds security documentation and validation across the ModelOpt codebase. Changes include security comments clarifying safe deserialization of ModelOpt-generated states, removal of pickle-based caching from data loading, and addition of a `--trust_calibration_data` CLI flag for ONNX quantization.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks:

- ❌ Failed checks: 1 warning, 1 inconclusive
- ✅ Passed checks: 1 passed
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modelopt/onnx/quantization/__main__.py (1)
50-59: **Fix mutual exclusion: `--trust_calibration_data` cannot be used with `--calibration_data_path`**

Placing `--trust_calibration_data` in the same mutually exclusive group as `--calibration_data_path` and `--calibration_cache_path` prevents users from enabling trusted pickle deserialization for a provided calibration file. The loading code at lines 271–287 explicitly expects both flags to be combinable, but argparse will reject `--calibration_data_path ... --trust_calibration_data` due to the mutual exclusion constraint.

The loading logic itself is sound: defaulting to `allow_pickle=False`, converting `.npz` to a dict, and emitting a clear error when pickles are needed without the trust flag. The issue is purely the argparse wiring.

Move `--trust_calibration_data` out of the mutually exclusive group:

```diff
-    group.add_argument(
-        "--trust_calibration_data",
-        action="store_true",
-        help="If True, trust the calibration data and allow pickle deserialization.",
-    )
+    parser.add_argument(
+        "--trust_calibration_data",
+        action="store_true",
+        help=(
+            "If True, trust the calibration data and allow pickle deserialization when "
+            "using --calibration_data_path."
+        ),
+    )
```

The same issue applies to lines 271–287, where the code attempts to use both flags together.
🧹 Nitpick comments (1)
modelopt/torch/quantization/plugins/attention.py (1)
210-221: **AST-based security validation is reasonable but may be overly strict**

The refactor to `_create_quantized_class_from_ast` (typed signature, optional `temp_file_name`, and the explicit note around `compile()` on an internally generated AST) looks good, and the compile-based path avoids `exec` on generated code.

The new string-based scan for suspicious patterns is conservative, but it can raise `ValueError` for otherwise safe classes whose source or docstrings merely mention tokens like `eval`, `compile`, or `os.system`. That would fail registration at import time.

If you start seeing false positives in practice, consider tightening this by:

- Inspecting the AST for actual `ast.Call` nodes to these symbols instead of scanning the unparsed source string, or
- Downgrading to a warning and skipping registration for that class (returning `False` from `register_attention_for_kv_quant`) rather than raising.

As-is, this is acceptable from a security standpoint; the above would just make it more robust against benign occurrences of these substrings.

Also applies to: 238-249, 271-275
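The first suggestion — inspecting actual `ast.Call` nodes — can be sketched as follows. This is an illustrative, standalone checker (the function name and exact deny-list are hypothetical, not the code in `attention.py`): it flags real calls to risky builtins while ignoring docstrings or string literals that merely mention them.

```python
import ast

DANGEROUS_NAMES = {"eval", "exec", "compile"}
DANGEROUS_ATTRS = {("os", "system"), ("subprocess", "call")}

def has_dangerous_calls(source: str) -> bool:
    """Return True only for real calls to risky functions.

    Unlike a substring scan, mentions of "eval" inside strings or
    docstrings do not trigger a match, because they never appear as
    ast.Call nodes in the parsed tree.
    """
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_NAMES:
            return True
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DANGEROUS_ATTRS
        ):
            return True
    return False

# A docstring mentioning "eval" is no longer a false positive...
print(has_dangerous_calls('def f():\n    "this class never calls eval"\n    return 1'))  # False
# ...while real calls are still caught.
print(has_dangerous_calls("import os\nos.system('ls')"))  # True
print(has_dangerous_calls("eval('1+1')"))                 # True
```

The warning-instead-of-raise variant would simply replace `return True` with a `warnings.warn(...)` plus `return False` at the caller.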
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)

- docs/source/guides/2_save_load.rst (1 hunks)
- examples/llm_qat/export.py (1 hunks)
- examples/llm_sparsity/README.md (1 hunks)
- examples/llm_sparsity/finetune.py (1 hunks)
- modelopt/onnx/quantization/__main__.py (2 hunks)
- modelopt/torch/export/distribute.py (1 hunks)
- modelopt/torch/opt/conversion.py (1 hunks)
- modelopt/torch/opt/plugins/huggingface.py (1 hunks)
- modelopt/torch/opt/plugins/mcore_dist_checkpointing.py (1 hunks)
- modelopt/torch/opt/plugins/megatron.py (1 hunks)
- modelopt/torch/opt/plugins/peft.py (2 hunks)
- modelopt/torch/opt/searcher.py (1 hunks)
- modelopt/torch/quantization/plugins/attention.py (3 hunks)
- modelopt/torch/quantization/plugins/transformers_trainer.py (1 hunks)
- modelopt/torch/utils/distributed.py (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-15T20:46:29.252Z
Learnt from: realAsma
Repo: NVIDIA/TensorRT-Model-Optimizer PR: 318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:170-189
Timestamp: 2025-09-15T20:46:29.252Z
Learning: In modelopt/torch/quantization/plugins/transformers_trainer.py, the restore_from_modelopt_state function can accept modelopt_state["modelopt_state_dict"] directly without needing to wrap it in a full dict structure or include modelopt_version.
Applied to files:
- docs/source/guides/2_save_load.rst
- modelopt/torch/opt/plugins/mcore_dist_checkpointing.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- modelopt/torch/opt/conversion.py
- examples/llm_qat/export.py
- modelopt/torch/opt/plugins/peft.py
- modelopt/torch/opt/plugins/huggingface.py
📚 Learning: 2025-09-16T21:46:46.344Z
Learnt from: realAsma
Repo: NVIDIA/TensorRT-Model-Optimizer PR: 318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:206-212
Timestamp: 2025-09-16T21:46:46.344Z
Learning: In modelopt/torch/quantization/plugins/transformers_trainer.py, the mtq.quantize function calls the forward_loop under a no_grad context, so wrapping the forward_loop in inference_mode or no_grad is not needed.
Applied to files:
modelopt/torch/quantization/plugins/transformers_trainer.py
📚 Learning: 2025-09-16T20:14:34.768Z
Learnt from: realAsma
Repo: NVIDIA/TensorRT-Model-Optimizer PR: 318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:191-191
Timestamp: 2025-09-16T20:14:34.768Z
Learning: The TensorRT-Model-Optimizer project only supports PyTorch >= 2.6, so using the `weights_only` parameter in torch.load calls is acceptable and doesn't require backward compatibility handling.
Applied to files:
- modelopt/torch/quantization/plugins/transformers_trainer.py
- modelopt/torch/opt/conversion.py
📚 Learning: 2025-09-15T16:40:12.799Z
Learnt from: realAsma
Repo: NVIDIA/TensorRT-Model-Optimizer PR: 318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:206-208
Timestamp: 2025-09-15T16:40:12.799Z
Learning: In modelopt/torch/quantization/plugins/transformers_trainer.py, the forward_loop function in _quantize_model should use self.model(**batch) rather than the model parameter passed to forward_loop. The model parameter should not be used for the forward pass.
Applied to files:
modelopt/torch/quantization/plugins/transformers_trainer.py
🧬 Code graph analysis (1)
examples/llm_sparsity/finetune.py (1)
modelopt/torch/utils/logging.py (1)
`print_rank_0` (92-95)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: example-tests-pr (llm_ptq)
- GitHub Check: gpu-tests-pr
🔇 Additional comments (13)
modelopt/torch/opt/searcher.py (1)
252-253: **LGTM! Security note clarifies safe deserialization context.**

The security comment appropriately documents that `weights_only=False` is used for ModelOpt-generated checkpoints, not untrusted input.

modelopt/torch/quantization/plugins/transformers_trainer.py (1)

191-192: **LGTM! Security note clarifies safe deserialization context.**

The security comment appropriately documents that `weights_only=False` is used for ModelOpt-generated state_dict, not untrusted input.

modelopt/torch/utils/distributed.py (1)

90-91: **LGTM! Security note clarifies safe deserialization context.**

The security comment appropriately documents that `weights_only=False` is used for internally-generated buffers within the distributed process group, not untrusted input.

examples/llm_sparsity/finetune.py (1)
235-244: **LGTM! Pickle caching removal improves security posture.**

Removing pickle-based caching eliminates a potential security risk by ensuring data is always loaded fresh from the trusted source. This aligns with the PR's security objectives.
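The risk that motivates this removal is easy to demonstrate: unpickling executes arbitrary callables via `__reduce__`, so a tampered cache file runs code the moment it is loaded. A harmless, self-contained illustration (using `eval` as a stand-in for something like `os.system`; the class name is purely hypothetical):

```python
import pickle

class TamperedCacheEntry:
    """Any object can schedule a callable to run at unpickling time."""

    def __reduce__(self):
        # pickle.loads() will invoke this callable with these arguments.
        # A real attack would substitute os.system or subprocess here.
        return (eval, ("'arbitrary code ran: ' + 'pwned'",))

payload = pickle.dumps(TamperedCacheEntry())  # what a tampered cache file would contain
result = pickle.loads(payload)                # "loading the cache" executes the payload
print(result)  # arbitrary code ran: pwned
```

Always re-tokenizing from the trusted source dataset, as this change does, removes that attack surface entirely.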
modelopt/torch/opt/plugins/megatron.py (1)
79-79: **Note: Pickle replacement is pending per PR objectives.**

This file only updates a comment URL, but the PR description explicitly states that pickle usage replacement in this file is still pending. The file continues to use `pickle.dumps` (line 59) and `pickle.loads` (line 80) with `# nosec` suppressions.

Consider either:
- Completing the pickle replacement in this PR following the TransformerEngine approach (referenced in PR description), or
- Removing this file from the current PR and tracking the pickle replacement separately.
Including incomplete work may cause confusion about the PR's security improvements.
modelopt/torch/opt/plugins/huggingface.py (1)
82-83: **LGTM! Security note clarifies safe deserialization context.**

The security comment appropriately documents that `weights_only=False` is used for ModelOpt-generated state_dict, not untrusted input.

docs/source/guides/2_save_load.rst (1)

132-133: **LGTM! Documentation update educates users on safe deserialization.**

The security comment in the documentation example helps users understand the appropriate context for using `weights_only=False` with ModelOpt-generated state_dicts.

modelopt/torch/opt/conversion.py (1)

529-532: **LGTM! Docstring update educates users on safe deserialization.**

The security comment in the function docstring example helps users understand the appropriate context for using `weights_only=False` with ModelOpt-generated state_dicts.

modelopt/torch/opt/plugins/mcore_dist_checkpointing.py (1)

244-248: **Clarifying security note looks good**

The added comment correctly scopes the `weights_only=False` usage to NVIDIA-generated checkpoints and doesn't affect behavior.

examples/llm_qat/export.py (1)
52-56: **Security note around modelopt_state load is consistent**

The comment clearly documents that `weights_only=False` is only used on ModelOpt-generated state in a trusted path; no code changes are introduced.

Please ensure the minimum supported PyTorch version in this repo accepts the `weights_only` argument on `torch.load`.

modelopt/torch/opt/plugins/peft.py (1)

71-78: **Documented trust boundary for PEFT state loads is appropriate**

Both comments correctly explain that `weights_only=False` is used only for ModelOpt-generated adapter and quantizer state, not arbitrary user files; this aligns with the PR's TAVA guidance.

Please confirm the project's PyTorch version supports the `weights_only` kwarg everywhere it is now passed to `torch.load`.

Also applies to: 86-92
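The trust boundary these comments keep restating — full deserialization only for artifacts the tool itself produced, never for arbitrary user paths — can be made explicit in code. A minimal, torch-free sketch using stdlib `pickle` (the helper name and layout are hypothetical illustrations, not ModelOpt's API):

```python
import pickle
import tempfile
from pathlib import Path

def load_trusted_state(path: str, trusted_root: str):
    """Unpickle a file only if it lives under a directory we control.

    Mirrors the rationale above: weights_only=False-style loading is
    acceptable for self-generated checkpoints in a known workspace,
    never for arbitrary user-supplied paths.
    """
    resolved = Path(path).resolve()
    root = Path(trusted_root).resolve()
    if root != resolved and root not in resolved.parents:
        raise ValueError(f"refusing to deserialize untrusted path: {resolved}")
    with open(resolved, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as workdir:
    # A state file our own tooling wrote into its workspace.
    state_file = Path(workdir) / "modelopt_state.pkl"
    state_file.write_bytes(pickle.dumps({"quantizer": "fp8", "version": 1}))
    print(load_trusted_state(str(state_file), workdir))  # {'quantizer': 'fp8', 'version': 1}
```

Resolving both paths before the containment check also defeats `../` traversal out of the trusted directory.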
modelopt/torch/export/distribute.py (1)
90-96: **NFSWorkspace security comment is clear and non-intrusive**

The note accurately states that `weights_only=False` is used only on internally produced ModelOpt checkpoints in the shared NFS directory; runtime behavior is preserved.

As with other call sites, please ensure supported PyTorch versions accept `weights_only` in `torch.load`.

examples/llm_sparsity/README.md (1)

85-88: **README update aligns with removal of pickle-based caching**

Simplifying the SAT data-prep note to just describe tokenization length avoids mentioning the removed pickle cache behavior and keeps docs accurate.
- modelopt/torch/opt/plugins/megatron.py: Follows same approach as TransformerEngine from https://github.com/NVIDIA/TransformerEngine/blob/3ff0b8d4/transformer_engine/pytorch/module/base.py#L863 (alternatives?)

Summary by CodeRabbit
New Features

- `--trust_calibration_data` CLI flag for secure ONNX quantization with pickle data files.

Improvements

Documentation