[OMNIML-2791] Use nemotron post training dataset for calibration #420
Conversation
          
Walkthrough
Adds a new CLI option, calib_seq, to set the maximum calibration sequence length, and adds support for the Nemotron post-training datasets in examples/llm_ptq.
Sequence Diagram(s)
sequenceDiagram
  autonumber
  participant User
  participant Parser as parser.sh
  participant Example as huggingface_example.sh
  participant PTQ as hf_ptq.py
  participant DS as dataset_utils.py
  User->>Parser: provide options (may include calib_seq)
  Parser-->>Parser: parse, set CALIB_SEQ (default 512)
  Parser->>Example: export CALIB_SEQ
  Example->>PTQ: invoke hf_ptq.py (includes --calib_seq if set)
  PTQ->>PTQ: parse args (calib_seq) and fix calib_size to dataset count
  PTQ->>DS: get_dataset_dataloader(max_sample_length=calib_seq)
  DS-->>PTQ: dataloader (Nemotron datasets available)
  PTQ->>DS: get_max_batch_size(..., max_sample_length=calib_seq)
  DS-->>PTQ: batch size computed (memory thresholds up to 512)
  PTQ->>PTQ: run calibration / sparsification / quantization / export with calib_seq
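
To make the flow above concrete, here is a minimal, self-contained sketch of how the option might be wired up. The stub functions stand in for the dataset_utils.py helpers named in the diagram; their bodies and exact argument lists are illustrative assumptions, not the actual hf_ptq.py code.

```python
import argparse

def get_dataset_dataloader(dataset_name, num_samples, max_sample_length):
    # Stub standing in for dataset_utils.get_dataset_dataloader: the real helper
    # tokenizes calibration samples and truncates each one to max_sample_length.
    samples = [f"calibration sample {i}" for i in range(num_samples)]  # placeholder data
    return [s[:max_sample_length] for s in samples]

def get_max_batch_size(max_sample_length):
    # Stub for dataset_utils.get_max_batch_size: the real helper derives a batch size
    # from available memory, with thresholds defined for sequence lengths up to 512.
    return max(1, 512 // max_sample_length)

parser = argparse.ArgumentParser()
parser.add_argument("--calib_size", type=int, default=512,
                    help="Number of calibration samples to draw from the dataset.")
parser.add_argument("--calib_seq", type=int, default=512,
                    help="Maximum sequence length of each calibration sample.")
args = parser.parse_args([])  # use defaults so the sketch runs standalone

# Per the walkthrough, calib_seq is forwarded as max_sample_length to the data utilities.
dataloader = get_dataset_dataloader(
    dataset_name="nemotron-post-training-dataset-v2",
    num_samples=args.calib_size,
    max_sample_length=args.calib_seq,
)
batch_size = get_max_batch_size(max_sample_length=args.calib_seq)
print(len(dataloader), batch_size)
```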
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
    
Signed-off-by: Chenjie Luo <[email protected]>
Force-pushed from 86bcc39 to dc222fb
Codecov Report
❌ Patch coverage is

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #420   +/-   ##
=======================================
  Coverage   73.36%   73.37%
=======================================
  Files         180      180
  Lines       17919    17934   +15
=======================================
+ Hits        13147    13159   +12
- Misses       4772     4775    +3

☔ View full report in Codecov by Sentry.
    
- Add LoRA mode support for MCore in a new peft submodule: ``modelopt.torch.peft.update_model(model, LORA_CFG)``.
- Support PTQ and fakequant in vLLM for fast evaluation of arbitrary quantization formats. See ``examples/vllm_serve`` for more details.
- Add support for ``nemotron-post-training-dataset-v2`` and ``nemotron-post-training-dataset-v1`` in ``examples/llm_ptq``. Default to ``nemotron-post-training-dataset-v2`` if no dataset is specified.
- Allow specifying ``calib_seq`` in ``examples/llm_ptq`` to set the maximum sequence length for calibration.
general question, what is the difference between calib_size and calib_seq?
calib_size is the number of calibration samples.
calib_seq is the maximum sequence length of each calibration sample.
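
For example (a toy sketch to illustrate the two knobs; the whitespace tokenization below is a placeholder, not what hf_ptq.py actually does):

```python
# Toy illustration: calib_size trims how many samples are used,
# calib_seq trims how long each individual sample may be.
texts = ["a calibration prompt " * 40 for _ in range(1024)]  # placeholder corpus

calib_size = 256   # number of calibration samples
calib_seq = 512    # max tokens kept per sample

calib_texts = texts[:calib_size]
calib_tokens = [t.split()[:calib_seq] for t in calib_texts]

print(len(calib_tokens), max(len(t) for t in calib_tokens))
```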
nit:
@cjluo-nv should we be more verbose and use --calib_seq_len for clarity?
ACK. Personally I prefer shorter flags.
        
          
CHANGELOG.rst (Outdated)
- Add flag ``op_types_to_exclude_fp16`` in ONNX quantization to exclude ops from being converted to FP16/BF16. Alternatively, for custom TensorRT ops, this can also be done by indicating ``'fp32'`` precision in ``trt_plugins_precision``.
- Add LoRA mode support for MCore in a new peft submodule: ``modelopt.torch.peft.update_model(model, LORA_CFG)``.
- Support PTQ and fakequant in vLLM for fast evaluation of arbitrary quantization formats. See ``examples/vllm_serve`` for more details.
- Add support for ``nemotron-post-training-dataset-v2`` and ``nemotron-post-training-dataset-v1`` in ``examples/llm_ptq``. Default to ``nemotron-post-training-dataset-v2`` if no dataset is specified.
Suggested change:
Old: - Add support for ``nemotron-post-training-dataset-v2`` and ``nemotron-post-training-dataset-v1`` in ``examples/llm_ptq``. Default to ``nemotron-post-training-dataset-v2`` if no dataset is specified.
New: - Add support for ``nemotron-post-training-dataset-v2`` and ``nemotron-post-training-dataset-v1`` in ``examples/llm_ptq``. Default changed from ``cnn_dailymail`` to ``nemotron-post-training-dataset-v2`` if no dataset is specified.
Signed-off-by: Chenjie Luo <[email protected]>
What does this PR do?
Type of change: Update default calibration dataset

Overview:
We now plan to use the mixture of cnn_dailymail and nvidia/Nemotron-Post-Training-Dataset-v2 as the default calibration dataset, as it shows overall the same or better model accuracy after PTQ, especially for AWQ tasks. This PR also allows specifying calib_seq in examples/llm_ptq to set the maximum sequence length for calibration.
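
For illustration only, here is a rough sketch of how such a mixture could be assembled with the Hugging Face datasets library; the Nemotron dataset's split and field names below are assumptions, and the example's own dataset utilities handle the actual mixing:

```python
from datasets import load_dataset, interleave_datasets

# Load a small slice of each source; the split names are illustrative assumptions.
cnn = load_dataset("cnn_dailymail", "3.0.0", split="train[:1000]")
nemotron = load_dataset("nvidia/Nemotron-Post-Training-Dataset-v2", split="train[:1000]")

# Normalize both sources to a single "text" column before mixing.
cnn = cnn.map(lambda x: {"text": x["article"]}, remove_columns=cnn.column_names)
nemotron = nemotron.map(lambda x: {"text": str(x)},  # assumption: flatten each record to text
                        remove_columns=nemotron.column_names)

# Interleave the two sources 50/50 to form the mixed calibration set.
mixed = interleave_datasets([cnn, nemotron], probabilities=[0.5, 0.5], seed=0)
print(mixed[0]["text"][:200])
```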
Testing
Benchmarked with GPQA and AIME, comparing cnn_dailymail calibration vs. Nemotron dataset calibration.