Skip to content

Conversation

@sugunav14
Copy link
Contributor

What does this PR do?

Type of change: Bug fix

Overview: Current context manager for FSDP2 aware weight update only works for modules with bias=False. Updated the code to handle modules with bias=True

Usage

# Add a code snippet demonstrating how to use this

Testing

accelerate launch --config_file ./fsdp2.yaml --machine_rank=0 --num_machines=1 --num_processes=4 --main_process_ip=10.126.7.122 --main_process_port=6000 --fsdp_transformer_layer_cls_to_wrap=Qwen2DecoderLayer ./multinode_ptq.py --pyt_ckpt_path Qwen/Qwen2-7B-Instruct --qformat fp8 --kv_cache_qformat fp8 --batch_size 24 --calib_size 64 --export_path B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --trust_remote_code

python /app/tensorrt_llm/examples/llm-api/quickstart_advanced.py --model_dir B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --enable_attention_dp --tp_size 1 --moe_ep_size 1 --kv_cache_fraction 0.6 --disable_kv_cache_reuse --max_batch_size 8 --max_num_tokens 1024 --trust_remote_code

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: N/A
  • Did you add or update any necessary documentation?: N/A
  • Did you update Changelog?: ?

Additional Information

NVBug [5711927]

@sugunav14 sugunav14 requested a review from a team as a code owner December 5, 2025 08:51
@sugunav14 sugunav14 self-assigned this Dec 5, 2025
@codecov
Copy link

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 0% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.46%. Comparing base (c1c5ca0) to head (11d4cf4).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
modelopt/torch/quantization/utils.py 0.00% 22 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   74.66%   74.46%   -0.20%     
==========================================
  Files         183      183              
  Lines       18550    18409     -141     
==========================================
- Hits        13851    13709     -142     
- Misses       4699     4700       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sugunav14 sugunav14 requested review from a team as code owners December 5, 2025 20:02
@sugunav14 sugunav14 requested a review from realAsma December 5, 2025 23:28
@cjluo-nv
Copy link
Collaborator

cjluo-nv commented Dec 6, 2025

@sugunav14 could you share how bias is handled in the PR?

@sugunav14
Copy link
Contributor Author

@sugunav14 could you share how bias is handled in the PR?

Previously, I would iterate through modules to be updated and just register a new parameter for weight. Now, I iterate through the parameters of the modules to be updated (which could be just weight, or weight and bias)

Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
@sugunav14 sugunav14 force-pushed the svelury/nvbug-5711927 branch from fc0d6e8 to aa5f7df Compare December 7, 2025 21:41
Signed-off-by: Suguna Velury <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants