[Bug Fix]: NVBug 5711927 #651
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   74.66%   74.46%   -0.20%
==========================================
  Files         183      183
  Lines       18550    18409     -141
==========================================
- Hits        13851    13709     -142
- Misses       4699     4700       +1
```

View full report in Codecov by Sentry.
@sugunav14 could you share how bias is handled in the PR?
Previously, I would iterate through the modules to be updated and just register a new parameter for weight. Now, I iterate through the parameters of each module to be updated (which could be just weight, or weight and bias), as sketched below.
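A minimal sketch of that pattern, assuming plain (unsharded) tensors for brevity; in the PR the parameters are FSDP2-sharded DTensors, and the function name `reregister_params` here is illustrative rather than the actual code:

```python
import torch
import torch.nn as nn

def reregister_params(module: nn.Module, new_values: dict) -> None:
    # Iterate over *all* of the module's own parameters instead of assuming
    # only `weight` exists, so layers built with bias=True are updated too.
    for name, _ in list(module.named_parameters(recurse=False)):
        if name in new_values:
            module.register_parameter(name, nn.Parameter(new_values[name]))

# Example: update both weight and bias of a Linear layer.
layer = nn.Linear(4, 4, bias=True)
reregister_params(layer, {"weight": torch.zeros(4, 4), "bias": torch.zeros(4)})
assert torch.equal(layer.bias, torch.zeros(4))
```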
Signed-off-by: Suguna Velury <[email protected]>
What does this PR do?
Type of change: Bug fix
Overview: The current context manager for FSDP2-aware weight updates only works for modules with bias=False. This PR updates the code to also handle modules with bias=True.
Usage
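A hedged illustration of how the fixed context manager might be invoked; `fsdp2_aware_weight_update`, its import path, and `requantize_weights` are assumed names for this sketch, not confirmed API (see the PR diff for the exact entry point):

```python
# All names below are hypothetical placeholders for the real API in this repo.
from modelopt.torch.quantization.utils import fsdp2_aware_weight_update  # hypothetical path

# Wrap the weight (and now bias) mutation of FSDP2-sharded modules in the
# context manager so the updated parameters are re-registered correctly.
with fsdp2_aware_weight_update(model, modules_to_update):
    for module in modules_to_update:
        requantize_weights(module)  # hypothetical helper that rewrites weight/bias
```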
Testing
```
accelerate launch --config_file ./fsdp2.yaml --machine_rank=0 --num_machines=1 --num_processes=4 --main_process_ip=10.126.7.122 --main_process_port=6000 --fsdp_transformer_layer_cls_to_wrap=Qwen2DecoderLayer ./multinode_ptq.py --pyt_ckpt_path Qwen/Qwen2-7B-Instruct --qformat fp8 --kv_cache_qformat fp8 --batch_size 24 --calib_size 64 --export_path B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --trust_remote_code

python /app/tensorrt_llm/examples/llm-api/quickstart_advanced.py --model_dir B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --enable_attention_dp --tp_size 1 --moe_ep_size 1 --kv_cache_fraction 0.6 --disable_kv_cache_reuse --max_batch_size 8 --max_num_tokens 1024 --trust_remote_code
```
Additional Information
NVBug [5711927]