Fix the fsdp cmd in llm_sparsity #417
Conversation
Walkthrough
Removed single quotes around two FSDP-related CLI arguments in examples/llm_sparsity/launch_finetune.sh.
Sequence Diagram(s): N/A
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/llm_sparsity/launch_finetune.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: linux
- GitHub Check: build-docs
- GitHub Check: code-quality
    --fsdp full_shard auto_wrap \
    --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer \
Keep the quotes on multi-word argument values.
Both --fsdp and --fsdp_transformer_layer_cls_to_wrap feed directly into finetune.py, whose argparse definitions (via transformers.TrainingArguments) expect a single string per flag. With the quotes removed, the shell now tokenizes auto_wrap as a separate argument, so the script sees --fsdp full_shard auto_wrap and errors with “unrecognized arguments: auto_wrap”. Same risk applies to the transformer layer flag. Please restore the quoting (or otherwise rejoin the values into a single token) to keep the command working.
🤖 Prompt for AI Agents
In examples/llm_sparsity/launch_finetune.sh around lines 94 to 95, the
multi-word values for --fsdp and --fsdp_transformer_layer_cls_to_wrap were split
into separate shell tokens; restore quoting (e.g., "--fsdp 'full_shard
auto_wrap'" and "--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'") or
otherwise join each multi-word value into a single token so argparse receives
one string per flag.
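To make the breakage concrete, here is a small standalone sketch (not part of the repository; the show_args helper is hypothetical) that prints the argument vector the Python side would receive in each case:

```bash
#!/usr/bin/env bash
# show_args stands in for finetune.py: it simply echoes what would land in sys.argv.
show_args() {
    for a in "$@"; do
        printf '[%s]\n' "$a"
    done
}

# Unquoted: the shell splits the value, so auto_wrap arrives as a stray token
# that argparse would report as unrecognized.
show_args --fsdp full_shard auto_wrap
# [--fsdp]
# [full_shard]
# [auto_wrap]

# Quoted: each flag receives exactly one string, which is what
# transformers.TrainingArguments expects.
show_args --fsdp 'full_shard auto_wrap' \
          --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'
# [--fsdp]
# [full_shard auto_wrap]
# [--fsdp_transformer_layer_cls_to_wrap]
# [LlamaDecoderLayer]
```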
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #417      +/-   ##
==========================================
- Coverage   73.36%   73.36%   -0.01%
==========================================
  Files         180      180
  Lines       17919    17919
==========================================
- Hits        13147    13146       -1
- Misses       4772     4773       +1

☔ View full report in Codecov by Sentry.
@kaix-nv seems like your commits are not verified
9ab1652 to eb70a56
Signed-off-by: Kai Xu <[email protected]>
eb70a56 to 56232ad
What does this PR do?
Type of change: Bug fix
Overview: Fix the FSDP command in llm_sparsity. See Slack.
Usage
python hf_pts.py --model_name_or_path /home/scratch.omniml_data_1/models/llama3.1/Meta-Llama-3.1-8B \
    --device cuda \
    --model_max_length 1024 \
    --dtype fp16 \
    --sparsity_fmt sparsegpt \
    --calib_size 128 \
    --output_dir saved_models_Llama-2-7b-hf_sparsegpt_tp1_pp1/pts

bash launch_finetune.sh --model /home/scratch.omniml_data_1/models/llama3.1/Meta-Llama-3.1-8B \
    --max_length 1024 \
    --num_epochs 3 \
    --restore_path saved_models_Llama-2-7b-hf_sparsegpt_tp1_pp1/pts/pts_modelopt_state.pth \
    --output_dir saved_models_Llama-2-7b-hf_sparsegpt_tp1_pp1/finetuned

Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit