4 changes: 2 additions & 2 deletions examples/llm_sparsity/launch_finetune.sh
@@ -91,8 +91,8 @@ CMD="accelerate launch --multi_gpu --mixed_precision bf16 finetune.py \
     --warmup_ratio 0.0 \
     --lr_scheduler_type cosine \
     --logging_steps 1 \
-    --fsdp 'full_shard auto_wrap' \
-    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
+    --fsdp full_shard auto_wrap \
+    --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer \
Comment on lines +94 to +95
⚠️ Potential issue | 🔴 Critical

Keep the quotes on multi-word argument values.

Both --fsdp and --fsdp_transformer_layer_cls_to_wrap feed directly into finetune.py, whose argparse definitions (via transformers.TrainingArguments) expect a single string per flag. With the quotes removed, the shell now tokenizes auto_wrap as a separate argument, so the script sees --fsdp full_shard auto_wrap and errors with “unrecognized arguments: auto_wrap”. Same risk applies to the transformer layer flag. Please restore the quoting (or otherwise rejoin the values into a single token) to keep the command working.
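To make the failure concrete, here is a minimal argparse sketch in Python (a simplification, not the actual finetune.py code: there the flag is generated via transformers.HfArgumentParser from TrainingArguments, but the one-string-per-flag expectation is the same):

import argparse

parser = argparse.ArgumentParser()
# TrainingArguments-style flag that expects exactly one string value.
parser.add_argument("--fsdp", type=str, default="")

# Quoted on the shell side: one token reaches argparse and parsing succeeds.
ok = parser.parse_args(["--fsdp", "full_shard auto_wrap"])
print(ok.fsdp)  # -> full_shard auto_wrap

# Unquoted: the shell hands argparse two tokens, and it exits with
# "error: unrecognized arguments: auto_wrap".
parser.parse_args(["--fsdp", "full_shard", "auto_wrap"])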

🤖 Prompt for AI Agents
In examples/llm_sparsity/launch_finetune.sh around lines 94 to 95, the multi-word values for --fsdp and --fsdp_transformer_layer_cls_to_wrap were split into separate shell tokens; restore quoting (e.g., "--fsdp 'full_shard auto_wrap'" and "--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'") or otherwise join each multi-word value into a single token so argparse receives one string per flag.

     --tf32 True \
     --modelopt_restore_path $MODELOPT_RESTORE_PATH \
     --report_to tensorboard \
1 change: 1 addition & 0 deletions examples/llm_sparsity/requirements.txt
@@ -1,3 +1,4 @@
 flash-attn
 sentencepiece>=0.2.0
 tensorboardX
+transformers>=4.57.0