【Bug】Fix attn_mask_startend_row_indices shape mismatch #2564
base: develop
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,18 +20,39 @@ | |
wget https://bj.bcebos.com/paddlenlp/datasets/examples/alpaca_demo.gz | ||
tar -xvf alpaca_demo.gz | ||
``` | ||
### Model Download | ||
```bash | ||
# PaddleNLP/Qwen2-0.5B-Instruct | ||
aistudio download --model PaddleNLP/Qwen2-0.5B-Instruct --local_dir PaddleNLP/Qwen2-0.5B-Instruct | ||
|
||
# baidu/ERNIE-4.5-0.3B-PT | ||
aistudio download --model PaddlePaddle/ERNIE-4.5-0.3B-PT --local_dir baidu/ERNIE-4.5-0.3B-PT | ||
|
||
# baidu/ERNIE-4.5-21B-A3B-PT | ||
aistudio download --model PaddlePaddle/ERNIE-4.5-21B-A3B-PT --local_dir baidu/ERNIE-4.5-21B-A3B-PT | ||
``` | ||
|
||
### Full-Parameter Fine-Tuning: SFT | ||
|
||
Single GPU | ||
```bash | ||
# Requires about 12 GB of GPU memory | ||
# Fine-tuning Qwen2-0.5B-Instruct requires about 12 GB of GPU memory | ||
python -u run_finetune.py ./config/qwen/sft_argument_qwen2_0p5b.json | ||
|
||
# Fine-tune ERNIE-4.5-0.3B-PT | ||
Review comment: No need to add ERNIE 4.5 here for now; first make sure training works in ERNIEKit. |
Review comment: The documentation can be left unchanged. |
||
python -u run_finetune.py ./config/ernie4_5/sft_argument_ernie4_5_0p3b.json | ||
``` | ||
|
||
Multi-GPU | ||
```bash | ||
# SFT Qwen2-0.5B-Instruct | ||
python -u -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" run_finetune.py ./config/qwen/sft_argument_qwen2_0p5b.json | ||
|
||
# SFT ERNIE-4.5-0.3B-PT | ||
python -u -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" run_finetune.py ./config/ernie4_5/sft_argument_ernie4_5_0p3b.json | ||
|
||
# SFT ERNIE-4.5-21B-A3B-PT | ||
python -u -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" run_finetune.py ./config/ernie4_5_moe/sft_argument_ernie4_5_21b_a3b.json | ||
``` | ||
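The single-GPU and multi-GPU commands above differ only in the `-m paddle.distributed.launch --devices ...` prefix. As a rough illustration, the sketch below assembles either form from a config path and a device list. The helper name `build_launch_command` is hypothetical and not part of the repository; it only uses the flags that appear in the commands above.

```python
import shlex
from typing import List, Sequence


def build_launch_command(config_path: str, devices: Sequence[int] = ()) -> List[str]:
    """Build the argv for run_finetune.py (hypothetical helper, not in the repo).

    An empty device list yields the single-process form; otherwise the
    paddle.distributed.launch form shown in the README diff is produced.
    """
    if not devices:
        return ["python", "-u", "run_finetune.py", config_path]
    # --devices takes a comma-separated list of device ids, as in the README.
    device_arg = ",".join(str(d) for d in devices)
    return [
        "python", "-u", "-m", "paddle.distributed.launch",
        "--devices", device_arg,
        "run_finetune.py", config_path,
    ]


if __name__ == "__main__":
    cmd = build_launch_command("./config/qwen/sft_argument_qwen2_0p5b.json", range(8))
    # Print a shell-safe command line; running it is left to the caller.
    print(shlex.join(cmd))
```

Returning an argument list rather than a string keeps the result usable with `subprocess.run` without shell quoting concerns.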
|
||
### LoRA | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1157,7 +1157,8 @@ def forward( | |
# Pretrain & Eval must have labels | ||
assert labels is not None | ||
|
||
return self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits) | ||
loss, _ = self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits) | ||
Review comment: Why is this change needed? |
||
return loss, logits | ||
|
||
|
||
class Ernie4_5_MoeForCausalLMPipe(GeneralModelForCausalLMPipe): | ||
|
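To make the effect of the `forward` change concrete, here is a minimal pure-Python sketch of the two return conventions. It assumes the criterion returns a two-element tuple whose first element is the loss; the diff does not show what the second element is, so `toy_criterion` (a hypothetical stand-in, not the repository's criterion) simply returns the mask sum there.

```python
import math
from typing import Optional, Sequence, Tuple


def toy_criterion(
    logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    loss_mask: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Hypothetical stand-in for self.criterion: returns (masked mean loss, mask sum)."""
    if loss_mask is None:
        loss_mask = [1.0] * len(labels)
    total = 0.0
    for row, label, mask in zip(logits, labels, loss_mask):
        # Numerically stable negative log-likelihood of the target class.
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += mask * (log_z - row[label])
    mask_sum = sum(loss_mask)
    return total / max(mask_sum, 1.0), mask_sum


def forward_old(logits, labels, loss_mask=None):
    # Before the change: the criterion's tuple is returned unchanged.
    return toy_criterion(logits, labels, loss_mask)


def forward_new(logits, labels, loss_mask=None):
    # After the change: keep only the loss and pair it with the logits.
    loss, _ = toy_criterion(logits, labels, loss_mask)
    return loss, logits
```

Under this assumption, a caller that unpacks `loss, logits = model(...)` receives the logits from the new version, whereas the old version would hand it the criterion's second output instead.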
Review comment: No need to add download documentation; going forward, models can be downloaded directly via from_pretrained.