Commit 2350af8

add_ppminilm_demo (PaddlePaddle#1082)
1 parent 84eb98d commit 2350af8

File tree: 10 files changed, +320 −29 lines


demo/auto_compression/nlp/README.md

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
# Automatic Compression Example for NLP Models

This example shows how to apply automatic compression to an inference deployment model from PaddleNLP.

## Benchmark

- PP-MiniLM model

PP-MiniLM is a 6-layer pre-trained small Chinese model. After importing PP-MiniLM with ``from_pretrained`` in PaddleNLP, you can fine-tune it on your own dataset; for details, see the [PP-MiniLM documentation](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/model_compression/pp-minilm#PP-MiniLM%E4%B8%AD%E6%96%87%E5%B0%8F%E6%A8%A1%E5%9E%8B).

This automatic compression experiment first prunes 25% of the model's attention heads while training with distillation, and then applies post-training quantization.

| Model | Strategy | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL | AVG |
|:------:|:------:|:------:|:------:|:------:|:------:|:-----------:|:------:|:------:|:------:|
| PP-MiniLM | Base model | 74.03 | 56.66 | 60.21 | 80.98 | 76.20 | 84.21 | 77.36 | 72.81 |
| PP-MiniLM | Prune + distill + post-training quantization | 73.56 | 56.38 | 59.87 | 80.80 | 76.44 | 82.23 | 77.77 | 72.44 |

Performance was measured in the following environment:
- Hardware: a single NVIDIA Tesla T4 GPU
- Software: CUDA 11.0, cuDNN 8.0, TensorRT 8.0
- Test configuration: batch_size: 40, max_seq_len: 128
## Environment Setup

### 1. Prepare the data

By default, this example runs the automatic compression experiment on the CLUE datasets. If your dataset is not in CLUE format, modify the dataset field in the launch script run.sh; PaddleNLP will download the corresponding dataset automatically.

### 2. Prepare the compression environment

- python >= 3.6
- paddlepaddle >= 2.3
- PaddleNLP >= 2.3

Install paddlepaddle:
```shell
# CPU
pip install paddlepaddle
# GPU
pip install paddlepaddle-gpu
```

Install paddlenlp:
```shell
pip install paddlenlp
```

Install paddleslim:
```shell
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```

Note: PaddleNLP is installed in order to download its datasets and Tokenizer.

### 3. Prepare the deployment model to be compressed

If the deployable model.pdmodel and model.pdiparams files are already prepared, skip this step.
Export an inference model following the [PaddleNLP documentation](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples). For this example, follow [PaddleNLP PP-MiniLM](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/model_compression/pp-minilm) to fine-tune and save the model with the highest accuracy on each dataset, or directly download the following already fine-tuned inference models: [afqmc](https://bj.bcebos.com/v1/paddle-slim-models/act/afqmc.tar), [tnews](https://bj.bcebos.com/v1/paddle-slim-models/act/tnews.tar), [iflytek](https://bj.bcebos.com/v1/paddle-slim-models/act/iflytek.tar), [ocnli](https://bj.bcebos.com/v1/paddle-slim-models/act/ocnli.tar), [cmnli](https://bj.bcebos.com/v1/paddle-slim-models/act/cmnli.tar), [cluewsc2020](https://bj.bcebos.com/v1/paddle-slim-models/act/cluewsc.tar), [csl](https://bj.bcebos.com/v1/paddle-slim-models/act/csl.tar)
```shell
wget https://bj.bcebos.com/v1/paddle-slim-models/act/afqmc.tar
tar -zxvf afqmc.tar
```
## Start Automatic Compression

### Compression configuration

Automatic compression requires a config file, passed through the ``config_path`` argument; the configs folder contains the configuration files for the different tasks. The example below uses the afqmc dataset. The training parameters must be configured manually. The distillation, pruning, and post-training quantization settings can be derived automatically by the auto-compression strategy, or configured manually as well. Automatic compression experiments on PaddleNLP models use the pruning, distillation, and post-training quantization strategies by default.

- Training parameters

The training parameters mainly set the learning rate, number of training epochs, and optimizer. ``origin_metric`` is the accuracy of the original model; if it is set, the model's accuracy is verified before compression starts.

```yaml
TrainConfig:
  epochs: 6
  eval_iter: 1070
  learning_rate: 2.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.7403
```
The default distillation, pruning, and post-training quantization configurations are shown below.

- Distillation parameters

The distillation parameters include the path to the teacher model (the fine-tuned but unpruned model). The auto-compression strategy automatically finds the teacher-network nodes and the corresponding student-network nodes to distill between, so no manual node setup is needed.

```yaml
Distillation:
  teacher_model_dir: ./afqmc/
  teacher_model_filename: inference.pdmodel
  teacher_params_filename: inference.pdiparams
```
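For intuition, the distillation between a matched pair of teacher and student nodes can be pictured as a mean-squared-error loss on their outputs. The sketch below is a plain-Python illustration of that idea only; in PaddleSlim the node matching and the loss are built into the training graph automatically, and the function name here is hypothetical.

```python
# Illustrative feature-distillation loss between a matched pair of
# teacher/student outputs (flattened to plain lists of floats).
def mse_distill_loss(teacher_out, student_out):
    """Mean-squared error between teacher and student outputs."""
    assert len(teacher_out) == len(student_out)
    n = len(teacher_out)
    return sum((t - s) ** 2 for t, s in zip(teacher_out, student_out)) / n

# Toy example: a student close to its teacher yields a small loss.
teacher = [0.2, 0.8, -0.1, 0.5]
student = [0.1, 0.9, 0.0, 0.5]
print(mse_distill_loss(teacher, student))  # 0.0075
```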
- Pruning parameters

The pruning parameters specify the pruning algorithm and the pruning ratio.

```yaml
Prune:
  prune_algo: transformer_pruner
  pruned_ratio: 0.25
```
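With ``pruned_ratio: 0.25``, a quarter of the attention heads are removed. Conceptually, each head receives an importance score and the least important 25% are dropped. The sketch below uses made-up scores and a hypothetical helper purely for illustration; the actual scoring criterion is internal to ``transformer_pruner``.

```python
# Illustrative head selection for pruned_ratio = 0.25: given per-head
# importance scores, keep the most important 75% of heads.
def select_heads(importance, pruned_ratio):
    """Return the indices of heads to keep, in ascending head order."""
    n_keep = len(importance) - int(len(importance) * pruned_ratio)
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:n_keep])

# 8 heads with hypothetical importance scores; 25% pruning drops 2 heads.
scores = [0.9, 0.1, 0.7, 0.3, 0.8, 0.2, 0.6, 0.5]
print(select_heads(scores, 0.25))  # [0, 2, 3, 4, 6, 7]
```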
- Hyperparameter optimization parameters

These parameters define the search space for post-training quantization: the number of calibration batches, whether to apply bias correction, the histogram percentiles, the PTQ algorithms to try, and the weight quantization type.

```yaml
HyperParameterOptimization:
  batch_num:
  - 4
  - 16
  bias_correct:
  - true
  hist_percent:
  - 0.999
  - 0.99999
  max_quant_count: 20
  ptq_algo:
  - KL
  - hist
  weight_quantize_type:
  - channel_wise_abs_max
```
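The ``hist`` PTQ algorithm with ``hist_percent: 0.999`` picks the activation clipping threshold at the 99.9th percentile of observed absolute values, so rare outliers do not stretch the quantization range. A minimal plain-Python sketch of that idea (not PaddleSlim's histogram-based implementation, and the function name is hypothetical):

```python
# Sketch of percentile-based activation clipping as used by the `hist`
# PTQ algorithm: the threshold is the hist_percent quantile of |x|,
# so extreme outliers do not widen the int8 range.
def hist_threshold(activations, hist_percent):
    """Return the clipping threshold at the given quantile of |x|."""
    mags = sorted(abs(x) for x in activations)
    idx = min(int(hist_percent * len(mags)), len(mags) - 1)
    return mags[idx]

# 1000 well-behaved samples plus one extreme outlier at 50.0.
samples = [i / 1000 for i in range(1000)] + [50.0]
print(hist_threshold(samples, 0.999))  # 0.999 -- the outlier is ignored
```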
- Quantization parameters

The quantization parameters mainly set the bit width and the operator types to quantize; the quantized ops cover convolution layers (conv2d, depthwise_conv2d) and fully connected layers (mul, matmul_v2).

```yaml
Quantization:
  activation_bits: 8
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  - mul
  - matmul_v2
  weight_bits: 8
```
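The ``channel_wise_abs_max`` weight quantizer (selected by ``weight_quantize_type`` above) computes one scale per output channel from that channel's maximum absolute weight and rounds the weights to 8-bit integers. A simplified plain-Python illustration, with toy weights and a hypothetical function name:

```python
# Sketch of channel_wise_abs_max int8 weight quantization: one scale per
# output channel, scale = max(|w|) / 127, weights rounded to integers.
def quantize_channel(weights, bits=8):
    """Quantize one channel's weights; returns (int values, scale)."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

channel = [0.5, -1.27, 0.64]              # toy per-channel weights
q, scale = quantize_channel(channel)
print(q)                                  # [50, -127, 64]
print([round(v * scale, 4) for v in q])   # dequantized, close to the originals
```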
### Run pruning, distillation, and post-training quantization

The automatic compression example is launched through the run.py script, which uses the ``paddleslim.auto_compression.AutoCompression`` interface to compress the model. Pass in the task name, model type, dataset name, and compression parameters, and the model is pruned, distillation-trained, and quantized post-training. The dataset is CLUE, and each task name corresponds to a different CLUE task; the available task names are: afqmc, tnews, iflytek, ocnli, cmnli, cluewsc2020, csl. The command is:
```shell
python run.py \
    --model_type='ppminilm' \
    --model_dir='./afqmc/' \
    --model_filename='inference.pdmodel' \
    --params_filename='inference.pdiparams' \
    --dataset='clue' \
    --save_dir='./save_afqmc_pruned/' \
    --batch_size=16 \
    --max_seq_length=128 \
    --task_name='afqmc' \
    --config_path='./configs/afqmc.yaml'
```
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 6
  eval_iter: 1070
  learning_rate: 2.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.7403
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 100
  eval_iter: 70
  learning_rate: 1.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.8421
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 6
  eval_iter: 2000
  learning_rate: 3.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.8098
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 16
  eval_iter: 1000
  learning_rate: 1.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.7736
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 12
  eval_iter: 750
  learning_rate: 2.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.6021
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 20
  eval_iter: 1050
  learning_rate: 3.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.7620
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
TrainConfig:
  epochs: 6
  eval_iter: 1110
  learning_rate: 2.0e-5
  optim_args:
    weight_decay: 0.01
  optimizer: AdamW
  origin_metric: 0.5666
