Commit dcaa046

add demo for auto-compress (PaddlePaddle#1078)

1 parent 95665af commit dcaa046
File tree: 4 files changed, +471 -0 lines

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
# Auto-Compression Example Using an Inference Model

This example shows how to run auto-compression training on an inference model from PaddleSeg.

Taking the [PP-HumanSeg-Lite](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/PP-HumanSeg#portrait-segmentation) model as an example, we used the auto-compression interface to run distillation + sparsity training and distillation + quantization training experiments, and measured the single-thread speedup on an SD710. The compression and latency results are as follows:
| Compression strategy | Total IoU | Latency (ms)<br>thread=1 | Speedup |
|:-----:|:----------:|:---------:|:------:|
| Baseline | 0.9287 | 56.363 | - |
| Unstructured sparsity | 0.9235 | 37.712 | 49.456% |
| Quantization | 0.9284 | 49.656 | 13.506% |

## Auto-Compression Training Workflow

### 1. Prepare the dataset

See the [PaddleSeg data preparation documentation](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/data/marker/marker_cn.md).
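For orientation, the sketch below builds a portrait-segmentation training set with PaddleSeg's generic ``Dataset`` class, following the file-list layout described in that document. The dataset root, list-file name, and transform settings here are placeholder assumptions for illustration, not values fixed by this demo.
```python
# A minimal sketch, assuming a PaddleSeg file-list style dataset layout.
# 'data/portrait_dataset' and 'train.txt' are placeholder names.
import paddleseg.transforms as T
from paddleseg.datasets import Dataset

train_transforms = [
    T.Resize(target_size=(398, 224)),  # match the model input size
    T.Normalize()
]
train_dataset = Dataset(
    transforms=train_transforms,
    dataset_root='data/portrait_dataset',          # placeholder path
    num_classes=2,                                 # foreground / background
    mode='train',
    train_path='data/portrait_dataset/train.txt')  # placeholder list file
```
The demo's run.py is then expected to batch this dataset (for example with ``paddle.io.DataLoader``) before wrapping it with the ``reader_wrapper`` shown in section 3.1.2.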
### 2. Prepare the model to compress

PaddleSeg is an end-to-end image segmentation toolkit built on PaddlePaddle, covering a large number of high-quality segmentation models ranging from high-accuracy to lightweight.
Install PaddleSeg with:
```
pip install paddleseg
```
See the [installation documentation](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/install_cn.md) for PaddleSeg's environment requirements.

#### 2.1 Download the code
```
git clone https://github.com/PaddlePaddle/PaddleSeg.git
```
#### 2.2 Prepare the pretrained model

Run the following commands in the PaddleSeg directory to download the pretrained model.
``` shell
wget https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224.tar.gz
tar -xzf ppseg_lite_portrait_398x224.tar.gz
```
#### 2.3 Export the inference model

Run the following command in the PaddleSeg directory; the inference model will be saved in the inference_model directory.
```shell
# make one GPU visible
export CUDA_VISIBLE_DEVICES=0
# on Windows, run the following instead
# set CUDA_VISIBLE_DEVICES=0
python export.py \
    --config configs/pp_humanseg_lite/pp_humanseg_lite_export_398x224.yml \
    --model_path ppseg_lite_portrait_398x224/model.pdparams \
    --save_dir inference_model \
    --with_softmax
```
Or download the exported PP-HumanSeg-Lite inference model directly:
```shell
wget https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224_with_softmax.tar.gz
tar -xzf ppseg_lite_portrait_398x224_with_softmax.tar.gz
```

### 3. Multi-strategy fused compression

Each subsection below covers one multi-strategy fused compression scheme.

### 3.1 Distillation + sparsity compression
Auto-compression training requires a config file, a dataset dataloader, and an evaluation function (``eval_function``).
#### 3.1.1 Configure the config file

To train with auto-compression combining distillation and unstructured sparsity, first write a config file containing three groups of parameters: distillation, sparsity, and training.

- Distillation parameters

The distillation parameters mainly set the distillation nodes (``distill_node_pair``) and the path of the teacher network's inference model. Each distillation pair must contain a teacher-network node and the corresponding student-network node, where the teacher node name is the student node name with a "teacher_" prefix added automatically by the program, as shown below. (A sketch for discovering candidate node names follows at the end of this parameter list.)
```yaml
Distillation:
  distill_lambda: 1.0
  distill_loss: l2_loss
  distill_node_pair:
  - teacher_relu_30.tmp_0
  - relu_30.tmp_0
  merge_feed: true
  teacher_model_dir: ./inference_model
  teacher_model_filename: model.pdmodel
  teacher_params_filename: model.pdiparams
```
- Sparsity parameters

The sparsity parameters are set as follows; under the GMP strategy, the sparsity ratio is raised gradually from ``initial_ratio`` (0.15) to ``pruned_ratio`` (0.75) during the pruning stage. The meaning of each parameter is described in the [unstructured-sparsity API documentation](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/pruners/unstructured_pruner.rst).
```yaml
UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  pruned_ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True
```
- Training parameters

The training parameters mainly set the learning rate, the number of training epochs, and the optimizer.
```yaml
TrainConfig:
  epochs: 14
  eval_iter: 400
  learning_rate: 5.0e-03
  optimizer: SGD
  optim_args:
    weight_decay: 0.0005
```
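The node names used in ``distill_node_pair`` are tensor names inside the saved inference program. If you are unsure which names exist, one way to list candidates (a side sketch, not part of the demo scripts) is to load the inference model in static-graph mode and print the op outputs; the model prefix below assumes the exported files from section 2.3.
```python
# Sketch: print intermediate tensor names usable as distillation nodes.
import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())
program, feed_names, fetch_targets = paddle.static.load_inference_model(
    'inference_model/inference',  # prefix of the .pdmodel/.pdiparams pair
    exe)
for op in program.global_block().ops:
    for name in op.output_arg_names:
        if '.tmp_' in name:  # intermediate activations, e.g. relu_30.tmp_0
            print(op.type, name)
```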
#### 3.1.2 Prepare the dataloader and evaluation function
Once the dataset is ready, the training data must be wrapped into dict form before being passed to the auto-compression interface; the function below can serve as a reference. The evaluation function is used to measure model accuracy and must be implemented in static-graph mode; a sketch follows the note below.
```python
import numpy as np

def reader_wrapper(reader):
    def gen():
        for i, data in enumerate(reader()):
            imgs = np.array(data[0])
            yield {"x": imgs}
    return gen
```
> Note: the dict keys must match the input names used when the inference model was saved.

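For the evaluation function itself, the sketch below shows the expected static-graph shape: it receives the executor, the compiled test program, and the feed/fetch targets, runs inference over an evaluation reader, and returns a single accuracy number. The signature follows the convention of PaddleSlim's auto-compression demos, and ``eval_reader`` plus the simplified foreground-IoU computation are assumptions for illustration.
```python
import numpy as np

# Sketch of a static-graph eval_function; `eval_reader` is assumed to
# yield (image, label) pairs matching the model input and ground truth.
def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
    intersect, union = 0.0, 0.0
    for image, label in eval_reader():
        pred = exe.run(compiled_test_program,
                       feed={test_feed_names[0]: np.array(image)},
                       fetch_list=test_fetch_list)[0]
        pred = np.argmax(pred, axis=1)  # per-pixel class index
        fg_pred = (pred == 1)
        fg_label = (np.array(label) == 1)
        intersect += np.sum(fg_pred & fg_label)
        union += np.sum(fg_pred | fg_label)
    return intersect / union  # simplified foreground IoU; higher is better
```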
#### 3.1.3 Start training

Pass the training dataloader and the evaluation function to the ``paddleslim.auto_compression.AutoCompression`` interface to run unstructured-sparsity training on the model. Run:
```shell
python run.py \
    --model_dir='inference_model' \
    --model_filename='inference.pdmodel' \
    --params_filename='./inference.pdiparams' \
    --save_dir='./save_model' \
    --config_path='configs/humanseg_sparse_dis.yaml'
```

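Under the hood, run.py wires these pieces into the interface named above. A minimal sketch of that call is shown below; the exact keyword names have varied across PaddleSlim versions (earlier releases, for example, split the config into ``strategy_config`` and ``train_config``), so treat this as illustrative and refer to run.py in this demo for the authoritative usage.
```python
# Illustrative sketch only: wiring the pieces into AutoCompression.
# `train_reader` is the batched dataloader built from the training set.
from paddleslim.auto_compression import AutoCompression

ac = AutoCompression(
    model_dir='inference_model',
    model_filename='inference.pdmodel',
    params_filename='inference.pdiparams',
    save_dir='./save_model',
    config='configs/humanseg_sparse_dis.yaml',      # distill + sparse config
    train_dataloader=reader_wrapper(train_reader),  # dict-yielding wrapper
    eval_callback=eval_function)                    # static-graph evaluation
ac.compress()
```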
### 3.2 Distillation + quantization compression
#### 3.2.1 Configure the config file
To run quantization training with auto-compression, first write a config file containing three groups of parameters: distillation, quantization, and training. The distillation and training parameters are similar to those used for sparsity training, so this section focuses on the quantization parameters.
- Quantization parameters

Quantization parameters mainly set the quantization bit width and the op types to quantize; the quantizable ops include convolution layers (conv2d, depthwise_conv2d) and fully connected layers (matmul). The example below quantizes only the convolution layers:
```yaml
Quantization:
  activation_bits: 8
  weight_bits: 8
  is_full_quantize: false
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
```
#### 3.2.2 Start training
Pass the dataloader and the evaluation function (``eval_function``) to the ``paddleslim.auto_compression.AutoCompression`` interface to run quantization training on the model. Run:
```shell
python run.py \
    --model_dir='inference_model' \
    --model_filename='inference.pdmodel' \
    --params_filename='./inference.pdiparams' \
    --save_dir='./save_model' \
    --config_path='configs/humanseg_quant_dis.yaml'
```
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
Distillation:
  distill_lambda: 1.0
  distill_loss: l2_loss
  distill_node_pair:
  - teacher_reshape2_1.tmp_0 #
  - reshape2_1.tmp_0
  - teacher_reshape2_3.tmp_0 #
  - reshape2_3.tmp_0
  - teacher_reshape2_5.tmp_0 #
  - reshape2_5.tmp_0
  - teacher_reshape2_7.tmp_0 #block1
  - reshape2_7.tmp_0
  - teacher_reshape2_9.tmp_0 #
  - reshape2_9.tmp_0
  - teacher_reshape2_11.tmp_0 #
  - reshape2_11.tmp_0
  - teacher_reshape2_13.tmp_0 #
  - reshape2_13.tmp_0
  - teacher_reshape2_15.tmp_0 #
  - reshape2_15.tmp_0
  - teacher_reshape2_17.tmp_0 #
  - reshape2_17.tmp_0
  - teacher_reshape2_19.tmp_0 #
  - reshape2_19.tmp_0
  - teacher_reshape2_21.tmp_0 #
  - reshape2_21.tmp_0
  - teacher_depthwise_conv2d_14.tmp_0 # block2
  - depthwise_conv2d_14.tmp_0
  - teacher_depthwise_conv2d_15.tmp_0
  - depthwise_conv2d_15.tmp_0
  - teacher_reshape2_23.tmp_0 #block1
  - reshape2_23.tmp_0
  - teacher_relu_30.tmp_0 # final_conv
  - relu_30.tmp_0
  - teacher_bilinear_interp_v2_1.tmp_0
  - bilinear_interp_v2_1.tmp_0
  merge_feed: true
  teacher_model_dir: ./inference_model
  teacher_model_filename: inference.pdmodel
  teacher_params_filename: inference.pdiparams
Quantization:
  activation_bits: 8
  is_full_quantize: false
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8
TrainConfig:
  epochs: 1
  eval_iter: 400
  learning_rate: 0.0005
  optimizer: SGD
  optim_args:
    weight_decay: 4.0e-05
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
Distillation:
  distill_lambda: 1.0
  distill_loss: l2_loss
  distill_node_pair:
  - teacher_reshape2_1.tmp_0
  - reshape2_1.tmp_0
  - teacher_reshape2_3.tmp_0
  - reshape2_3.tmp_0
  - teacher_reshape2_5.tmp_0
  - reshape2_5.tmp_0
  - teacher_reshape2_7.tmp_0 #block1
  - reshape2_7.tmp_0
  - teacher_reshape2_9.tmp_0
  - reshape2_9.tmp_0
  - teacher_reshape2_11.tmp_0
  - reshape2_11.tmp_0
  - teacher_reshape2_13.tmp_0
  - reshape2_13.tmp_0
  - teacher_reshape2_15.tmp_0
  - reshape2_15.tmp_0
  - teacher_reshape2_17.tmp_0
  - reshape2_17.tmp_0
  - teacher_reshape2_19.tmp_0
  - reshape2_19.tmp_0
  - teacher_reshape2_21.tmp_0
  - reshape2_21.tmp_0
  - teacher_depthwise_conv2d_14.tmp_0 # block2
  - depthwise_conv2d_14.tmp_0
  - teacher_depthwise_conv2d_15.tmp_0
  - depthwise_conv2d_15.tmp_0
  - teacher_reshape2_23.tmp_0 #block1
  - reshape2_23.tmp_0
  - teacher_relu_30.tmp_0 # final_conv
  - relu_30.tmp_0
  - teacher_bilinear_interp_v2_1.tmp_0
  - bilinear_interp_v2_1.tmp_0
  merge_feed: true
  teacher_model_dir: ./inference_model
  teacher_model_filename: inference.pdmodel
  teacher_params_filename: inference.pdiparams
UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  pruned_ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True
TrainConfig:
  epochs: 14
  eval_iter: 400
  learning_rate: 5.0e-03
  optim_args:
    weight_decay: 0.0005
  optimizer: SGD
