Commit 08f0633

add hierarchical text classification (#2501)
* add_hierarchical_classification
* modify_wos_dataset
* optimize_codes
* add_paddle_serving
* add_paddle_serving
* add_paddle_serving
1 parent 3862947 commit 08f0633

26 files changed, +3629 -84 lines changed

applications/text_classification/hierarchical_classification/README.md

Lines changed: 424 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
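# Service Deployment with Paddle Serving

This document describes how to use [Paddle Serving](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy an ERNIE 2.0-based hierarchical classification model as an online pipeline service.

## Contents
- [Environment Setup](#environment-setup)
- [Model Conversion](#model-conversion)
- [Model Deployment](#model-deployment)
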
## Environment Setup
Both the [PaddleNLP runtime environment]() and the Paddle Serving runtime environment are required.

### Install Paddle Serving
Install with the commands below. For more wheel packages, see the [Serving documentation](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Latest_Packages_CN.md).
```
# Install the client and serving app, used to send requests to the service
pip install paddle_serving_app paddle_serving_client

# Install the server, used to start the service
# CPU server
pip install paddle_serving_server

# GPU server: choose the command that matches your local environment
# CUDA10.2 + Cudnn7 + TensorRT6
pip install paddle-serving-server-gpu==0.8.3.post102 -i https://pypi.tuna.tsinghua.edu.cn/simple
# CUDA10.1 + TensorRT6
pip install paddle-serving-server-gpu==0.8.3.post101 -i https://pypi.tuna.tsinghua.edu.cn/simple
# CUDA11.2 + TensorRT8
pip install paddle-serving-server-gpu==0.8.3.post112 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

The commands above use the Tsinghua PyPI mirror to speed up downloads in mainland China. If you are behind an HTTP proxy, you can drop the mirror option (`-i https://pypi.tuna.tsinghua.edu.cn/simple`).

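The GPU wheel names listed above differ only in their `postNNN` suffix, which encodes the CUDA version. As a rough sketch, that mapping can be kept in a lookup table. `gpu_server_requirement` is a hypothetical helper name, not part of Paddle Serving, and it covers only the three CUDA versions listed in this document.

```python
# Hypothetical helper (not part of Paddle Serving): maps a CUDA version to the
# pip requirement string listed in the install commands of this document.
GPU_SERVER_WHEELS = {
    "10.2": "paddle-serving-server-gpu==0.8.3.post102",
    "10.1": "paddle-serving-server-gpu==0.8.3.post101",
    "11.2": "paddle-serving-server-gpu==0.8.3.post112",
}


def gpu_server_requirement(cuda_version: str) -> str:
    """Return the pip requirement for a CUDA version, or raise if unlisted."""
    try:
        return GPU_SERVER_WHEELS[cuda_version]
    except KeyError:
        raise ValueError(f"no wheel listed for CUDA {cuda_version}") from None


print(gpu_server_requirement("11.2"))
```

For any other CUDA version, the helper raises rather than guessing a wheel name; check the Serving wheel list linked above in that case.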
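### Install the FasterTokenizer Text-Processing Acceleration Library (Optional)
If the deployment environment is Linux, installing faster_tokenizer is recommended: it speeds up text processing and further improves service performance. Windows is not supported yet; support is planned for the next release.
```
pip install faster_tokenizer
```

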
## Model Conversion

To deploy with Paddle Serving, the saved inference model must first be converted into a format that Serving can deploy.

Use the installed paddle_serving_client to convert the static-graph model and its parameters into the Serving format. For how to use the [static graph export script](export_model.py) to turn a trained model into a static-graph model, see [Static Graph Model Export](../../README.md).

```bash
# Set --dirname to the actual location of your model
python -m paddle_serving_client.convert --dirname ../../export --model_filename float32.pdmodel --params_filename float32.pdiparams

# Show the meaning of each argument
python -m paddle_serving_client.convert --help
```
After a successful conversion, the directory looks like this:
```
serving_server/
├── float32.pdiparams
├── float32.pdmodel
├── serving_server_conf.prototxt
└── serving_server_conf.stream.prototxt
```

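Before moving on to deployment, it can help to confirm that the conversion produced every file in the listing above. The following is a minimal sketch using only the standard library; `missing_serving_files` is a hypothetical helper name, and the expected file names are copied from that listing, so they would need adjusting if your model files are named differently.

```python
from pathlib import Path

# File names copied from the converted-directory listing in this document.
EXPECTED_FILES = (
    "float32.pdiparams",
    "float32.pdmodel",
    "serving_server_conf.prototxt",
    "serving_server_conf.stream.prototxt",
)


def missing_serving_files(model_dir: str) -> list:
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "serving_server" is the output directory name shown in the listing.
    missing = missing_serving_files("serving_server")
    print("missing files:", missing if missing else "none")
```

An empty result means all four files are present; it does not validate their contents.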
## Model Deployment

The serving directory contains the code and model for starting the pipeline service and sending prediction requests:

```
serving/
├──serving_server
│   ├── float32.pdiparams
│   ├── float32.pdmodel
│   ├── serving_server_conf.prototxt
│   └── serving_server_conf.stream.prototxt
├──config.yml        # Server configuration file for the hierarchical classification task
├──rpc_client.py     # Script that sends pipeline prediction requests for the hierarchical classification task
└──service.py        # Script that starts the server for the hierarchical classification task

```

### Modify the Configuration File
The `config.yml` file in this directory documents every parameter; adjust the settings as needed. For example:
```
# Change the model directory to the downloaded model directory or your own model directory:
model_config: serving_server => model_config: ernie-3.0-tiny/serving_server

# Change the RPC port to 9998
rpc_port: 18090 => rpc_port: 9998

# Switch from GPU inference to CPU inference:
device_type: 1 => device_type: 0
```

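The `old => new` edits above can also be scripted. The sketch below does a plain line-based substitution with the standard library rather than parsing YAML, so it keeps comments and indentation but knows nothing about nesting. `set_config_value` is a hypothetical helper name, and the sample text is a shortened stand-in for `config.yml`.

```python
import re


def set_config_value(text: str, key: str, value: str) -> str:
    """Replace the value on every uncommented `key: ...` line, keeping indentation."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}:[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids backslash handling in the new value.
    return pattern.sub(lambda match: match.group(1) + value, text)


# Shortened stand-in for config.yml, using values that appear in this document.
sample = "rpc_port: 18090\n    device_type: 1\n"
sample = set_config_value(sample, "rpc_port", "9998")
sample = set_config_value(sample, "device_type", "0")
print(sample)
```

Because the match is per line, a key that appears under several ops would be changed everywhere; for such configs a real YAML parser is the safer choice.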
### Classification Task
#### Start the Service
After modifying the configuration file, start the service with:
```
python service.py
```
The output looks like this:
```
[DAG] Succ init
[PipelineServicer] succ init
......
--- Running analysis [ir_graph_to_program_pass]
I0624 06:31:00.891119 13138 analysis_predictor.cc:1007] ======= optimize end =======
I0624 06:31:00.899907 13138 naive_executor.cc:102] --- skip [feed], feed -> token_type_ids
I0624 06:31:00.899941 13138 naive_executor.cc:102] --- skip [feed], feed -> input_ids
I0624 06:31:00.902855 13138 naive_executor.cc:102] --- skip [linear_147.tmp_1], fetch -> fetch
[2022-06-24 06:31:01,899] [ WARNING] - Can't find the faster_tokenizers package, please ensure install faster_tokenizers correctly. You can install faster_tokenizers by `pip install faster_tokenizers`(Currently only work for linux platform).
[2022-06-24 06:31:01,899] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'ernie-2.0-base-en'.
[2022-06-24 06:31:01,899] [ INFO] - Already cached /root/.paddlenlp/models/ernie-2.0-base-en/vocab.txt
[OP Object] init success
```

#### Test with the Client
Disable any proxy before sending client requests, and set server_url to the address of the machine running the service.
```
python rpc_client.py
```
The output looks like this:
```
text: b'a high degree of uncertainty associated with the emission inventory for china tends to degrade the performance of chemical transport models in predicting pm2.5 concentrations especially on a daily basis. in this study a novel machine learning algorithm, geographically -weighted gradient boosting machine (gw-gbm), was developed by improving gbm through building spatial smoothing kernels to weigh the loss function. this modification addressed the spatial nonstationarity of the relationships between pm2.5 concentrations and predictor variables such as aerosol optical depth (aod) and meteorological conditions. gw-gbm also overcame the estimation bias of pm2.5 concentrations due to missing aod retrievals, and thus potentially improved subsequent exposure analyses. gw-gbm showed good performance in predicting daily pm2.5 concentrations (r-2 = 0.76, rmse = 23.0 g/m(3)) even with partially missing aod data, which was better than the original gbm model (r-2 = 0.71, rmse = 25.3 g/m(3)). on the basis of the continuous spatiotemporal prediction of pm2.5 concentrations, it was predicted that 95% of the population lived in areas where the estimated annual mean pm2.5 concentration was higher than 35 g/m(3), and 45% of the population was exposed to pm2.5 >75 g/m(3) for over 100 days in 2014. gw-gbm accurately predicted continuous daily pm2.5 concentrations in china for assessing acute human health effects. (c) 2017 elsevier ltd. all rights reserved.'
label: 0,8
--------------------
...
```
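
The `label: 0,8` line is a comma-separated list of class indices. According to the `postprocess` method of `service.py` in this commit, each output logit is passed through a sigmoid and every index whose probability exceeds 0.5 is kept. The following pure-Python sketch restates that rule; `decode_labels` is a hypothetical name and the logits are made up for illustration, not real model output.

```python
import math


def decode_labels(logits, threshold=0.5):
    """Join the indices whose sigmoid probability exceeds the threshold."""
    picked = []
    for index, logit in enumerate(logits):
        probability = 1.0 / (1.0 + math.exp(-logit))
        if probability > threshold:
            picked.append(str(index))
    return ",".join(picked)


# Made-up logits: only positions 0 and 8 are positive, so only they pass 0.5.
example_logits = [2.1, -3.0, -1.2, -4.5, -2.2, -0.7, -3.3, -1.9, 0.8]
print(decode_labels(example_logits))
```

Since each class is thresholded independently, a text can receive several labels at once, or none, in which case the label string is empty.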
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# RPC port. rpc_port and http_port must not both be empty. When rpc_port is empty and http_port is set, rpc_port is automatically set to http_port+1
rpc_port: 18090

# HTTP port. rpc_port and http_port must not both be empty. When rpc_port is set and http_port is empty, http_port is not generated automatically
http_port: 9999

# worker_num: maximum concurrency.
# When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
# When build_dag_each_worker=False, the framework sets max_workers=worker_num on the main thread's gRPC thread pool
worker_num: 1

# build_dag_each_worker: False, the framework creates a single DAG inside the process; True, the framework creates multiple independent DAGs inside each process
build_dag_each_worker: false

dag:
    # Op resource type: True for the thread model, False for the process model
    is_thread_op: False

    # Number of retries
    retry: 1

    # Profiling: True generates Timeline performance data at some performance cost; False disables it
    use_profile: false
    tracer:
        interval_s: 10

op:
    seq_cls:
        # Concurrency: thread concurrency when is_thread_op=True, process concurrency otherwise
        concurrency: 1

        # When the op has no server_endpoints configured, the local service configuration is read from local_service_conf
        local_service_conf:
            # Client type: brpc, grpc, or local_predictor. local_predictor does not start a Serving service and predicts in-process
            client_type: local_predictor

            # Model path
            model_config: serving_server

            # Fetch list, using the alias_name of fetch_var in client_config
            fetch_list: ["linear_147.tmp_1"]

            # device_type, 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
            device_type: 1

            # Compute device IDs: CPU prediction when devices is "" or omitted; GPU prediction when devices is "0" or "0,1,2", naming the GPU cards to use
            devices: "3"

            # use_mkldnn
            #use_mkldnn: True

            # thread_num
            thread_num: 1

            # ir_optim
            ir_optim: True

            # Minimum number of nodes in a subgraph for it to be optimized once TensorRT is enabled
            #min_subgraph_size: 10
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_server.pipeline import PipelineClient
from numpy import array, float32

import numpy as np


class Runner(object):

    def __init__(
        self,
        server_url: str,
    ):
        # Connect the pipeline client to the running service.
        self.client = PipelineClient()
        self.client.connect([server_url])

    def Run(self, data):
        # Send the texts as an object array of UTF-8 encoded bytes.
        data = np.array([x.encode('utf-8') for x in data], dtype=np.object_)
        ret = self.client.predict(feed_dict={"sentence": data})
        # The server returns the labels as a string; eval() turns it back
        # into one label entry per input text.
        for d, l in zip(data, eval(ret.value[0])):
            print("text: ", d)
            print("label: ", l)
            print("--------------------")
        return


if __name__ == "__main__":
    server_url = "127.0.0.1:18090"
    runner = Runner(server_url)
    texts = [
"a high degree of uncertainty associated with the emission inventory for china tends to degrade the performance of chemical transport models in predicting pm2.5 concentrations especially on a daily basis. in this study a novel machine learning algorithm, geographically -weighted gradient boosting machine (gw-gbm), was developed by improving gbm through building spatial smoothing kernels to weigh the loss function. this modification addressed the spatial nonstationarity of the relationships between pm2.5 concentrations and predictor variables such as aerosol optical depth (aod) and meteorological conditions. gw-gbm also overcame the estimation bias of pm2.5 concentrations due to missing aod retrievals, and thus potentially improved subsequent exposure analyses. gw-gbm showed good performance in predicting daily pm2.5 concentrations (r-2 = 0.76, rmse = 23.0 g/m(3)) even with partially missing aod data, which was better than the original gbm model (r-2 = 0.71, rmse = 25.3 g/m(3)). on the basis of the continuous spatiotemporal prediction of pm2.5 concentrations, it was predicted that 95% of the population lived in areas where the estimated annual mean pm2.5 concentration was higher than 35 g/m(3), and 45% of the population was exposed to pm2.5 >75 g/m(3) for over 100 days in 2014. gw-gbm accurately predicted continuous daily pm2.5 concentrations in china for assessing acute human health effects. (c) 2017 elsevier ltd. all rights reserved.",
"previous research exploring cognitive biases in bulimia nervosa suggests that attentional biases occur for both food-related and body-related cues. individuals with bulimia were compared to non-bulimic controls on an emotional-stroop task which contained both food-related and body-related cues. results indicated that bulimics (but not controls) demonstrated a cognitive bias for both food-related and body related cues. however, a discrepancy between the two cue-types was observed with body-related cognitive biases showing the most robust effects and food-related cognitive biases being the most strongly associated with the severity of the disorder. the results may have implications for clinical practice as bulimics with an increased cognitive bias for food-related cues indicated increased bulimic disorder severity. (c) 2016 elsevier ltd. all rights reserved.",
"posterior reversible encephalopathy syndrome (pres) is a reversible clinical and neuroradiological syndrome which may appear at any age and characterized by headache, altered consciousness, seizures, and cortical blindness. the exact incidence is still unknown. the most commonly identified causes include hypertensive encephalopathy, eclampsia, and some cytotoxic drugs. vasogenic edema related subcortical white matter lesions, hyperintense on t2a and flair sequences, in a relatively symmetrical pattern especially in the occipital and parietal lobes can be detected on cranial mr imaging. these findings tend to resolve partially or completely with early diagnosis and appropriate treatment. here in, we present a rare case of unilateral pres developed following the treatment with pazopanib, a testicular tumor vascular endothelial growth factor (vegf) inhibitory agent."
    ]
    runner.Run(texts)
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from paddle_serving_server.web_service import WebService, Op

# `array` must stay in scope: preprocess() calls eval() on a string of the
# form "array([...])".
from numpy import array

import logging
import numpy as np

_LOGGER = logging.getLogger()


# Subclasses the imported Op and reuses (shadows) its name.
class Op(Op):

    def init_op(self):
        from paddlenlp.transformers import AutoTokenizer
        self.tokenizer = AutoTokenizer.from_pretrained("ernie-2.0-base-en",
                                                       use_faster=True)
        # Output nodes may differ from model to model
        # You can see the output node name in the conf.prototxt file of serving_server
        self.fetch_names = [
            "linear_147.tmp_1",
        ]

    def preprocess(self, input_dicts, data_id, log_id):
        # convert input format
        (_, input_dict), = input_dicts.items()
        data = input_dict["sentence"]
        if isinstance(data, str) and "array(" in data:
            # eval() executes the request string; only use with trusted clients.
            data = eval(data)
        else:
            _LOGGER.error("input value {} is not supported.".format(data))
        data = [i.decode('utf-8') for i in data]

        # tokenizer + pad
        data = self.tokenizer(data,
                              max_length=512,
                              padding=True,
                              truncation=True)
        input_ids = data["input_ids"]
        token_type_ids = data["token_type_ids"]
        # print("input_ids:", input_ids)
        # print("token_type_ids", token_type_ids)
        return {
            "input_ids": np.array(input_ids, dtype="int64"),
            "token_type_ids": np.array(token_type_ids, dtype="int64")
        }, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):

        results = fetch_dict[self.fetch_names[0]]
        results = np.array(results)
        labels = []

        for result in results:
            label = []
            # Sigmoid turns each logit into an independent per-class probability.
            result = 1 / (1 + (np.exp(-result)))
            # Keep every class index whose probability exceeds 0.5.
            for i, p in enumerate(result):
                if p > 0.5:
                    label.append(str(i))
            labels.append(','.join(label))
        return {"label": labels}, None, ""


class Service(WebService):

    def get_pipeline_response(self, read_op):
        return Op(name="seq_cls", input_ops=[read_op])


if __name__ == "__main__":
    service = Service(name="seq_cls")
    service.prepare_pipeline_config("config.yml")
    service.run_service()
