Commit 1f4a507

Update README.md (#1858)
* Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md
1 parent d0d2067 commit 1f4a507

1 file changed, 41 additions and 0 deletions

docs/model_zoo/taskflow.md

Lines changed: 41 additions & 0 deletions
@@ -678,6 +678,31 @@ my_ner = Taskflow("ner", mode="accurate", task_path="./custom_task_path/")
```
</div></details>

## Models and Algorithms

<details><summary>Model and algorithm details</summary><div>

<table>
<tr><td>Task<td>Model<td>Model details<td>Training data
<tr><td rowspan="3">Chinese word segmentation<td>Default mode: BiGRU+CRF<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/lexical_analysis">Training details</a> <td> Baidu in-house dataset of nearly 22 million sentences, covering a variety of scenarios
<tr><td>Fast mode: Jieba<td> - <td> -
<tr><td>Accurate mode: WordTag<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_to_knowledge/ernie-ctm">Training details</a> <td> Baidu in-house dataset; the word-class taxonomy is built on TermTree
<tr><td>Part-of-speech tagging<td>BiGRU+CRF<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/lexical_analysis">Training details</a> <td> Baidu in-house dataset of 22 million sentences, covering a variety of scenarios
<tr><td rowspan="2">Named entity recognition<td>Accurate mode: WordTag<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_to_knowledge/ernie-ctm">Training details</a> <td> Baidu in-house dataset; the word-class taxonomy is built on TermTree
<tr><td>Fast mode: BiGRU+CRF <td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/lexical_analysis">Training details</a> <td> Baidu in-house dataset of 22 million sentences, covering a variety of scenarios
<tr><td>Dependency parsing<td>DDParser<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/dependency_parsing/ddparser">Training details</a> <td> Baidu in-house dataset, the DuCTB 1.0 Chinese dependency treebank
<tr><td rowspan="2">Jieyu (解语) knowledge tagging<td>Word-class knowledge tagging: WordTag<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_to_knowledge/ernie-ctm">Training details</a> <td> Baidu in-house dataset; the word-class taxonomy is built on TermTree
<tr><td>Noun phrase tagging: NPTag <td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_to_knowledge/nptag">Training details</a> <td> Baidu in-house dataset
<tr><td>Text correction<td>ERNIE-CSC<td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_correction/ernie-csc">Training details</a> <td> SIGHAN Simplified Chinese dataset and the Chinese text-correction dataset generated by <a href="https://github.com/wdimmy/Automatic-Corpus-Generation/blob/master/corpus/train.sgml">Automatic Corpus Generation</a>
<tr><td>Text similarity<td>SimBERT<td> - <td> 22 million pairs of similar sentences collected from Baidu Zhidao (百度知道)
<tr><td rowspan="2">Sentiment analysis<td> BiLSTM <td> - <td> Baidu in-house dataset
<tr><td> SKEP <td> <a href="https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/sentiment_analysis/skep">Training details</a> <td> Baidu in-house dataset
<tr><td>Generative question answering<td>CPM<td> - <td> Chinese data on the order of 100 GB
<tr><td>Poetry generation<td>CPM<td> - <td> Chinese data on the order of 100 GB
<tr><td>Open-domain dialogue<td>PLATO-Mini<td> - <td> Chinese dialogue data on the order of one billion samples
</table>

</div></details>
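
The mode-to-model mapping in the table can also be written down as a small lookup, which may help when deciding which mode to request. The following is only a sketch and not part of PaddleNLP: `MODEL_BY_TASK_AND_MODE` and `model_for` are hypothetical names, `None` is used here to stand for "default mode", and the mode string `"fast"` is an assumption (this document only shows `mode="accurate"`).

```python
from typing import Dict, Optional, Tuple

# (task, mode) -> model name, transcribed from the table above.
# Assumptions: None stands for the default mode; "fast" is an assumed
# mode string. Only the two task names shown in this document are listed.
MODEL_BY_TASK_AND_MODE: Dict[Tuple[str, Optional[str]], str] = {
    ("word_segmentation", None): "BiGRU+CRF",
    ("word_segmentation", "fast"): "Jieba",
    ("word_segmentation", "accurate"): "WordTag",
    ("ner", "accurate"): "WordTag",
    ("ner", "fast"): "BiGRU+CRF",
}


def model_for(task: str, mode: Optional[str] = None) -> str:
    """Return the model the table lists for a task/mode pair (hypothetical helper)."""
    try:
        return MODEL_BY_TASK_AND_MODE[(task, mode)]
    except KeyError:
        raise ValueError(f"no table entry for task={task!r}, mode={mode!r}") from None


print(model_for("word_segmentation", "accurate"))  # WordTag
```

Pairs that the table does not list, such as a default mode for `ner`, raise `ValueError` rather than guessing.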

## FAQ

@@ -699,6 +724,22 @@ ner = Taskflow("ner", home_path="/workspace")

</div></details>

<details><summary><b>Q:</b> How can I speed up Taskflow prediction?</summary><div>

**A:** Tune `batch_size` to suit your hardware and pass inputs in batches to raise average throughput. Example:
```python
from paddlenlp import Taskflow

# The accurate-mode model is large; adjust batch_size to fit your machine
# and feed it samples in batches.
seg_accurate = Taskflow("word_segmentation", mode="accurate", batch_size=32)

# Batched input: a list of sentences, which is predicted faster
texts = ["热梅茶是一道以梅子为主要原料制作的茶饮", "《孤女》是2010年九州出版社出版的小说,作者是余兼羽"]
seg_accurate(texts)
```
Segmenting text this way can substantially speed up prediction.

</div></details>

<details><summary><b>Q:</b> Will support for more tasks be added in the future?</summary><div>
