1 | | -[](https://pypi.org/project/ltp/) |
2 | | - |
3 | | - |
4 | | - |
5 | | - |
6 | | -[](https://ltp.readthedocs.io/zh_CN/latest/?badge=latest) |
7 | | -[](https://pypi.python.org/pypi/ltp) |
8 | | - |
9 | | -# LTP 4 |
10 | | - |
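11 | | -LTP (Language Technology Platform) provides a suite of Chinese natural language processing tools that can be used for word segmentation, part-of-speech tagging, syntactic parsing, and other tasks on Chinese text. |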
12 | | - |
13 | | -If you use any source code included in this toolkit in your work, please kindly cite the following paper. The BibTeX |
14 | | -entry is listed below: |
15 | | -<pre> |
16 | | -@article{che2020n, |
17 | | - title={N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models}, |
18 | | - author={Che, Wanxiang and Feng, Yunlong and Qin, Libo and Liu, Ting}, |
19 | | - journal={arXiv preprint arXiv:2009.11616}, |
20 | | - year={2020} |
21 | | -} |
22 | | -</pre> |
23 | | - |
24 | | -## Quick Start |
25 | | - |
26 | | -```python |
27 | | -from ltp import LTP |
28 | | - |
29 | | -ltp = LTP() # loads the Small model by default |
30 | | -seg, hidden = ltp.seg(["他叫汤姆去拿外衣。"]) |
31 | | -pos = ltp.pos(hidden) |
32 | | -ner = ltp.ner(hidden) |
33 | | -srl = ltp.srl(hidden) |
34 | | -dep = ltp.dep(hidden) |
35 | | -sdp = ltp.sdp(hidden) |
36 | | -``` |
37 | | - |
38 | | -**[Detailed documentation](docs/quickstart.rst)** |
39 | | - |
40 | | -## Language Bindings |
41 | | - |
42 | | -+ C++ |
43 | | -+ Rust |
44 | | -+ Java |
45 | | -+ Python Rebinding |
46 | | - |
47 | | -[libltp](https://github.com/HIT-SCIR/libltp) |
48 | | - |
49 | | -## Metrics |
50 | | - |
51 | | -| Model | Segmentation | POS | NER | SRL | Dependency | Semantic Dependency | Speed (sentences/s) | |
52 | | -| :-------------: | :---: | :---: | :------: | :------: | :------: | :------: | :--------: | |
53 | | -| LTP 4.0 (Base) | 98.7 | 98.5 | 95.4 | 80.6 | 89.5 | 75.2 | | |
54 | | -| LTP 4.0 (Small) | 98.4 | 98.2 | 94.3 | 78.4 | 88.3 | 74.7 | 12.58 | |
55 | | -| LTP 4.0 (Tiny) | 96.8 | 97.1 | 91.6 | 70.9 | 83.8 | 70.1 | 29.53 | |
56 | | - |
57 | | -**[Model downloads](MODELS.md)** |
58 | | - |
59 | | -## Model Architectures |
60 | | - |
61 | | -+ Word segmentation: Electra Small<sup>[1](#RELTRANS)</sup> + Linear |
62 | | -+ POS tagging: Electra Small + Linear |
63 | | -+ Named entity recognition: Electra Small + Relative Transformer<sup>[2](#RELTRANS)</sup> + Linear |
64 | | -+ Dependency parsing: Electra Small + BiAffine + Eisner<sup>[3](#Eisner)</sup> |
65 | | -+ Semantic dependency parsing: Electra Small + BiAffine |
66 | | -+ Semantic role labeling: Electra Small + BiAffine + CRF |
67 | | - |
68 | | -## Building the Wheel Package |
69 | | - |
70 | | -```shell script |
71 | | -python setup.py sdist bdist_wheel |
72 | | -python -m twine upload dist/* |
73 | | -``` |
74 | | - |
75 | | -## Author |
76 | | - |
77 | | -+ 冯云龙 <<[ylfeng@ir.hit.edu.cn](mailto:ylfeng@ir.hit.edu.cn)>> |
78 | | - |
79 | | -## License |
80 | | - |
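81 | | -1. The Language Technology Platform is open source and free of charge for universities in China and abroad, institutes of the Chinese Academy of Sciences, and individual researchers. If any of these institutions or individuals use the platform for commercial purposes (such as industry collaboration projects), a fee is required. |
82 | | -2. Enterprises and public institutions other than those listed above must pay a fee to use the platform. |
83 | | -3. For all matters involving fees, please email car@ir.hit.edu.cn. |
84 | | -4. If you publish papers or obtain research results based on LTP, please state in the paper or results submission that you "used the Language Technology Platform (LTP) developed by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology". |
85 | | -   Please also email car@ir.hit.edu.cn with the title and venue of the paper or results. |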
86 | | - |
87 | | -## Footnotes |
88 | | - |
89 | | -+ <a name="RELTRANS">1</a>:: [Chinese-ELECTRA](https://github.com/ymcui/Chinese-ELECTRA) |
90 | | -+ <a name="RELTRANS"> |
91 | | - 2</a>:: [TENER: Adapting Transformer Encoder for Named Entity Recognition](https://arxiv.org/abs/1911.04474) |
92 | | -+ <a name="Eisner"> |
93 | | - 3</a>:: [A PyTorch implementation of "Deep Biaffine Attention for Neural Dependency Parsing"](https://github.com/yzhangcs/parser) |
| 1 | +[](https://pypi.org/project/ltp/) |
| 2 | + |
| 3 | + |
| 4 | + |
| 5 | + |
| 6 | +[](https://ltp.readthedocs.io/zh_CN/latest/?badge=latest) |
| 7 | +[](https://pypi.python.org/pypi/ltp) |
| 8 | + |
| 9 | +# LTP 4 |
| 10 | + |
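| 11 | +LTP (Language Technology Platform) provides a suite of Chinese natural language processing tools that can be used for word segmentation, part-of-speech tagging, syntactic parsing, and other tasks on Chinese text. |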
| 12 | + |
| 13 | +If you use any source code included in this toolkit in your work, please kindly cite the following paper. The BibTeX |
| 14 | +entry is listed below: |
| 15 | +<pre> |
| 16 | +@article{che2020n, |
| 17 | + title={N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models}, |
| 18 | + author={Che, Wanxiang and Feng, Yunlong and Qin, Libo and Liu, Ting}, |
| 19 | + journal={arXiv preprint arXiv:2009.11616}, |
| 20 | + year={2020} |
| 21 | +} |
| 22 | +</pre> |
| 23 | + |
| 24 | +## Quick Start |
| 25 | + |
| 26 | +```python |
| 27 | +from ltp import LTP |
| 28 | + |
| 29 | +ltp = LTP() # loads the Small model by default |
| 30 | +seg, hidden = ltp.seg(["他叫汤姆去拿外衣。"]) |
| 31 | +pos = ltp.pos(hidden) |
| 32 | +ner = ltp.ner(hidden) |
| 33 | +srl = ltp.srl(hidden) |
| 34 | +dep = ltp.dep(hidden) |
| 35 | +sdp = ltp.sdp(hidden) |
| 36 | +``` |
| 37 | + |
| 38 | +**[Detailed documentation](docs/quickstart.rst)** |
| 39 | + |
| 40 | +## Language Bindings |
| 41 | + |
| 42 | ++ C++ |
| 43 | ++ Rust |
| 44 | ++ Java |
| 45 | ++ Python Rebinding |
| 46 | + |
| 47 | +[libltp](https://github.com/HIT-SCIR/libltp) |
| 48 | + |
| 49 | +## Metrics |
| 50 | + |
| 51 | +| Model | Segmentation | POS | NER | SRL | Dependency | Semantic Dependency | Speed (sentences/s) | |
| 52 | +| :--------------: | :---: | :---: | :------: | :------: | :------: | :------: | :--------: | |
| 53 | +| LTP 4.0 (Base) | 98.7 | 98.5 | 95.4 | 80.6 | 89.5 | 75.2 | 39.12 | |
| 54 | +| LTP 4.0 (Base1) | 99.22 | 98.73 | 96.39 | 79.28 | 89.57 | 76.57 | --.-- | |
| 55 | +| LTP 4.0 (Base2) | 99.18 | 98.69 | 95.97 | 79.49 | 90.19 | 76.62 | --.-- | |
| 56 | +| LTP 4.0 (Small) | 98.4 | 98.2 | 94.3 | 78.4 | 88.3 | 74.7 | 43.13 | |
| 57 | +| LTP 4.0 (Tiny) | 96.8 | 97.1 | 91.6 | 70.9 | 83.8 | 70.1 | 53.22 | |
| 58 | + |
| 59 | +**[Model downloads](MODELS.md)** |
| 60 | + |
| 61 | +## Model Architectures |
| 62 | + |
| 63 | ++ Word segmentation: Electra Small<sup>[1](#RELTRANS)</sup> + Linear |
| 64 | ++ POS tagging: Electra Small + Linear |
| 65 | ++ Named entity recognition: Electra Small + Relative Transformer<sup>[2](#RELTRANS)</sup> + Linear |
| 66 | ++ Dependency parsing: Electra Small + BiAffine + Eisner<sup>[3](#Eisner)</sup> |
| 67 | ++ Semantic dependency parsing: Electra Small + BiAffine |
| 68 | ++ Semantic role labeling: Electra Small + BiAffine + CRF |
| 69 | + |
| 70 | +## Building the Wheel Package |
| 71 | + |
| 72 | +```shell script |
| 73 | +python setup.py sdist bdist_wheel |
| 74 | +python -m twine upload dist/* |
| 75 | +``` |
| 76 | + |
| 77 | +## Author |
| 78 | + |
| 79 | ++ 冯云龙 <<[ylfeng@ir.hit.edu.cn](mailto:ylfeng@ir.hit.edu.cn)>> |
| 80 | + |
| 81 | +## License |
| 82 | + |
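| 83 | +1. The Language Technology Platform is open source and free of charge for universities in China and abroad, institutes of the Chinese Academy of Sciences, and individual researchers. If any of these institutions or individuals use the platform for commercial purposes (such as industry collaboration projects), a fee is required. |
| 84 | +2. Enterprises and public institutions other than those listed above must pay a fee to use the platform. |
| 85 | +3. For all matters involving fees, please email car@ir.hit.edu.cn. |
| 86 | +4. If you publish papers or obtain research results based on LTP, please state in the paper or results submission that you "used the Language Technology Platform (LTP) developed by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology". |
| 87 | +   Please also email car@ir.hit.edu.cn with the title and venue of the paper or results. |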
| 88 | + |
| 89 | +## Footnotes |
| 90 | + |
| 91 | ++ <a name="RELTRANS">1</a>:: [Chinese-ELECTRA](https://github.com/ymcui/Chinese-ELECTRA) |
| 92 | ++ <a name="RELTRANS"> |
| 93 | + 2</a>:: [TENER: Adapting Transformer Encoder for Named Entity Recognition](https://arxiv.org/abs/1911.04474) |
| 94 | ++ <a name="Eisner"> |
| 95 | + 3</a>:: [A PyTorch implementation of "Deep Biaffine Attention for Neural Dependency Parsing"](https://github.com/yzhangcs/parser) |
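
The substantive change in this diff is the metrics table: two rows (Base1 and Base2) are added, the speed column is filled in for Base, and the speed figures for Small and Tiny are replaced. As a rough illustration, the sketch below compares the old and new reported speeds for the models that have a figure on both sides. The dictionary and function names are illustrative and not part of LTP, and the diff does not say whether the old and new figures were measured on the same hardware or batch size, so the ratios only compare the reported numbers.

```python
# Speed figures (sentences per second) copied from the metrics tables in this diff.
# The old table has no speed figure for Base, so Base is absent from old_speed.
old_speed = {"Small": 12.58, "Tiny": 29.53}
new_speed = {"Base": 39.12, "Small": 43.13, "Tiny": 53.22}


def speedup(old, new):
    """Return the new/old ratio for every model that has a figure on both sides."""
    return {name: new[name] / old[name] for name in old if name in new}


ratios = speedup(old_speed, new_speed)
for name, ratio in ratios.items():
    print(f"{name}: {old_speed[name]} -> {new_speed[name]} sentences/s ({ratio:.2f}x)")
```

Base is skipped because the old table leaves its speed cell empty, and Base1 and Base2 are skipped because the new table reports `--.--` for them.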