
Commit 021faa6

Doc update (#1159)
* modify transformers.rst
* modify roformer tokenizer
* delete modifications
* add macbert
* first update
* add convbert and mpnet model
* update
* update
* add model number
1 parent 873a93a commit 021faa6

31 files changed, +238 -9 lines changed

docs/model_zoo/transformers.rst

Lines changed: 39 additions & 3 deletions
@@ -9,8 +9,8 @@ PaddleNLP为用户提供了常用的 ``BERT``、``ERNIE``、``ALBERT``、``RoBER
 Summary of Transformer pretrained models
 ------------------------------------
 
-The table below summarizes the pretrained models currently supported by PaddleNLP and their corresponding pretrained weights. We currently provide **83** sets of pretrained weights for users,
-of which **42** are pretrained weights for Chinese language models.
+The table below summarizes the pretrained models currently supported by PaddleNLP and their corresponding pretrained weights. We currently provide **21** network architectures and **91** sets of pretrained weights for users,
+of which **45** are pretrained weights for Chinese language models.
 
 +--------------------+-----------------------------------------+--------------+-----------------------------------------+
 | Model | Pretrained Weight | Language | Details of the model |
@@ -124,6 +124,16 @@ Transformer预训练模型汇总
 | | | | and Traditional text using |
 | | | | Whole-Word-Masking with extended data. |
 | +-----------------------------------------+--------------+-----------------------------------------+
+| |``macbert-base-chinese`` | Chinese | 12-layer, 768-hidden, |
+| | | | 12-heads, 102M parameters. |
+| | | | Trained with novel MLM as correction |
+| | | | pre-training task. |
+| +-----------------------------------------+--------------+-----------------------------------------+
+| |``macbert-large-chinese`` | Chinese | 24-layer, 1024-hidden, |
+| | | | 16-heads, 326M parameters. |
+| | | | Trained with novel MLM as correction |
+| | | | pre-training task. |
+| +-----------------------------------------+--------------+-----------------------------------------+
 | |``simbert-base-chinese`` | Chinese | 12-layer, 768-hidden, |
 | | | | 12-heads, 108M parameters. |
 | | | | Trained on 22 million pairs of similar |
@@ -133,6 +143,18 @@ Transformer预训练模型汇总
 | | | | 12-heads, _M parameters. |
 | | | | Trained on lower-cased English text. |
 +--------------------+-----------------------------------------+--------------+-----------------------------------------+
+|ConvBert_ |``convbert-base`` | English | 12-layer, 768-hidden, |
+| | | | 12-heads, 106M parameters. |
+| | | | The ConvBERT base model. |
+| +-----------------------------------------+--------------+-----------------------------------------+
+| |``convbert-medium-small`` | English | 12-layer, 384-hidden, |
+| | | | 8-heads, 17M parameters. |
+| | | | The ConvBERT medium small model. |
+| +-----------------------------------------+--------------+-----------------------------------------+
+| |``convbert-small`` | English | 12-layer, 128-hidden, |
+| | | | 4-heads, 13M parameters. |
+| | | | The ConvBERT small model. |
++--------------------+-----------------------------------------+--------------+-----------------------------------------+
 |DistilBert_ |``distilbert-base-uncased`` | English | 6-layer, 768-hidden, |
 | | | | 12-heads, 66M parameters. |
 | | | | The DistilBERT model distilled from |
@@ -221,6 +243,10 @@ Transformer预训练模型汇总
 | | | | 16-heads, 345M parameters. |
 | | | | Trained on English text. |
 +--------------------+-----------------------------------------+--------------+-----------------------------------------+
+|MPNet_ |``mpnet-base`` | English | 12-layer, 768-hidden, |
+| | | | 12-heads, 109M parameters. |
+| | | | MPNet Base Model. |
++--------------------+-----------------------------------------+--------------+-----------------------------------------+
 |NeZha_ |``nezha-base-chinese`` | Chinese | 12-layer, 768-hidden, |
 | | | | 12-heads, 108M parameters. |
 | | | | Trained on Chinese text. |
@@ -396,6 +422,8 @@ Transformer预训练模型适用任务汇总
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
 |BigBird_ |||||
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
+|ConvBert_ |||||
++--------------------+-------------------------+----------------------+--------------------+-----------------+
 |DistilBert_ |||||
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
 |ELECTRA_ |||||
@@ -410,6 +438,8 @@ Transformer预训练模型适用任务汇总
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
 |GPT_ |||||
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
+|MPNet_ |||||
++--------------------+-------------------------+----------------------+--------------------+-----------------+
 |NeZha_ |||||
 +--------------------+-------------------------+----------------------+--------------------+-----------------+
 |RoBERTa_ |||||
@@ -429,13 +459,15 @@ Transformer预训练模型适用任务汇总
 .. _BART: https://arxiv.org/abs/1910.13461
 .. _BERT: https://arxiv.org/abs/1810.04805
 .. _BigBird: https://arxiv.org/abs/2007.14062
+.. _ConvBert: https://arxiv.org/abs/2008.02496
 .. _DistilBert: https://arxiv.org/abs/1910.01108
 .. _ELECTRA: https://arxiv.org/abs/2003.10555
 .. _ERNIE: https://arxiv.org/abs/1904.09223
 .. _ERNIE-DOC: https://arxiv.org/abs/2012.15688
 .. _ERNIE-GEN: https://arxiv.org/abs/2001.11314
 .. _ERNIE-GRAM: https://arxiv.org/abs/2010.12148
 .. _GPT: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
+.. _MPNet: https://arxiv.org/abs/2004.09297
 .. _NeZha: https://arxiv.org/abs/1909.00204
 .. _RoBERTa: https://arxiv.org/abs/1907.11692
 .. _RoFormer: https://arxiv.org/abs/2104.09864
@@ -512,19 +544,23 @@ Reference
 `huawei-noah/Pretrained-Language-Model/NEZHA-PyTorch/ <https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/NEZHA-PyTorch>`_
 `ZhuiyiTechnology/simbert <https://github.com/ZhuiyiTechnology/simbert>`_
 - Lan, Zhenzhong, et al. "Albert: A lite bert for self-supervised learning of language representations." arXiv preprint arXiv:1909.11942 (2019).
+- Lewis, Mike, et al. "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension." arXiv preprint arXiv:1910.13461 (2019).
 - Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
 - Zaheer, Manzil, et al. "Big bird: Transformers for longer sequences." arXiv preprint arXiv:2007.14062 (2020).
+- Jiang, Zihang, et al. "ConvBERT: Improving BERT with Span-based Dynamic Convolution." arXiv preprint arXiv:2008.02496 (2020).
 - Sanh, Victor, et al. "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter." arXiv preprint arXiv:1910.01108 (2019).
 - Clark, Kevin, et al. "Electra: Pre-training text encoders as discriminators rather than generators." arXiv preprint arXiv:2003.10555 (2020).
 - Sun, Yu, et al. "Ernie: Enhanced representation through knowledge integration." arXiv preprint arXiv:1904.09223 (2019).
 - Xiao, Dongling, et al. "Ernie-gen: An enhanced multi-flow pre-training and fine-tuning framework for natural language generation." arXiv preprint arXiv:2001.11314 (2020).
 - Xiao, Dongling, et al. "ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding." arXiv preprint arXiv:2010.12148 (2020).
 - Radford, Alec, et al. "Language models are unsupervised multitask learners." OpenAI blog 1.8 (2019): 9.
+- Song, Kaitao, et al. "MPNet: Masked and Permuted Pre-training for Language Understanding." arXiv preprint arXiv:2004.09297 (2020).
 - Wei, Junqiu, et al. "NEZHA: Neural contextualized representation for chinese language understanding." arXiv preprint arXiv:1909.00204 (2019).
 - Liu, Yinhan, et al. "Roberta: A robustly optimized bert pretraining approach." arXiv preprint arXiv:1907.11692 (2019).
+- Su, Jianlin, et al. "RoFormer: Enhanced Transformer with Rotary Position Embedding." arXiv preprint arXiv:2104.09864 (2021).
 - Tian, Hao, et al. "SKEP: Sentiment knowledge enhanced pre-training for sentiment analysis." arXiv preprint arXiv:2005.05635 (2020).
 - Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017).
 - Jiao, Xiaoqi, et al. "Tinybert: Distilling bert for natural language understanding." arXiv preprint arXiv:1909.10351 (2019).
 - Bao, Siqi, et al. "Plato-2: Towards building an open-domain chatbot via curriculum learning." arXiv preprint arXiv:2006.16779 (2020).
 - Yang, Zhilin, et al. "Xlnet: Generalized autoregressive pretraining for language understanding." arXiv preprint arXiv:1906.08237 (2019).
-- Cui, Yiming, et al. "Pre-training with whole word masking for chinese bert." arXiv preprint arXiv:1906.08101 (2019).
+- Cui, Yiming, et al. "Pre-training with whole word masking for chinese bert." arXiv preprint arXiv:1906.08101 (2019).
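
The new ``macbert-base-chinese`` and ``macbert-large-chinese`` entries sit in the table's BERT group, so they should load through the regular BERT classes. A minimal sketch, not part of this commit; the class names and return values below are assumptions based on PaddleNLP's usual ``from_pretrained`` API:

    # Hedged sketch: load one of the newly documented weights with PaddleNLP.
    # Assumes BertModel/BertTokenizer accept "macbert-base-chinese", as implied
    # by the table placing MacBERT inside the BERT group.
    import paddle
    from paddlenlp.transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("macbert-base-chinese")
    model = BertModel.from_pretrained("macbert-base-chinese")
    model.eval()

    encoded = tokenizer("欢迎使用PaddleNLP")  # dict of python lists
    input_ids = paddle.to_tensor([encoded["input_ids"]])
    token_type_ids = paddle.to_tensor([encoded["token_type_ids"]])

    # BertModel returns (sequence_output, pooled_output)
    sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)
    print(sequence_output.shape)  # [1, seq_len, 768] for the 12-layer base weight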

docs/source/paddlenlp.data.rst

Lines changed: 0 additions & 1 deletion
@@ -11,7 +11,6 @@ paddlenlp.data
    :maxdepth: 4
 
    paddlenlp.data.collate
-   paddlenlp.data.iterator
    paddlenlp.data.sampler
    paddlenlp.data.tokenizer
    paddlenlp.data.vocab

docs/source/paddlenlp.embeddings.rst

Lines changed: 0 additions & 1 deletion
@@ -10,5 +10,4 @@ paddlenlp.embeddings
 .. toctree::
    :maxdepth: 4
 
-   paddlenlp.embeddings.constant
    paddlenlp.embeddings.token_embedding

docs/source/paddlenlp.metrics.rst

Lines changed: 1 addition & 0 deletions
@@ -17,5 +17,6 @@ paddlenlp.metrics
    paddlenlp.metrics.glue
    paddlenlp.metrics.perplexity
    paddlenlp.metrics.rouge
+   paddlenlp.metrics.sighan
    paddlenlp.metrics.squad
    paddlenlp.metrics.utils

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+sighan
+===============================
+
+.. automodule:: paddlenlp.metrics.sighan
+   :members:
+   :no-undoc-members:
+   :show-inheritance:

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+encoder
+============================================================
+
+.. automodule:: paddlenlp.ops.faster_transformer.transformer.encoder
+   :members:
+   :no-undoc-members:
+   :show-inheritance:

docs/source/paddlenlp.ops.faster_transformer.transformer.rst

Lines changed: 1 addition & 0 deletions
@@ -12,4 +12,5 @@ transformer
 
    paddlenlp.ops.faster_transformer.transformer.decoder
    paddlenlp.ops.faster_transformer.transformer.decoding
+   paddlenlp.ops.faster_transformer.transformer.encoder
    paddlenlp.ops.faster_transformer.transformer.faster_transformer
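
The new ``encoder`` page documents the FasterTransformer encoder bindings in ``paddlenlp.ops``. A rough usage sketch, assuming the module exposes ``enable_faster_encoder``/``disable_faster_encoder`` helpers and that the custom FasterTransformer ops have been compiled; both are assumptions, not taken from this commit:

    # Hedged sketch: swap a model's Transformer encoder layers to the fused
    # FasterTransformer kernels for inference, then restore them. The helpers
    # enable_faster_encoder / disable_faster_encoder are assumed to come from
    # paddlenlp.ops.faster_transformer.transformer.encoder.
    import paddle
    from paddlenlp.transformers import ErnieModel, ErnieTokenizer
    from paddlenlp.ops import enable_faster_encoder, disable_faster_encoder

    model = ErnieModel.from_pretrained("ernie-1.0")
    tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0")
    model.eval()

    model = enable_faster_encoder(model)   # patch encoder layers in place
    encoded = tokenizer("高性能推理")
    input_ids = paddle.to_tensor([encoded["input_ids"]])
    token_type_ids = paddle.to_tensor([encoded["token_type_ids"]])
    with paddle.no_grad():
        sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)
    model = disable_faster_encoder(model)  # restore the original Python layers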

docs/source/paddlenlp.ops.optimizer.rst

Lines changed: 0 additions & 2 deletions
@@ -10,6 +10,4 @@ optimizer
 .. toctree::
    :maxdepth: 4
 
-   paddlenlp.ops.optimizer.AdamwOptimizer
-   paddlenlp.ops.optimizer.adamw
    paddlenlp.ops.optimizer.adamwdl

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+dependency\_parsing
+=============================================
+
+.. automodule:: paddlenlp.taskflow.dependency_parsing
+   :members:
+   :no-undoc-members:
+   :show-inheritance:

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+knowledge\_mining
+===========================================
+
+.. automodule:: paddlenlp.taskflow.knowledge_mining
+   :members:
+   :no-undoc-members:
+   :show-inheritance:
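
Both new pages document Taskflow task modules. A short sketch of how such tasks are typically invoked; the task names below are assumed to match the documented module names and are not taken from this commit:

    # Hedged sketch: the two Taskflow tasks whose API pages are added above.
    # "knowledge_mining" and "dependency_parsing" are assumed to be the
    # registered task names for these modules.
    from paddlenlp import Taskflow

    wordtag = Taskflow("knowledge_mining")
    print(wordtag("《孤女》是2010年九州出版社出版的小说"))

    ddp = Taskflow("dependency_parsing")
    print(ddp("百度是一家高科技公司"))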
