Commit 5af122b

27182812 and yingyibiao authored

Chinese bert (#1429)

* add docs
* add docs
* Update tokenizer.py try import pypinyin
* update pypinyin

Co-authored-by: yingyibiao <[email protected]>

1 parent beed3da · commit 5af122b

3 files changed: +295, -8 lines

paddlenlp/transformers/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -86,3 +86,5 @@
 from .reformer.tokenizer import *
 from .mobilebert.modeling import *
 from .mobilebert.tokenizer import *
+from .chinesebert.modeling import *
+from .chinesebert.tokenizer import *
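
With these two wildcard imports in place, everything exported by chinesebert.modeling and chinesebert.tokenizer (the __all__ list in the next file names ChineseBertModel and ChineseBertPretrainedModel) becomes importable directly from paddlenlp.transformers. A minimal usage sketch, assuming the usual PaddleNLP PretrainedModel.from_pretrained convention; the checkpoint name below is a placeholder, not taken from this commit:

from paddlenlp.transformers import ChineseBertModel

# "ChineseBERT-base" is a hypothetical checkpoint name used only for illustration.
model = ChineseBertModel.from_pretrained("ChineseBERT-base")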

paddlenlp/transformers/chinesebert/modeling.py

Lines changed: 1 addition & 3 deletions
@@ -42,8 +42,6 @@
 from paddlenlp.transformers import PretrainedModel, register_base_model
 from paddlenlp.transformers.bert.modeling import BertPooler, BertPretrainingHeads
 
-# from .fusion_embedding import FusionBertEmbeddings
-
 __all__ = [
     "ChineseBertModel",
     "ChineseBertPretrainedModel",
@@ -55,7 +53,6 @@
 ]
 
 
-# fusion_embedding.py
 class PinyinEmbedding(nn.Layer):
     def __init__(self,
                  pinyin_map_len: int,
@@ -65,6 +62,7 @@ def __init__(self,
         Pinyin Embedding Layer.
 
         Args:
+            pinyin_map_len (int): the size of pinyin map, which about 26 Romanian characters and 6 numbers.
             embedding_size (int): the size of each embedding vector.
             pinyin_out_dim (int): kernel number of conv.
 
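
The docstring added above names the three knobs of the pinyin channel: the size of the pinyin symbol map, the per-symbol embedding width, and the number of convolution kernels (pinyin_out_dim). As a reading aid, here is a minimal sketch of how such a layer is commonly built in the ChineseBERT design: embed each symbol of a character's fixed-length pinyin string, run a 1-D convolution over it, and max-pool down to one vector per character. This is an illustration only; the class name and the kernel size of 2 are assumptions, not code from this commit.

import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class PinyinEmbeddingSketch(nn.Layer):
    """Illustrative stand-in, not the committed PinyinEmbedding."""

    def __init__(self, pinyin_map_len, embedding_size, pinyin_out_dim):
        super().__init__()
        # One embedding row per pinyin symbol (Roman letters, tone digits, padding).
        self.embedding = nn.Embedding(pinyin_map_len, embedding_size)
        self.pinyin_out_dim = pinyin_out_dim
        # pinyin_out_dim is the number of conv kernels, matching the docstring above.
        self.conv = nn.Conv1D(in_channels=embedding_size,
                              out_channels=pinyin_out_dim,
                              kernel_size=2)

    def forward(self, pinyin_ids):
        # pinyin_ids: int tensor of shape [batch, seq_len, pinyin_locs]
        embed = self.embedding(pinyin_ids)                   # [B, S, L, E]
        bs, seq_len, pinyin_locs, embed_size = embed.shape
        view = embed.reshape([-1, pinyin_locs, embed_size])  # [B*S, L, E]
        view = view.transpose([0, 2, 1])                     # [B*S, E, L] (NCL layout for Conv1D)
        conv_out = self.conv(view)                           # [B*S, pinyin_out_dim, L-1]
        pooled = F.max_pool1d(conv_out, conv_out.shape[-1])  # [B*S, pinyin_out_dim, 1]
        return pooled.reshape([bs, seq_len, self.pinyin_out_dim])


# Quick shape check with made-up sizes: 2 sentences, 16 characters, 8 pinyin symbols each.
layer = PinyinEmbeddingSketch(pinyin_map_len=32, embedding_size=128, pinyin_out_dim=768)
out = layer(paddle.randint(0, 32, shape=[2, 16, 8]))  # -> [2, 16, 768]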
