Commit ec3c633

[Pretrained weight] Change ernie-1.0 to ernie-1.0-base-zh (#2224)
* ernie-1.0 -> ernie-1.0-base-zh
* Change the model name in some READMEs.
1 parent 953d178 commit ec3c633
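
The rename in this commit is a repeated string substitution across scripts and docs. As a rough illustration only, the sketch below shows one way such a substitution could be made idempotent, so that text already using `ernie-1.0-base-zh` is left untouched. The names `LEGACY_NAME`, `NEW_NAME` and `rename_model_refs` are hypothetical and are not part of PaddleNLP or of this commit.

```python
import re

# Hypothetical constants for illustration; only the two model names come from the commit.
LEGACY_NAME = "ernie-1.0"
NEW_NAME = "ernie-1.0-base-zh"

# Match the legacy name only when it stands alone: not preceded or followed by
# a word character or hyphen, so "ernie-1.0-base-zh" is never matched again.
_LEGACY_PATTERN = re.compile(r"(?<![\w-])" + re.escape(LEGACY_NAME) + r"(?![\w-])")


def rename_model_refs(text: str) -> str:
    """Replace standalone occurrences of the legacy name with the new name."""
    return _LEGACY_PATTERN.sub(NEW_NAME, text)


if __name__ == "__main__":
    line = 'tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0")'
    print(rename_model_refs(line))
```

Running the function twice on the same text gives the same result as running it once, which makes it safe to re-apply to files that were already partly updated.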

22 files changed: +64 -41 lines changed
applications/neural_search/recall/domain_adaptive_pretraining/README.md

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ python -u -m paddle.distributed.launch \
     --log_dir "output/$task_name/log" \
     run_pretrain_static.py \
     --model_type "ernie" \
-    --model_name_or_path "ernie-1.0" \
+    --model_name_or_path "ernie-1.0-base-zh" \
     --input_dir "./data" \
     --output_dir "output/$task_name" \
     --max_seq_len 512 \

applications/neural_search/recall/domain_adaptive_pretraining/scripts/run_pretrain_static.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ PYTHONPATH=../../../ python -u -m paddle.distributed.launch \
     --log_dir "output/$task_name/log" \
     run_pretrain_static.py \
     --model_type "ernie" \
-    --model_name_or_path "ernie-1.0" \
+    --model_name_or_path "ernie-1.0-base-zh" \
     --input_dir "./data" \
     --output_dir "output/$task_name" \
     --max_seq_len 512 \

applications/neural_search/recall/simcse/scripts/train.sh

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ python -u -m paddle.distributed.launch --gpus '0,1,2,3' \
     --output_emb_size 256 \
     --train_set_file "./recall/train_unsupervised.csv" \
     --test_set_file "./recall/dev.csv"
-    --model_name_or_path "ernie-1.0"
+    --model_name_or_path "ernie-1.0-base-zh"
 
 # simcse cpu
 # python train.py \
@@ -31,7 +31,7 @@ python -u -m paddle.distributed.launch --gpus '0,1,2,3' \
 # --output_emb_size 256 \
 # --train_set_file "./recall/train_unsupervised.csv" \
 # --test_set_file "./recall/dev.csv"
-# --model_name_or_path "ernie-1.0"
+# --model_name_or_path "ernie-1.0-base-zh"
 
 # post training + simcse
 # python -u -m paddle.distributed.launch --gpus '0,1,2,3' \

docs/get_started/quick_start.rst

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
 
 .. code-block::
 
-    >>> MODEL_NAME = "ernie-1.0"
+    >>> MODEL_NAME = "ernie-1.0-base-zh"
     >>> ernie_model = paddlenlp.transformers.ErnieModel.from_pretrained(MODEL_NAME)
 
 To load the pretrained ERNIE model as a fine-tuning network for text classification, just specify the model name and the number of classes; that completes the network definition.

docs/model_zoo/transformers/ERNIE/contents.rst

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ ERNIE Model Summary
 +----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
 | Pretrained Weight | Language | Details of the model |
 +==================================================================================+==============+==================================================================================+
-|``ernie-1.0`` | Chinese | 12-layer, 768-hidden, |
+|``ernie-1.0-base-zh`` | Chinese | 12-layer, 768-hidden, |
 | | | 12-heads, 108M parameters. |
 | | | Trained on Chinese text. |
 +----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+

docs/model_zoo/transformers/all/transformers.rst

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@ Summary of Transformer Pretrained Models
 | | | | 1-heads, 3M parameters. |
 | | | | Trained on Chinese legal corpus. |
 +--------------------+----------------------------------------------------------------------------------+--------------+-----------------------------------------+
-|ERNIE_ |``ernie-1.0`` | Chinese | 12-layer, 768-hidden, |
+|ERNIE_ |``ernie-1.0-base-zh`` | Chinese | 12-layer, 768-hidden, |
 | | | | 12-heads, 108M parameters. |
 | | | | Trained on Chinese text. |
 | +----------------------------------------------------------------------------------+--------------+-----------------------------------------+

docs/trainer.md

Lines changed: 2 additions & 2 deletions
@@ -52,8 +52,8 @@ parser = PdArgumentParser(TrainingArguments, DataArguments)
 - If the model does not consume `labels` here, we also need to define a `criterion` separately to compute the final loss.
 ```python
 train_dataset = load_dataset("chnsenticorp", splits=["train"])
-model = AutoModelForSequenceClassification.from_pretrained("ernie-1.0", num_classes=len(train_dataset.label_list))
-tokenizer = AutoTokenizer.from_pretrained("ernie-1.0")
+model = AutoModelForSequenceClassification.from_pretrained("ernie-1.0-base-zh", num_classes=len(train_dataset.label_list))
+tokenizer = AutoTokenizer.from_pretrained("ernie-1.0-base-zh")
 
 def convert_example(example, tokenizer):
     encoded_inputs = tokenizer(text=example["text"], max_seq_len=128, pad_to_max_seq_len=True)

examples/few_shot/pet/README.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ python -u -m paddle.distributed.launch --gpus "0" \
     --learning_rate 1E-4 \
     --epochs 10 \
     --max_seq_length 512 \
-    --language_model "ernie-1.0" \
+    --language_model "ernie-1.0-base-zh" \
     --rdrop_coef 0 \
 ```
 Parameter descriptions

examples/information_extraction/DuEE/README.md

Lines changed: 3 additions & 3 deletions
@@ -91,7 +91,7 @@ test_ds = DuEventExtraction(args.test_data, args.vocab_path, args.tag_path)
 ```python
 from paddlenlp.transformers import ErnieForTokenClassification
 
-model = ErnieForTokenClassification.from_pretrained("ernie-1.0", num_classes=len(label_map))
+model = ErnieForTokenClassification.from_pretrained("ernie-1.0-base-zh", num_classes=len(label_map))
 ```
 
 Likewise, the enumeration-classification data uses an ERNIE-based text classification model, where the enumerated role type is "环节" (stage). The model schematic is shown below:
@@ -106,7 +106,7 @@ model = ErnieForTokenClassification.from_pretrained("ernie-1.0", num_classes=len
 **Similarly, PaddleNLP provides the commonly used ERNIE-based text classification model, which can be loaded in one step by specifying the model name**
 
 ```python
-model = ErnieForSequenceClassification.from_pretrained("ernie-1.0", num_classes=len(label_map))
+model = ErnieForSequenceClassification.from_pretrained("ernie-1.0-base-zh", num_classes=len(label_map))
 ```
 
 ### Quickly reproducing the baseline, Step 3: data processing
@@ -117,7 +117,7 @@ model = ErnieForSequenceClassification.from_pretrained("ernie-1.0", num_classes=
 ```python
 from paddlenlp.transformers import ErnieTokenizer
 
-tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0")
+tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0-base-zh")
 ```
 
 To process text data, call the tokenizer directly; it outputs the input data the model requires.

examples/information_extraction/DuIE/README.md

Lines changed: 2 additions & 2 deletions
@@ -51,8 +51,8 @@ F1 = (2 * P * R) / (P + R), where
 ```python
 from paddlenlp.transformers import ErnieForTokenClassification, ErnieTokenizer
 
-model = ErnieForTokenClassification.from_pretrained("ernie-1.0", num_classes=(len(label_map) - 2) * 2 + 2)
-tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0")
+model = ErnieForTokenClassification.from_pretrained("ernie-1.0-base-zh", num_classes=(len(label_map) - 2) * 2 + 2)
+tokenizer = ErnieTokenizer.from_pretrained("ernie-1.0-base-zh")
 ```
 
 To process text data, call the tokenizer directly; it outputs the input data the model requires.
