Commit 4e59ce0

Fix docs and constraints for FasterGeneration (#1471)
* update perf
* fix doc and constraints for FasterGeneration
* update readme
1 parent f580332 commit 4e59ce0

8 files changed, with 47 additions and 9 deletions

README.md

Lines changed: 2 additions & 3 deletions
@@ -12,9 +12,8 @@
 ![GitHub](https://img.shields.io/github/license/paddlepaddle/paddlenlp)
 
 ## News <img src="./docs/imgs/news_icon.png" width="40"/>
-* [2021-10-12] PaddleNLP 2.1 has been released! It adds out-of-the-box NLP task capabilities, Prompt Tuning application examples, and high-performance inference for generation tasks! :tada: For detailed upgrade information, see the [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.1.0)
-* [2021-09-16] [Qianyan: Question Matching Robustness Evaluation](https://www.datafountain.cn/competitions/516) is now officially open 🔥🔥🔥, and everyone is welcome to sign up! [Official baseline](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_matching/question_matching)
-* [2021-08-22] [Qianyan: Fact-Consistency-Oriented Generation Evaluation Competition](https://aistudio.baidu.com/aistudio/competition/detail/105) is now officially open 🔥🔥🔥, and everyone is welcome to sign up! [Official baseline](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_generation/unimo-text)
+* [2021-12-12] PaddleNLP 2.2 has been released! It adds FasterERNIE for accelerated pre-trained models with unified training and inference development, and officially launches FasterGeneration, a high-performance acceleration component for generation tasks! :tada: For detailed upgrade information, see the [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.1.0)
+* [2021-10-12] PaddleNLP 2.1 has been released! It adds out-of-the-box NLP task capabilities, Prompt Tuning application examples, and high-performance inference for generation tasks! :tada: For detailed upgrade information, see the [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.2.0)
 
 
 ## Introduction

README_en.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ English | [简体中文](./README.md)
 
 ## News <img src="./docs/imgs/news_icon.png" width="40"/>
 
+* [2021-12-12] PaddleNLP 2.2 has been officially released! :tada: For more information please refer to [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.2.0).
 * [2021-10-12] PaddleNLP 2.1 has been officially released! :tada: For more information please refer to [Release Note](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.1.0).
 
 ## Introduction

paddlenlp/transformers/bart/modeling.py

Lines changed: 10 additions & 0 deletions
@@ -767,6 +767,16 @@ def prepare_faster_entry(self, kwargs):
             raise AttributeError(
                 "Only topk sampling or topp sampling are supported. " \
                 "Topk sampling and topp sampling cannot be both applied in the faster version.")
+        if kwargs['repetition_penalty'] != 1.0:
+            # not support for repetition_penalty yet in the faster version
+            raise AttributeError(
+                "'repetition_penalty != 1' is not supported yet in the faster version"
+            )
+        if kwargs['forced_bos_token_id'] is not None:
+            # not support for forced_bos_token_id yet in the faster version
+            raise AttributeError(
+                "'forced_bos_token_id != None' is not supported yet in the faster version"
+            )
         self._faster_entry = FasterBART(
             self, use_fp16_decoding=use_fp16_decoding).forward
         return self._faster_entry

paddlenlp/transformers/generation_utils.py

Lines changed: 4 additions & 6 deletions
@@ -501,11 +501,6 @@ def _build_faster(self, kwargs):
             # not support for min_length yet in the faster version
             raise AttributeError(
                 "'min_length != 0' is not supported yet in the faster version")
-        if kwargs['repetition_penalty'] != 1.0:
-            # not support for repetition_penalty yet in the faster version
-            raise AttributeError(
-                "'repetition_penalty != 1' is not supported yet in the faster version"
-            )
         if kwargs['num_beam_groups'] != 1:
             # not support for group_beam_search yet in the faster version
             raise AttributeError(
@@ -537,6 +532,7 @@ def generate(self,
                  diversity_rate=0.0,
                  use_cache=True,
                  use_faster=False,
+                 use_fp16_decoding=False,
                  **model_kwargs):
         r"""
         The interface for generation task. This method can generate sequences
@@ -605,7 +601,9 @@ def generate(self,
             use_cache: (bool, optional): Whether to use the model cache to
                 speed up decoding. Default to True.
             use_faster: (bool, optional): Whether to use faster entry of model
-                for generation. Default to False.
+                for FasterGeneration. Default to False.
+            use_fp16_decoding: (bool, optional): Whether to use fp16 for decoding.
+                Only works when faster entry is available. Default to False.
             model_kwargs (dict): It can be used to specify additional kwargs
                 passed to the model.
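The interaction between `use_faster` and `use_fp16_decoding` documented in this hunk can be pictured with a minimal sketch. It assumes, without confirmation from this diff, that `generate` tries the faster entry first and falls back to the regular decoding path when `prepare_faster_entry` raises. `ToyModel`, `_regular_generate`, and the returned strings are hypothetical stand-ins, not PaddleNLP's API.

```python
class ToyModel:
    """Hypothetical stand-in for a generation model (not PaddleNLP's API)."""

    def prepare_faster_entry(self, kwargs):
        # Same style as the checks in this commit: an unsupported option
        # raises AttributeError before any faster entry is built.
        if kwargs["repetition_penalty"] != 1.0:
            raise AttributeError(
                "'repetition_penalty != 1' is not supported yet in the faster version")
        use_fp16_decoding = kwargs["use_fp16_decoding"]

        def faster_entry():
            # The fp16 flag only matters once a faster entry exists.
            return "faster(fp16)" if use_fp16_decoding else "faster(fp32)"

        return faster_entry

    def _regular_generate(self):
        return "regular"

    def generate(self, repetition_penalty=1.0, use_faster=False,
                 use_fp16_decoding=False):
        if use_faster:
            kwargs = {
                "repetition_penalty": repetition_penalty,
                "use_fp16_decoding": use_fp16_decoding,
            }
            try:
                return self.prepare_faster_entry(kwargs)()
            except AttributeError:
                # Assumed behavior: fall back to the regular path.
                pass
        return self._regular_generate()
```

In this sketch `use_fp16_decoding=True` has no effect unless `use_faster=True` and every constraint check passes, which matches the docstring's "Only works when faster entry is available".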

paddlenlp/transformers/gpt/modeling.py

Lines changed: 5 additions & 0 deletions
@@ -1134,6 +1134,11 @@ def prepare_faster_entry(self, kwargs):
             raise AttributeError(
                 "'size_per_head = %d' is not supported yet in the faster version of GPT"
                 % size_per_head)
+        if kwargs['forced_bos_token_id'] is not None:
+            # not support for forced_bos_token_id yet in the faster version
+            raise AttributeError(
+                "'forced_bos_token_id != None' is not supported yet in the faster version"
+            )
         self._faster_entry = FasterGPT(
             self, use_fp16_decoding=use_fp16_decoding).forward
         return self._faster_entry

paddlenlp/transformers/mbart/modeling.py

Lines changed: 5 additions & 0 deletions
@@ -846,6 +846,11 @@ def prepare_faster_entry(self, kwargs):
             raise AttributeError(
                 "Only topk sampling or topp sampling are supported. " \
                 "Topk sampling and topp sampling cannot be both applied in the faster version.")
+        if kwargs['repetition_penalty'] != 1.0:
+            # not support for repetition_penalty yet in the faster version
+            raise AttributeError(
+                "'repetition_penalty != 1' is not supported yet in the faster version"
+            )
         self._faster_entry = FasterMBART(
             self, use_fp16_decoding=use_fp16_decoding).forward
         return self._faster_entry

paddlenlp/transformers/unified_transformer/modeling.py

Lines changed: 10 additions & 0 deletions
@@ -485,6 +485,16 @@ def prepare_faster_entry(self, kwargs):
             raise AttributeError(
                 "Only topk sampling or topp sampling are supported. " \
                 "Topk sampling and topp sampling cannot be both applied in the faster version.")
+        if kwargs['repetition_penalty'] != 1.0:
+            # not support for repetition_penalty yet in the faster version
+            raise AttributeError(
+                "'repetition_penalty != 1' is not supported yet in the faster version"
+            )
+        if kwargs['forced_bos_token_id'] is not None:
+            # not support for forced_bos_token_id yet in the faster version
+            raise AttributeError(
+                "'forced_bos_token_id != None' is not supported yet in the faster version"
+            )
         self._faster_entry = FasterUnifiedTransformer(
             self, use_fp16_decoding=use_fp16_decoding).forward
         return self._faster_entry

paddlenlp/transformers/unimo/modeling.py

Lines changed: 10 additions & 0 deletions
@@ -482,6 +482,16 @@ def prepare_faster_entry(self, kwargs):
             raise AttributeError(
                 "Only topk sampling or topp sampling are supported. " \
                 "Topk sampling and topp sampling cannot be both applied in the faster version.")
+        if kwargs['repetition_penalty'] != 1.0:
+            # not support for repetition_penalty yet in the faster version
+            raise AttributeError(
+                "'repetition_penalty != 1' is not supported yet in the faster version"
+            )
+        if kwargs['forced_bos_token_id'] is not None:
+            # not support for forced_bos_token_id yet in the faster version
+            raise AttributeError(
+                "'forced_bos_token_id != None' is not supported yet in the faster version"
+            )
         self._faster_entry = FasterUNIMOText(
             self, use_fp16_decoding=use_fp16_decoding).forward
         return self._faster_entry
