|[Distill-LSTM](./examples/model_compression/distill_lstm/)| Implementation of the strategy from [Distilling Task-Specific Knowledge from BERT into Simple Neural Networks](https://arxiv.org/abs/1903.12136): knowledge from BERT downstream classification models (Chinese and English) is transferred via distillation into a small LSTM, achieving better results than training the LSTM alone.|
|[TinyBERT](./examples/model_compression/tinybert)| Implementation of [TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351), providing scripts for both general distillation and downstream-task distillation. This example initializes from the open-source model `tinybert-6l-768d-v2` and distills on 7 GLUE datasets. The final model has half the parameters and 2x faster inference with almost no loss in accuracy, reaching 98.90% of the teacher model `bert-base-uncased`'s accuracy. |
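
Both examples share the same core idea: the student is trained not only on ground-truth labels but also on the teacher's output distribution. The sketch below is a minimal, illustrative version of such a combined objective, assuming a classification setting; the function name `distillation_loss` and the `temperature`/`alpha` hyperparameters are hypothetical and are not the exact losses used in these examples (the Distill-LSTM paper, for instance, uses an MSE term over logits, and TinyBERT additionally matches intermediate layers).

```python
import paddle
import paddle.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Hedged sketch of a generic knowledge-distillation objective:
    hard-label cross-entropy plus a soft-target term that matches the
    student's temperature-scaled distribution to the teacher's."""
    # Hard-label loss against ground-truth class indices.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-target loss: KL divergence between temperature-scaled
    # teacher and student distributions, scaled by T^2 as is conventional.
    t = temperature
    soft_student = F.log_softmax(student_logits / t, axis=-1)
    soft_teacher = F.softmax(teacher_logits / t, axis=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="mean") * (t * t)

    return alpha * ce + (1.0 - alpha) * kd

# Toy usage with random tensors (batch of 8, binary classification).
student_logits = paddle.randn([8, 2])
teacher_logits = paddle.randn([8, 2])
labels = paddle.randint(0, 2, [8])
loss = distillation_loss(student_logits, teacher_logits, labels)
```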