@@ -47,13 +47,41 @@ The PP-OCRv3 system pipeline is as follows:
<div align="center">
<img src="../ppocrv3_framework.png" width="800">
</div>
-
For more details, see the [PP-OCRv3 technical report](https://arxiv.org/abs/2206.03001v2) 👉 [concise Chinese version](./PP-OCRv3_introduction.md)

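+ PP-OCRv4 is a further upgrade of PP-OCRv3. The overall framework keeps the same pipeline as PP-OCRv3, while the detection and recognition models are optimized in several areas, including data, network structure, and training strategy. The PP-OCRv4 system diagram is shown below:
+
+ <div align="center">
+ <img src="../ppocrv4_framework.png" width="800">
+ </div>
+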
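+ In terms of algorithmic improvements, the detection and recognition models were improved in 10 aspects in total:
+
+ - Detection module:
+   - LCNetV3: a higher-accuracy backbone network
+   - PFHead: a parallel head-branch fusion structure
+   - DSR: dynamically increases the shrink ratio during training
+   - CML: adds a KL divergence loss between the Student and Teacher network outputs
+ - Recognition module:
+   - SVTR_LCNetV3: a higher-accuracy backbone network
+   - Lite-Neck: a streamlined Neck structure
+   - GTC-NRTR: a stable Attention guidance branch
+   - Multi-Scale: a multi-scale training strategy
+   - DF: a data mining scheme
+   - DKD: the DKD distillation strategy
+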
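+ In terms of results, at comparable speed, accuracy improves substantially across multiple scenarios:
+
+ - Chinese scenarios: an improvement of more than 4% over the PP-OCRv3 Chinese model;
+ - English and digit scenarios: an improvement of 6% over the PP-OCRv3 English model;
+ - Multilingual scenarios: recognition is optimized for 80 languages, with average accuracy improved by more than 8%.
+ - For more details, see 👉 [concise Chinese version](./PP-OCRv4_introduction.md)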
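
<a name="2"></a>
+
## 2. Features

+ - Ultra-lightweight PP-OCRv4 series: detection (4.7M) + direction classifier (1.4M) + recognition (10M) = 16.1M
- Ultra-lightweight PP-OCRv3 series: detection (3.6M) + direction classifier (1.4M) + recognition (12M) = 17.0M
- Ultra-lightweight PP-OCRv2 series: detection (3.1M) + direction classifier (1.4M) + recognition (8.5M) = 13.0M
- Ultra-lightweight PP-OCR mobile series: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M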
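
Each total in the list above is the sum of the three component sizes. As a quick sanity check, the arithmetic can be reproduced with a minimal sketch like the one below. The dictionary layout and the `total_size` helper are illustrative names, not part of PaddleOCR; the numbers are copied from the list.

```python
# Component sizes in M (detection, direction classifier, recognition),
# copied from the feature list. Names here are illustrative only.
SERIES = {
    "PP-OCRv4": (4.7, 1.4, 10.0),
    "PP-OCRv3": (3.6, 1.4, 12.0),
    "PP-OCRv2": (3.1, 1.4, 8.5),
    "PP-OCR mobile": (3.0, 1.4, 5.0),
}


def total_size(name):
    """Return the combined size of one series, rounded to one decimal."""
    det, cls, rec = SERIES[name]
    return round(det + cls + rec, 1)


for name in SERIES:
    print(f"{name}: {total_size(name):.1f}M")
```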
@@ -118,10 +146,11 @@ The PP-OCR Chinese and English models are listed below:

| Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
| --- | --- | --- | --- | --- | --- |
+ | Chinese and English ultra-lightweight PP-OCRv4 model (15.8M) | ch_PP-OCRv4_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_train.tar) |
| Chinese and English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
- | English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
+ | English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
| Chinese and English ultra-lightweight PP-OCRv2 model (13.0M) | ch_PP-OCRv2_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [training model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
| Chinese and English ultra-lightweight PP-OCR mobile model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
- | Chinese and English general-purpose PP-OCR server model (143.4M) | ch_ppocr_server_v2.0_xx | Server | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
+ | Chinese and English general-purpose PP-OCR server model (143.4M) | ch_ppocr_server_v2.0_xx | Server | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |

For more model downloads (including English/digit models, multilingual models, Paddle-Lite models, etc.), see [PP-OCR series model downloads](./models_list.md).
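
The inference models in the table are distributed as `.tar` archives. A minimal sketch of fetching and unpacking them with the Python standard library might look like the following. The URLs are copied from the PP-OCRv4 row; the helper names `archive_name` and `fetch_model` and the `inference` target directory are assumptions made for illustration, not part of PaddleOCR.

```python
import tarfile
import urllib.request
from pathlib import Path

# Inference-model URLs copied from the PP-OCRv4 row of the model table.
PP_OCRV4_INFER = {
    "det": "https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar",
    "cls": "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
    "rec": "https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar",
}


def archive_name(url):
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def fetch_model(url, dest="inference"):
    """Download one archive into `dest` (hypothetical directory) and unpack it there."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / archive_name(url)
    if not archive.exists():  # skip the download if the archive is already present
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest_dir)
    return dest_dir


# Example (requires network access):
# for url in PP_OCRV4_INFER.values():
#     fetch_model(url)
```

The download is skipped when the archive already exists, so re-running the helper only repeats the extraction step.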