
Commit 9dd4315

docs: add v5 convert
1 parent ee7dc4b commit 9dd4315

3 files changed: +79 -98 lines changed

docs/blog/posts/about_model/adapt_PP-OCRv5_mobile_det.md

Lines changed: 79 additions & 98 deletions
@@ -39,10 +39,6 @@ pip install "paddlex[ocr]==3.0.0rc1"
 
 Test whether the PP-OCRv5_mobile_det model can recognize text correctly:
 
-Test image:
-
-![alt text](../images/1.jpg)
-
 !!! tip
 
     When running the following code, the model is automatically downloaded to **/Users/username/.paddlex/official_models**.
@@ -79,59 +75,114 @@ for res in result:
 
 PaddleX officially integrates the conversion code of paddle2onnx:
 
 ```bash linenums="1"
-paddlex --paddle2onnx --paddle_model_dir models/PP-OCRv5_mobile_det --onnx_model_dir models/PP-OCRv5_mobile_det
+paddle2onnx --model_dir models/official_models/PP-OCRv5_mobile_det --model_filename inference.json --params_filename inference.pdiparams --save_file models/PP-OCRv5_mobile_det/inference.onnx
 ```
 
-The output log is as follows, indicating the conversion succeeded
+The output log is as follows. There are error messages in the log, but the final ONNX model is still generated
 
-```bash linenums="1"
-Input dir: models/PP-OCRv5_mobile_det
-Output dir: models/PP-OCRv5_mobile_det
-Paddle2ONNX conversion starting...
+```bash linenums="1" hl_lines="11 16"
+/Users/xxxx/miniconda3/envs/py310/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
 warnings.warn(warning_message)
 [Paddle2ONNX] Start parsing the Paddle model file...
-[Paddle2ONNX] Use opset_version = 7 for ONNX export.
+[Paddle2ONNX] Use opset_version = 14 for ONNX export.
 [Paddle2ONNX] PaddlePaddle model is exported as ONNX format now.
-2025-05-14 08:21:23 [INFO] Try to perform optimization on the ONNX model with onnxoptimizer.
-2025-05-14 08:21:23 [INFO] ONNX model saved in models/PP-OCRv5_mobile_det/inference.onnx.
-Paddle2ONNX conversion succeeded
-Done
+2025-05-26 11:20:46 [INFO] Try to perform constant folding on the ONNX model with Polygraphy.
+[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
+[I] Folding Constants | Pass 1
+[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
+[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
+[ONNXRuntimeError] : 1 : FAIL : /Users/runner/work/1/s/onnxruntime/core/graph/model.cc:182 onnxruntime::Model::Model(ModelProto &&, const PathString &, const IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const ModelOptions &) Unsupported model IR version: 11, max supported IR version: 10
+[I] Total Nodes | Original: 925, After Folding: 612 | 313 Nodes Folded
+[I] Folding Constants | Pass 2
+[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
+[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
+[ONNXRuntimeError] : 1 : FAIL : /Users/runner/work/1/s/onnxruntime/core/graph/model.cc:182 onnxruntime::Model::Model(ModelProto &&, const PathString &, const IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const ModelOptions &) Unsupported model IR version: 11, max supported IR version: 10
+[I] Total Nodes | Original: 612, After Folding: 612 | 0 Nodes Folded
+2025-05-26 11:20:52 [INFO] ONNX model saved in models/PP-OCRv5_mobile_det/inference.onnx.
 ```
 
-### 3. Model inference verification
+At this point, running the resulting model directly with `rapidocr` raises an error:
+
+```python linenums="1"
+from rapidocr import RapidOCR
 
-This part mainly tests whether the ONNX model can be used directly in the RapidOCR project. The key point is to confirm whether the model's pre- and post-processing are compatible. From the PaddleX [official documentation](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/text_recognition.html#_2) we can see:
+model_path = "models/PP-OCRv5_mobile_det/inference.onnx"
+engine = RapidOCR(params={"Det.model_path": model_path})
 
-> PP-OCRv5_mobile_det is trained on top of PP-OCRv4_server_rec on a mixture of more Chinese document data and PP-OCR training data. It adds recognition of some traditional Chinese characters, Japanese, and special characters, supporting 15,000+ recognizable characters. Besides improving document-related text recognition, it also improves general text recognition
+img_url = "https://img1.baidu.com/it/u=3619974146,1266987475&fm=253&fmt=auto&app=138&f=JPEG?w=500&h=516"
+result = engine(img_url)
+print(result)
 
-The above shows that this model shares its structure with PP-OCRv4_server_rec, and its pre- and post-processing are identical. The only changes are more training data and a dictionary expanded from 6623 to 15630 characters. Therefore, RapidOCR can be used directly for quick inference verification. The code is as follows:
+result.vis("vis_result.jpg")
+```
+
+The error message is as follows:
+
+```bash linenums="1" hl_lines="15"
+[INFO] 2025-05-26 11:21:27,698 [RapidOCR] base.py:41: Using engine_name: onnxruntime
+Traceback (most recent call last):
+  File "/Users/xxxx/projects/RapidOCR/python/demo.py", line 9, in <module>
+    engine = RapidOCR(params={"Det.model_path": model_path})
+  File "/Users/xxxx/projects/RapidOCR/python/rapidocr/main.py", line 60, in __init__
+    self.text_det = TextDetector(config.Det)
+  File "/Users/xxxx/projects/RapidOCR/python/rapidocr/ch_ppocr_det/main.py", line 45, in __init__
+    self.session = get_engine(config.engine_name)(config)
+  File "/Users/xxxx/projects/RapidOCR/python/rapidocr/inference_engine/onnxruntime.py", line 60, in __init__
+    self.session = InferenceSession(
+  File "/Users/xxxx/miniconda3/envs/py310/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in __init__
+    self._create_inference_session(providers, provider_options, disabled_optimizers)
+  File "/Users/xxxx/miniconda3/envs/py310/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 550, in _create_inference_session
+    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
+onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /Users/xxxx/projects/LittleCode/models/PP-OCRv5_mobile_det/inference.onnx failed:/Users/runner/work/1/s/onnxruntime/core/graph/model.cc:182 onnxruntime::Model::Model(ModelProto &&, const PathString &, const IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const ModelOptions &) Unsupported model IR version: 11, max supported IR version: 10
+```
+
+After a good deal of searching, a solution finally turned up in onnxruntime issue [#23602](https://github.com/microsoft/onnxruntime/issues/23602#issuecomment-2642348849). Running the code below to re-specify the **IR_VERSION** of the model obtained in the previous step makes it loadable and runnable with `rapidocr`.
+
+```python linenums="1"
+import onnx
+from onnx import version_converter
+
+OPT_VERSION = 14
+IR_VERSION = 10
+
+source_path = "models/PP-OCRv5_mobile_det/inference.onnx"
+dist_path = "models/PP-OCRv5_mobile_det/inference_v2.onnx"
+
+model = onnx.load(source_path)
+model.ir_version = IR_VERSION
+model = version_converter.convert_version(model, OPT_VERSION)
+onnx.save(model, dist_path)
+```
+
+### 3. Model inference verification
+
+This part mainly tests whether the ONNX model can be used directly in the RapidOCR project. The key point is to confirm whether the model's pre- and post-processing are compatible. Comparing the differences between the PaddleOCR config files of [PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR/blob/549d83a88b7c75144120e6ec03de80d3eb9e48a5/configs/det/PP-OCRv4/PP-OCRv4_mobile_det.yml) and [PP-OCRv5 mobile det](https://github.com/PaddlePaddle/PaddleOCR/blob/549d83a88b7c75144120e6ec03de80d3eb9e48a5/configs/det/PP-OCRv5/PP-OCRv5_mobile_det.yml):
+
+![alt text](../images/v4_v5_mobile_det.png)
 
 ```python linenums="1"
 from rapidocr import RapidOCR
 
 model_path = "models/PP-OCRv5_mobile_det/inference.onnx"
-key_path = "models/ppocrv4_doc_dict.txt"
-engine = RapidOCR(params={"Rec.model_path": model_path, "Rec.rec_keys_path": key_path})
+engine = RapidOCR(params={"Det.model_path": model_path})
 
 img_url = "https://img1.baidu.com/it/u=3619974146,1266987475&fm=253&fmt=auto&app=138&f=JPEG?w=500&h=516"
-result = engine(img_path)
+result = engine(img_url)
 print(result)
 
 result.vis("vis_result.jpg")
 ```
 
-![alt text](../images/vis_result.jpg)
+![alt text](../images/v5_mobile_det_vis_result.jpg)
 
 ### 4. Model accuracy testing
 
-This part mainly uses [TextRecMetric](https://github.com/SWHL/TextRecMetric) and the test set [text_rec_test_dataset](https://huggingface.co/datasets/SWHL/text_rec_test_dataset) for evaluation.
-
-Note that **the PP-OCRv5_mobile_det model puts more emphasis on rare characters and some symbols.** The current test set does not specifically collect such data, so the metrics below are somewhat low. If you intend to use the model, please test it in your own scenario.
+This part mainly uses [TextDetMetric](https://github.com/SWHL/TextDetMetric) and the test set [text_det_test_dataset](https://huggingface.co/datasets/SWHL/text_det_test_dataset) for evaluation.
 
-For the testing steps, see the README of [TextRecMetric](https://github.com/SWHL/TextRecMetric) and follow it step by step. My final accuracy results are as follows:
+For the testing steps, see the README of [TextDetMetric](https://github.com/SWHL/TextDetMetric) and follow it step by step. My final accuracy results are as follows:
 
 ```json
-{'ExactMatch': 0.8097, 'CharMatch': 0.9444, 'avg_elapse': 0.0818}
+{'precision': 0.7861, 'recall': 0.8266, 'hmean': 0.8058, 'avg_elapse': 0.1499}
 ```
 
 The result has been updated in the [open-source OCR model comparison](./model_summary.md).
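As a side note (not in the original post): `hmean` is the harmonic mean (F1) of precision and recall, so the reported figures can be cross-checked in a couple of lines:

```python
# Cross-check hmean against the precision/recall reported above
precision, recall = 0.7861, 0.8266
hmean = 2 * precision * recall / (precision + recall)
print(round(hmean, 4))  # → 0.8058
```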
@@ -140,76 +191,6 @@ result.vis("vis_result.jpg")
 
 This part mainly covers writing the dictionary file into the ONNX model, hosting the model on ModelScope, and adapting the rapidocr code.
 
-#### Writing the dictionary file into the ONNX model
-
-This step only applies to text recognition models; text detection models do not have it.
-
-??? info "Detailed code"
-
-    ```python linenums="1"
-    from pathlib import Path
-    from typing import List, Union
-
-    import onnx
-    import onnxruntime as ort
-    from onnx import ModelProto
-
-
-    def read_txt(txt_path: Union[Path, str]) -> List[str]:
-        with open(txt_path, "r", encoding="utf-8") as f:
-            data = [v.rstrip("\n") for v in f]
-        return data
-
-
-    class ONNXMetaOp:
-        @classmethod
-        def add_meta(
-            cls,
-            model_path: Union[str, Path],
-            key: str,
-            value: List[str],
-            delimiter: str = "\n",
-        ) -> ModelProto:
-            model = onnx.load_model(model_path)
-            meta = model.metadata_props.add()
-            meta.key = key
-            meta.value = delimiter.join(value)
-            return model
-
-        @classmethod
-        def get_meta(
-            cls, model_path: Union[str, Path], key: str, split_sym: str = "\n"
-        ) -> List[str]:
-            sess = ort.InferenceSession(model_path)
-            meta_map = sess.get_modelmeta().custom_metadata_map
-            key_content = meta_map.get(key)
-            key_list = key_content.split(split_sym)
-            return key_list
-
-        @classmethod
-        def del_meta(cls, model_path: Union[str, Path]) -> ModelProto:
-            model = onnx.load_model(model_path)
-            del model.metadata_props[:]
-            return model
-
-        @classmethod
-        def save_model(cls, save_path: Union[str, Path], model: ModelProto):
-            onnx.save_model(model, save_path)
-
-
-    dicts = read_txt("models/ppocrv4_doc_dict.txt")
-    model_path = "models/PP-OCRv5_mobile_det.onnx"
-    model = ONNXMetaOp.add_meta(model_path, key="character", value=dicts)
-
-    new_model_path = "models/PP-OCRv5_mobile_det_with_dict.onnx"
-    ONNXMetaOp.save_model(new_model_path, model)
-
-    t = ONNXMetaOp.get_meta(new_model_path, key="character")
-    print(t)
-    ```
-
 
 #### Hosting the model on ModelScope
 
 This part involves uploading the model to the proper location and naming it sensibly. Note that after the upload you need to create a Tag; otherwise the rapidocr whl package may not find the model download path.