Commit f6ef577

polish docs (#16365)
* polish docs
* update
1 parent 35c1ed5 commit f6ef577

15 files changed, +1053 -38 lines changed

README.md

Lines changed: 14 additions & 3 deletions
@@ -79,7 +79,7 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
 - The high-stability service-oriented deployment solution also supports invocation via manually constructed HTTP requests, enabling client-side code development in any programming language.
 
 - **Benchmark Support:**
-- **All production lines now support fine-grained benchmarking, enabling measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis.**
+- **All production lines now support fine-grained benchmarking, enabling measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis. [Here's](docs/version3.x/pipeline_usage/instructions/benchmark.en.md) how to set up and use the benchmark feature.**
 - **Documentation has been updated to include key metrics for commonly used configurations on mainstream hardware, such as inference latency and memory usage, providing deployment references for users.**
 
 - **Bug Fixes:**
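
The context line in this hunk mentions invoking the deployed service through manually constructed HTTP requests. A minimal sketch of what such a client could look like follows. The URL, the `/ocr` route, and the `file` and `fileType` fields are assumptions made for illustration and are not taken from this commit; check the serving documentation for the actual request schema. The helper name `build_ocr_request` is hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: the service address and route; adjust to the real deployment.
SERVICE_URL = "http://localhost:8080/ocr"


def build_ocr_request(image_bytes: bytes, url: str = SERVICE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying a Base64-encoded image."""
    payload = {
        # Assumption: the image travels as a Base64 string in a "file" field.
        "file": base64.b64encode(image_bytes).decode("ascii"),
        # Assumption: 1 marks the input as an image.
        "fileType": 1,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_ocr_request(b"placeholder image bytes")
    print(request.get_method(), request.full_url)
    # Sending requires a running service:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Only the standard library is used, so the same request can be reproduced from any language that can send JSON over HTTP.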
@@ -213,10 +213,21 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
 Install PaddlePaddle refer to [Installation Guide](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html), after then, install the PaddleOCR toolkit.
 
 ```bash
-# Install paddleocr
-pip install paddleocr
+# If you only want to use the basic text recognition feature (returns text position coordinates and content), including the PP-OCR series
+python -m pip install paddleocr
+# If you want to use all features such as document parsing, document understanding, document translation, key information extraction, etc.
+# python -m pip install "paddleocr[all]"
 ```
 
+Starting from version 3.2.0, in addition to the `all` dependency group demonstrated above, PaddleOCR also supports installing partial optional features by specifying other dependency groups. All dependency groups provided by PaddleOCR are as follows:
+
+| Dependency Group Name | Corresponding Functionality |
+| - | - |
+| `doc-parser` | Document parsing: can be used to extract layout elements such as tables, formulas, stamps, images, etc. from documents; includes models like PP-StructureV3 |
+| `ie` | Information extraction: can be used to extract key information from documents, such as names, dates, addresses, amounts, etc.; includes models like PP-ChatOCRv4 |
+| `trans` | Document translation: can be used to translate documents from one language to another; includes models like PP-DocTranslation |
+| `all` | Complete functionality |
+
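
The added table lists four dependency groups, and the install command above shows the `paddleocr[all]` form. As a rough sketch, assuming the other groups are requested with the same bracket syntax, the command for any group could be assembled as below. The function name `pip_install_args` is hypothetical and is not part of PaddleOCR.

```python
import sys

# Dependency group names listed in the table added by this commit.
DEPENDENCY_GROUPS = ("doc-parser", "ie", "trans", "all")


def pip_install_args(group=None):
    """Return the argument list for installing PaddleOCR, optionally with one dependency group."""
    if group is None:
        requirement = "paddleocr"  # basic text recognition only
    elif group in DEPENDENCY_GROUPS:
        requirement = f"paddleocr[{group}]"
    else:
        raise ValueError(f"unknown dependency group: {group!r}")
    # sys.executable targets the interpreter that runs this script.
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for group in (None, *DEPENDENCY_GROUPS):
        print(" ".join(["python", *pip_install_args(group)[1:]]))
```

The returned list can be handed to `subprocess.run(..., check=True)` to perform the installation. Rejecting unknown names early keeps a typo such as `docparser` from reaching pip.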
 ### 3. Run inference by CLI
 ```bash
 # Run PP-OCRv5 inference

docs/version3.x/installation.en.md

Lines changed: 6 additions & 6 deletions
@@ -120,12 +120,12 @@ python -m pip install "paddleocr@git+https://github.com/PaddlePaddle/PaddleOCR.g
 
 In addition to the `all` dependency group demonstrated above, PaddleOCR also supports installing specific optional features by specifying other dependency groups. The available dependency groups provided by PaddleOCR are as follows:
 
-| Dependency Group | Functionality |
-| ---------------- | ------------------------ |
-| `doc-parser` | Document parsing, which can be used to extract layout elements in a document such as tables, formulas, stamps, and images. |
-| `ie` | Information extraction, which can be used to extract key information from documents, such as names, dates, addresses, amounts, and more. |
-| `trans` | Document translation, which can be used to translate a document from one language to another. |
-| `all` | Full functionality. |
+| Dependency Group Name | Corresponding Functionality |
+| - | - |
+| `doc-parser` | Document parsing: can be used to extract layout elements such as tables, formulas, stamps, images, etc. from documents; includes models like PP-StructureV3 |
+| `ie` | Information extraction: can be used to extract key information from documents, such as names, dates, addresses, amounts, etc.; includes models like PP-ChatOCRv4 |
+| `trans` | Document translation: can be used to translate documents from one language to another; includes models like PP-DocTranslation |
+| `all` | Complete functionality |
 
 The general OCR pipeline (e.g., PP-OCRv3/v4/v5) and the document image preprocessing pipeline can be used without installing any additional dependency groups. Apart from these two pipelines, each remaining pipeline belongs to one and only one dependency group. You can refer to the usage documentation of each pipeline to determine which group it belongs to. For individual functional modules, installing any dependency group that includes the module will enable access to its core functionality.
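
The closing paragraph of this hunk says the general OCR pipeline needs no extra group, while every other pipeline belongs to exactly one. A small lookup makes that rule concrete. The pairs come from the updated table; the dictionary and function names are hypothetical, and the mapping covers only the pipelines named in this diff.

```python
# Pairs read off the updated dependency-group table; names are illustrative.
GROUP_BY_PIPELINE = {
    "PP-StructureV3": "doc-parser",
    "PP-ChatOCRv4": "ie",
    "PP-DocTranslation": "trans",
}

# General OCR pipelines that work without any extra dependency group.
# (The document image preprocessing pipeline also needs none.)
BASE_PIPELINES = frozenset({"PP-OCRv3", "PP-OCRv4", "PP-OCRv5"})


def requirement_for(pipeline: str) -> str:
    """Return the pip requirement string needed to run the given pipeline."""
    if pipeline in BASE_PIPELINES:
        return "paddleocr"
    try:
        return f"paddleocr[{GROUP_BY_PIPELINE[pipeline]}]"
    except KeyError:
        raise KeyError(f"no dependency group recorded for {pipeline!r}") from None


if __name__ == "__main__":
    for name in ("PP-OCRv5", "PP-StructureV3", "PP-ChatOCRv4", "PP-DocTranslation"):
        print(f'{name}: python -m pip install "{requirement_for(name)}"')
```

For pipelines not listed here, the usage documentation of each pipeline remains the place to find its group.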

docs/version3.x/installation.md

Lines changed: 3 additions & 3 deletions
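@@ -119,9 +119,9 @@ python -m pip install "paddleocr@git+https://github.com/PaddlePaddle/PaddleOCR.g
 
 | Dependency Group Name | Corresponding Functionality |
 | - | - |
-| `doc-parser` | Document parsing: can be used to extract layout elements from documents, such as tables, formulas, stamps, and images |
-| `ie` | Information extraction: can be used to extract key information from documents, such as names, dates, addresses, and amounts |
-| `trans` | Document translation: can be used to translate a document from one language to another |
+| `doc-parser` | Document parsing: can be used to extract layout elements from documents, such as tables, formulas, stamps, and images; includes model solutions such as PP-StructureV3 |
+| `ie` | Information extraction: can be used to extract key information from documents, such as names, dates, addresses, and amounts; includes model solutions such as PP-ChatOCRv4 |
+| `trans` | Document translation: can be used to translate a document from one language to another; includes model solutions such as PP-DocTranslation |
 | `all` | Complete functionality |
 
 The general OCR pipeline (e.g., PP-OCRv3/v4/v5) and the document image preprocessing pipeline can be used without installing any additional dependency groups. Apart from these two pipelines, each pipeline belongs to one and only one dependency group. The usage documentation of each pipeline states which dependency group it belongs to. For single-function modules, installing the dependency group of any pipeline that includes the module enables the module's basic functionality.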
File renamed without changes.
