|
| 1 | +<div align="center"> |
| 2 | + <div align="center"> |
| 3 | + <h1><b>📊 Table Structure Recognition</b></h1> |
| 4 | + </div> |
| 5 | + <a href=""><img src="https://img.shields.io/badge/Python->=3.6,<3.12-aff.svg"></a> |
| 6 | + <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Mac%2C%20Win-pink.svg"></a> |
| 7 | +<a href="https://pypi.org/project/lineless-table-rec/"><img alt="PyPI" src="https://img.shields.io/pypi/v/lineless-table-rec"></a> |
| 8 | +<a href="https://pepy.tech/project/lineless-table-rec"><img src="https://static.pepy.tech/personalized-badge/lineless-table-rec?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Downloads%20Lineless"></a> |
| 9 | +<a href="https://pepy.tech/project/wired-table-rec"><img src="https://static.pepy.tech/personalized-badge/wired-table-rec?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Downloads%20Wired"></a> |
| 10 | + <a href="https://semver.org/"><img alt="SemVer2.0" src="https://img.shields.io/badge/SemVer-2.0-brightgreen"></a> |
| 11 | + <a href="https://github.com/psf/black"><img src="https://img.shields.io/badge/code%20style-black-000000.svg"></a> |
| 12 | + <a href="https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE"><img alt="GitHub" src="https://img.shields.io/badge/license-Apache%202.0-blue"></a> |
| 13 | +</div> |
| 14 | + |
| 15 | +### Recent Updates |
| 16 | +- **2024.10.22** |
| 17 | + - Added the complex background multi-table detection and extraction solution [RapidTableDet](https://github.com/RapidAI/RapidTableDetection). |
| 18 | + |
| 19 | +- **2024.10.29** |
| 20 | + - Retrained the table classifier with YOLO11, fixed a logical-coordinate restoration error in wired_table_rec v2, and updated the evaluation results. |
| 21 | + |
| 22 | +- **2024.11.12** |
| 23 | + - Exposed the core recognition and post-processing thresholds as parameters so they can be tuned for specific scenarios. See [Core Parameters](#core-parameters). |
| 24 | + |
| 25 | +### Introduction |
| 26 | +💖 This repository is an inference library for table structure recognition in documents. It includes wired and lineless table recognition models from Alibaba Duguang (读光), a wired table model from llaipython (WeChat), and a built-in table classification model from NetEase QAnything. |
| 27 | + |
| 28 | +[Quick Start](#installation) [Model Evaluation](#evaluation-results) [Usage Recommendations](#usage-recommendations) [Table Rotation & Perspective Correction](#table-rotation-and-perspective-correction) [Fine-tuning Input Parameters Reference](#core-parameters) [Frequently Asked Questions](#faq) [Update Plan](#update-plan) |
| 29 | +#### Features |
| 30 | + |
| 31 | +⚡ **Fast:** Uses ONNXRuntime as the inference engine, achieving 1-7 seconds per image on CPU. |
| 32 | + |
| 33 | +🎯 **Accurate:** A table type classification model distinguishes wired from lineless tables, so each image is routed to a specialized model, which improves accuracy. |
| 34 | + |
| 35 | +🛡️ **Stable:** Does not depend on any third-party training frameworks; relies only on essential base libraries, avoiding package conflicts. |
| 36 | + |
| 37 | +### Online Demonstrations |
| 38 | +[ModelScope](https://www.modelscope.cn/studios/jockerK/TableRec) [Hugging Face](https://huggingface.co/spaces/Joker1212/TableDetAndRec) |
| 39 | + |
| 40 | +### Effect Showcase |
| 41 | + |
| 42 | +<div align="center"> |
| 43 | + <img src="https://github.com/RapidAI/TableStructureRec/releases/download/v0.0.0/demo_img_output.gif" alt="Demo" width="100%" height="100%"> |
| 44 | +</div> |
| 45 | + |
| 46 | +### Evaluation Results |
| 47 | + |
| 48 | +[TableRecognitionMetric Evaluation Tool](https://github.com/SWHL/TableRecognitionMetric) |
| 49 | +[huggingface Dataset](https://huggingface.co/datasets/SWHL/table_rec_test_dataset) |
| 50 | +[modelscope Dataset](https://www.modelscope.cn/datasets/jockerK/TEDS_TEST/files) |
| 51 | +[Rapid OCR](https://github.com/RapidAI/RapidOCR) |
| 52 | + |
| 53 | +Test Environment: Ubuntu 20.04, Python 3.10.10, opencv-python 4.10.0.84 |
| 54 | + |
| 55 | +Note: |
| 56 | +StructEqTable outputs LaTeX. The evaluation only includes samples that were successfully converted to HTML, with style tags stripped. |
| 57 | + |
| 58 | +Surya-Tabled is evaluated with its built-in OCR module. It is a row-column recognition model and cannot identify merged cells, which lowers its scores. |
| 59 | + |
| 60 | +| Method | TEDS | TEDS-only-structure | |
| 61 | +|:------------------------------------------------------------------------------------------------|:-----------:|:-------------------:| |
| 62 | +| [surya-tabled(--skip-detect)](https://github.com/VikParuchuri/tabled) | 0.33437 | 0.65865 | |
| 63 | +| [surya-tabled](https://github.com/VikParuchuri/tabled) | 0.33940 | 0.67103 | |
| 64 | +| [deepdoctection(rag-flow)](https://github.com/deepdoctection/deepdoctection?tab=readme-ov-file) | 0.59975 | 0.69918 | |
| 65 | +| [ppstructure_table_master](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.61606 | 0.73892 | |
| 66 | +| [ppstructure_table_engine](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.67924 | 0.78653 | |
| 67 | +| [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) | 0.67310 | 0.81210 | |
| 68 | +| [RapidTable(SLANet)](https://github.com/RapidAI/RapidTable) | 0.71654 | 0.81067 | |
| 69 | +| table_cls + wired_table_rec v1 + lineless_table_rec | 0.75288 | 0.82574 | |
| 70 | +| table_cls + wired_table_rec v2 + lineless_table_rec | 0.77676 | 0.84580 | |
| 71 | +| [RapidTable(SLANet-plus)](https://github.com/RapidAI/RapidTable) | **0.84481** | **0.91369** | |
| 72 | + |
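For readers unfamiliar with the metric, the following is the commonly used definition of TEDS (Tree-Edit-Distance-based Similarity). It is given here on the assumption that the linked evaluation tool follows this standard form; the symbols $T_a$ and $T_b$ (the predicted and ground-truth HTML trees) are notation introduced for this note.

$$
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|, |T_b|\right)}
$$

Here $|T|$ is the number of nodes in tree $T$, so a score of 1 means the two trees are identical. TEDS-only-structure applies the same formula after cell text is discarded, which is why it is never lower than TEDS for the same method in the table above.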
| 73 | +### Usage Recommendations |
| 74 | +wired_table_rec_v2 (highest precision for wired tables): General scenes for wired tables (papers, magazines, journals, receipts, invoices, bills) |
| 75 | + |
| 76 | +paddlex-SLANet-plus (highest overall precision): Document scene tables (tables in papers, magazines, and journals) [Fine-tuning Input Parameters Reference](#core-parameters) |
| 77 | + |
| 78 | +### Installation |
| 79 | + |
| 80 | +```bash |
| 81 | +pip install wired_table_rec lineless_table_rec table_cls |
| 82 | +``` |
| 83 | + |
| 84 | +### Quick start |
| 85 | + |
| 86 | +``` python {linenos=table} |
| 87 | +import os |
| 88 | + |
| 89 | +from lineless_table_rec import LinelessTableRecognition |
| 90 | +from lineless_table_rec.utils_table_recover import format_html, plot_rec_box_with_logic_info, plot_rec_box |
| 91 | +from table_cls import TableCls |
| 92 | +from wired_table_rec import WiredTableRecognition |
| 93 | + |
| 94 | +lineless_engine = LinelessTableRecognition() |
| 95 | +wired_engine = WiredTableRecognition() |
| 96 | +# The default is a small YOLO model (0.1s); switch to the higher-precision YOLOX (0.25s) or the faster QAnything (0.07s) model if needed |
| 97 | +table_cls = TableCls()  # or TableCls(model_type="yolox") / TableCls(model_type="q") |
| 98 | +img_path = 'images/img14.jpg' |
| 99 | + |
| 100 | +cls, elasp = table_cls(img_path) |
| 101 | +if cls == 'wired': |
| 102 | + table_engine = wired_engine |
| 103 | +else: |
| 104 | + table_engine = lineless_engine |
| 105 | + |
| 106 | +html, elasp, polygons, logic_points, ocr_res = table_engine(img_path) |
| 107 | +print(f"elasp: {elasp}") |
| 108 | + |
| 109 | +# To use other OCR models (requires `from rapidocr_onnxruntime import RapidOCR`): |
| 110 | +# ocr_engine = RapidOCR(det_model_dir="xxx/det_server_infer.onnx", rec_model_dir="xxx/rec_server_infer.onnx") |
| 111 | +# ocr_res, _ = ocr_engine(img_path) |
| 112 | +# html, elasp, polygons, logic_points, ocr_res = table_engine(img_path, ocr_result=ocr_res) |
| 113 | + |
| 114 | +# output_dir = f'outputs' |
| 115 | +# complete_html = format_html(html) |
| 116 | +# os.makedirs(os.path.dirname(f"{output_dir}/table.html"), exist_ok=True) |
| 117 | +# with open(f"{output_dir}/table.html", "w", encoding="utf-8") as file: |
| 118 | +# file.write(complete_html) |
| 119 | +# Visualize table recognition boxes + logical row and column information |
| 120 | +# plot_rec_box_with_logic_info( |
| 121 | +# img_path, f"{output_dir}/table_rec_box.jpg", logic_points, polygons |
| 122 | +# ) |
| 123 | +# Visualize OCR recognition boxes |
| 124 | +# plot_rec_box(img_path, f"{output_dir}/ocr_box.jpg", ocr_res) |
| 125 | +``` |
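The routing step in the quick start (classify first, then pick an engine) can be factored into a small helper. The sketch below is illustrative only: `pick_engine` and the stub engines are hypothetical names, not part of `table_cls`, `wired_table_rec`, or `lineless_table_rec`, and the stubs stand in for the real engine objects so the snippet runs without any models installed.

```python
from typing import Callable, Dict


def pick_engine(label: str, engines: Dict[str, Callable], default: str = "lineless") -> Callable:
    """Return the engine registered for `label`, falling back to `default`.

    This mirrors the quick start's `if cls == 'wired': ... else: ...` branch.
    """
    return engines.get(label, engines[default])


# Hypothetical stand-ins for WiredTableRecognition() and LinelessTableRecognition().
engines = {
    "wired": lambda path: f"wired:{path}",
    "lineless": lambda path: f"lineless:{path}",
}

# Any label other than "wired" falls through to the lineless engine.
print(pick_engine("wired", engines)("images/img14.jpg"))
print(pick_engine("anything-else", engines)("images/img14.jpg"))
```

With the real libraries, the dictionary values would be the engine instances created in the quick start, and the label would come from `table_cls(img_path)`.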
| 126 | + |
| 127 | +#### Table Rotation and Perspective Correction |
| 128 | +##### 1. Simple Background, Small Angle Scene |
| 129 | +```python |
| 130 | +import cv2 |
| 131 | + |
| 132 | +from wired_table_rec.utils import ImageOrientationCorrector |
| 133 | + |
| 134 | +img_path = 'tests/test_files/wired/squeeze_error.jpeg' |
| 135 | +img_orientation_corrector = ImageOrientationCorrector() |
| 136 | +img = cv2.imread(img_path) |
| 137 | +img = img_orientation_corrector(img)  # Returns the orientation-corrected image |
| 138 | +cv2.imwrite('img_rotated.jpg', img) |
| 139 | +``` |
| 140 | +##### 2. Complex Background, Multiple Tables Scene |
| 141 | +For GPU or higher precision scenarios, please refer to the [RapidTableDet](https://github.com/RapidAI/RapidTableDetection) project. |
| 142 | +```bash |
| 143 | +pip install rapid-table-det |
| 144 | +``` |
| 145 | +```python |
| 146 | +import os |
| 147 | +import cv2 |
| 148 | +from rapid_table_det.utils import img_loader, visuallize, extract_table_img |
| 149 | +from rapid_table_det.inference import TableDetector |
| 150 | +table_det = TableDetector() |
| 151 | +img_path = f"tests/test_files/chip.jpg" |
| 152 | +result, elapse = table_det(img_path) |
| 153 | +img = img_loader(img_path) |
| 154 | +extract_img = img.copy() |
| 155 | +# There may be multiple tables |
| 156 | +for i, res in enumerate(result): |
| 157 | + box = res["box"] |
| 158 | + lt, rt, rb, lb = res["lt"], res["rt"], res["rb"], res["lb"] |
| 159 | + # Recognition box and top-left corner position |
| 160 | + img = visuallize(img, box, lt, rt, rb, lb) |
| 161 | + # Perspective transformation to extract table image |
| 162 | + wrapped_img = extract_table_img(extract_img.copy(), lt, rt, rb, lb) |
| 163 | +# cv2.imwrite(f"{out_dir}/{file_name}-extract-{i}.jpg", wrapped_img) |
| 164 | +# cv2.imwrite(f"{out_dir}/{file_name}-visualize.jpg", img) |
| 165 | +``` |
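To make the perspective-extraction step more concrete, here is a minimal sketch of one common way to choose the output size for a warped table from its four corner points (`lt`, `rt`, `rb`, `lb`, in the same order as above). This is not taken from `rapid_table_det`; `warped_size` is a hypothetical helper, and the library's `extract_table_img` may compute the size differently.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def _dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def warped_size(lt: Point, rt: Point, rb: Point, lb: Point) -> Tuple[int, int]:
    """Pick (width, height) for the rectified image.

    Uses the longer of each pair of opposite edges so no content is squeezed.
    """
    width = max(_dist(lt, rt), _dist(lb, rb))
    height = max(_dist(lt, lb), _dist(rt, rb))
    return round(width), round(height)


# A slightly skewed quadrilateral maps to a 400 x 200 rectangle.
print(warped_size((10, 20), (410, 30), (400, 230), (0, 220)))
```

The resulting size would typically be paired with a perspective transform (for example OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`) to produce the cropped table image.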
| 166 | + |
| 167 | +### Core Parameters |
| 168 | +```python |
| 169 | +wired_table_rec = WiredTableRecognition() |
| 170 | +html, elasp, polygons, logic_points, ocr_res = wired_table_rec( |
| 171 | +    img_path, |
| 172 | +    version="v2",  # Defaults to the v2 line model; set to "v1" to use the Alibaba Duguang model |
| 173 | +    morph_close=True,  # Whether to apply morphological operations to find more lines, default is True |
| 174 | +    more_h_lines=True,  # Whether to search the line detection results for additional, smaller horizontal lines, default is True |
| 175 | +    h_lines_threshold=100,  # Requires more_h_lines=True. Pixel threshold for connecting detected horizontal lines; a new horizontal line is generated when below this value, default is 100 |
| 176 | +    more_v_lines=True,  # Whether to search the line detection results for additional, smaller vertical lines, default is True |
| 177 | +    v_lines_threshold=15,  # Requires more_v_lines=True. Pixel threshold for connecting detected vertical lines; a new vertical line is generated when below this value, default is 15 |
| 178 | +    extend_line=True,  # Whether to extend detected line segments to find more lines, default is True |
| 179 | +    need_ocr=True,  # Whether to run OCR, default is True |
| 180 | +    rec_again=True,  # Whether to re-recognize table boxes whose text was not recognized, default is True |
| 181 | +) |
| 182 | +lineless_table_rec = LinelessTableRecognition() |
| 183 | +html, elasp, polygons, logic_points, ocr_res = lineless_table_rec( |
| 184 | +    img_path, need_ocr=True,  # Whether to run OCR, default is True |
| 185 | +    rec_again=True,  # Whether to re-recognize table boxes whose text was not recognized, default is True |
| 186 | +) |
| 187 | +``` |
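When tuning these thresholds for several scenarios, it can help to keep the documented defaults in one place and override only what changes. The sketch below assumes nothing beyond the parameter names and defaults listed above; `WIRED_DEFAULTS` and `build_kwargs` are hypothetical helpers, not part of `wired_table_rec`.

```python
from typing import Any, Dict

# Defaults copied from the parameter list above.
WIRED_DEFAULTS: Dict[str, Any] = {
    "version": "v2",
    "morph_close": True,
    "more_h_lines": True,
    "h_lines_threshold": 100,
    "more_v_lines": True,
    "v_lines_threshold": 15,
    "extend_line": True,
    "need_ocr": True,
    "rec_again": True,
}


def build_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the defaults, rejecting misspelled parameter names."""
    unknown = set(overrides) - set(WIRED_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**WIRED_DEFAULTS, **overrides}


# Hypothetical preset: structure only, skipping OCR and re-recognition.
structure_only = build_kwargs(need_ocr=False, rec_again=False)
print(structure_only)

# With the real engine this would be passed as:
# html, elasp, polygons, logic_points, ocr_res = wired_table_rec(img_path, **structure_only)
```

Rejecting unknown keys early avoids silently passing a mistyped threshold name to the engine.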
| 188 | + |
| 189 | +### FAQ |
| 190 | +1. **Q: Text inside a recognized cell box is missing.** |
| 191 | + - **A:** The default small RapidOCR model is used. If you need higher precision, download a higher-precision OCR model from the [model list](https://rapidai.github.io/RapidOCRDocs/model_list/#_1) and pass it in at run time, or try adjusting the RapidOCR parameters using the online demos: [modelscope](https://www.modelscope.cn/studios/liekkas/RapidOCRDemo/summary) [huggingface](https://huggingface.co/spaces/SWHL/RapidOCRDemo) |
| 192 | +2. **Q: Does the model support GPU acceleration?** |
| 193 | + - **A:** Inference with the table models is already fast: around 100 ms for wired tables and around 500 ms for lineless tables. Most of the time is spent in the OCR stage. See [rapidocr_paddle](https://rapidai.github.io/RapidOCRDocs/install_usage/rapidocr_paddle/usage/#_3) to accelerate OCR. |
| 194 | + |
| 195 | +### Update Plan |
| 196 | + |
| 197 | +- [x] Add methods for correcting small-angle image offsets |
| 198 | +- [x] Increase dataset size and add more evaluation comparisons |
| 199 | +- [x] Add complex scene table detection and extraction to solve low recognition rates caused by rotation and perspective |
| 200 | +- [x] Optimize the table classifier |
| 201 | +- [ ] Optimize the lineless table model |
| 202 | + |
| 203 | +### Processing Workflow |
| 204 | + |
| 205 | +```mermaid |
| 206 | +flowchart TD; A[/Table Image/] --> B([Table Classification table_cls]); B --> C([Wired Table Recognition wired_table_rec]) & D([Lineless Table Recognition lineless_table_rec]) --> E([Text Recognition rapidocr_onnxruntime]); E --> F[/HTML Structured Output/] |
| 207 | +``` |
| 208 | + |
| 209 | +### Acknowledgments |
| 210 | + |
| 211 | + |
| 212 | +[PaddleX Table Recognition](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/module_usage/tutorials/ocr_modules/table_structure_recognition.md) |
| 213 | + |
| 214 | +[PaddleOCR Table Recognition](https://github.com/PaddlePaddle/PaddleOCR/blob/4b17511491adcfd0f3e2970895d06814d1ce56cc/ppstructure/table/README_ch.md) |
| 215 | + |
| 216 | +[Damo Academy - Table Structure Recognition - Wired Table](https://www.modelscope.cn/models/damo/cv_dla34_table-structure-recognition_cycle-centernet/summary) |
| 217 | + |
| 218 | +[Damo Academy - Table Structure Recognition - Wireless Table](https://www.modelscope.cn/models/damo/cv_resnet-transformer_table-structure-recognition_lore/summary) |
| 219 | + |
| 220 | +[Qanything-RAG](https://github.com/netease-youdao/QAnything) |
| 221 | + |
| 222 | +Special thanks to llaipython (WeChat, providing a full suite of high-precision table extraction services) for providing the high-precision wired table model. |
| 223 | + |
| 224 | +Special thanks to [MajexH](https://github.com/MajexH) for completing the table recognition test using deepdoctection (rag-flow). |
| 225 | + |
| 226 | +### Contribution Guidelines |
| 227 | + |
| 228 | +Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. |
| 229 | + |
| 230 | +Please ensure appropriate updates to tests. |
| 231 | + |
| 232 | +### [Sponsor](https://rapidai.github.io/Knowledge-QA-LLM/docs/sponsor/) |
| 233 | + |
| 234 | +If you want to sponsor this project, click the Sponsor button at the top of the current page. Please include **your GitHub account name** in the note so that you can be added to the sponsor list. |
| 235 | + |
| 236 | +### Open Source License |
| 237 | + |
| 238 | +This project is licensed under the [Apache 2.0](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE) open source license. |