PPStructureV3 raises an error on some images #17068

@chop2

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug

File "/data1/workspace/LLM/file_parser/app/parser_core/paddleocr_adapter.py", line 81, in do_parse_paddle
output = pipeline.predict(input=tmp_path)
│ │ └ '/tmp/tmp4ojyw0ic.pdf'
│ └ <function PPStructureV3.predict at 0x7fbb225367a0>
└ <paddleocr._pipelines.pp_structurev3.PPStructureV3 object at 0x7fbce0a8a530>

File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddleocr/_pipelines/pp_structurev3.py", line 250, in predict
return list(
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/_parallel.py", line 129, in predict
yield from self._pipeline.predict(
│ │ └ <function _LayoutParsingPipelineV2.predict at 0x7fbb237f0940>
│ └ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
└ <paddlex.inference.pipelines.layout_parsing.pipeline_v2.LayoutParsingPipelineV2 object at 0x7fba10f33400>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 1245, in predict
parsing_res_list = self.get_layout_parsing_res(
│ └ <function _LayoutParsingPipelineV2.get_layout_parsing_res at 0x7fbb237f0820>
└ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 811, in get_layout_parsing_res
self.standardized_data(
│ └ <function _LayoutParsingPipelineV2.standardized_data at 0x7fbb237f0670>
└ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 496, in standardized_data
layout_det_res["boxes"].append(
└ {'input_path': None, 'page_index': None, 'input_img': array([[[221, 228, 222],
[221, 228, 222],
[221, 229, 22...

AttributeError: 'numpy.ndarray' object has no attribute 'append'

The call fails with the error above.
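
The last frame of the traceback calls `.append()` on `layout_det_res["boxes"]`, which at that point is a `numpy.ndarray`. Arrays have no `append` method, hence the `AttributeError`. The following is a minimal sketch of that failure mode and one way to guard against it. It assumes the value only needs to behave like a list. The helper name `as_mutable_list` and the appended entry are hypothetical, and this is not PaddleX's actual code or fix.

```python
def as_mutable_list(boxes):
    """Return `boxes` as a plain Python list so that .append() is available.

    Hypothetical helper for illustration only; not part of PaddleX.
    """
    if isinstance(boxes, list):
        return boxes
    # numpy.ndarray (and similar array types) expose tolist().
    tolist = getattr(boxes, "tolist", None)
    if callable(tolist):
        return tolist()
    # Fall back to generic iteration for tuples and other iterables.
    return list(boxes)


# Stand-in for a result dict whose "boxes" value is not a list.
# A tuple is used here because, like an ndarray, it has no .append().
layout_det_res = {"boxes": ()}
layout_det_res["boxes"] = as_mutable_list(layout_det_res["boxes"])
layout_det_res["boxes"].append({"label": "image", "score": 1.0})  # illustrative entry
```

Whether converting to a list is the right fix depends on how the rest of the pipeline consumes `boxes`; the sketch only shows why the append fails and what a defensive normalization could look like.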

🏃‍♂️ Environment

paddleocr                 3.3.1
paddlepaddle-gpu          3.2.1
paddlex                   3.3.9
numpy                     2.2.6
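
For anyone comparing environments, the version list above can be reproduced with the standard library. This is a small sketch, assuming Python 3.8 or later. The helper name `collect_versions` is hypothetical, and the package names are the ones listed in this report.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Map each distribution name to its installed version string."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Report missing packages instead of raising.
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    names = ["paddleocr", "paddlepaddle-gpu", "paddlex", "numpy"]
    for name, ver in collect_versions(names).items():
        print(f"{name:<25} {ver}")
```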

🌰 Minimal Reproducible Example

def test_paddleocr_structure():
    from paddleocr import PPStructureV3
    from paddlex.inference.pipelines.layout_parsing.result_v2 import LayoutParsingResultV2

    # Disable the optional document preprocessing steps.
    pipeline = PPStructureV3(
        use_doc_orientation_classify=False,
        use_doc_unwarping=False,
    )

    # For Image
    output = pipeline.predict(
        input="https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png",
    )

    # Print each result and save it as JSON and Markdown
    res: LayoutParsingResultV2
    for res in output:
        print(res)
        # res.print()
        res.save_to_json(save_path="output")
        res.save_to_markdown(save_path="output")
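
Since only some inputs trigger the error, it can help to run the pipeline over a set of files and record which ones fail. The following is a rough sketch under that assumption. `find_failing_inputs` is a hypothetical helper, not part of PaddleOCR, and it takes any callable so it can be tried without loading a model.

```python
def find_failing_inputs(predict_fn, inputs):
    """Run predict_fn on each input and collect the ones that raise."""
    failures = []
    for item in inputs:
        try:
            # Consume the result fully in case predict_fn returns a lazy generator.
            list(predict_fn(item))
        except Exception as exc:
            failures.append((item, f"{type(exc).__name__}: {exc}"))
    return failures
```

With the pipeline from the example above, `predict_fn` could be `lambda p: pipeline.predict(input=p)`, and `inputs` a list of image or PDF paths.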

Labels: task/inference (Related to model inference or prediction.)