
PaddleOCR-VL: image blocks in the parsed JSON output have no image path #17143

@eleking-wd

Description


🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

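When I parse a PDF file directly with the official example, the image blocks in the output JSON do not contain the image path, even though the Markdown output does include the image references. This makes it very inconvenient to trace content back to its source.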

🏃‍♂️ Environment

Ubuntu 22.04.2 LTS
CUDA 12.4
PaddleOCR 3.3.1
Python 3.10.12
GPU RTX3090

🌰 Minimal Reproducible Example

Parse any PDF that contains images and save the JSON output.
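To make the symptom easier to confirm, here is a minimal sketch that scans a saved result and lists the image blocks with no image file reference. It uses only the Python standard library. The key names (`parsing_res_list`, `block_label`, `block_content`), the set of image labels, and the helper names are assumptions for illustration and may not match what your PaddleOCR version actually writes, so adjust them to your JSON.

```python
import json
from pathlib import Path

# Assumed key names and labels; adjust to match the JSON your version writes.
BLOCKS_KEY = "parsing_res_list"
LABEL_KEY = "block_label"
CONTENT_KEY = "block_content"
IMAGE_LABELS = {"image", "figure", "chart"}
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def image_blocks_without_path(result: dict) -> list[dict]:
    """Return image-type blocks whose content is not an image file reference."""
    missing = []
    for block in result.get(BLOCKS_KEY, []):
        if block.get(LABEL_KEY) not in IMAGE_LABELS:
            continue  # only image-like blocks are of interest
        content = str(block.get(CONTENT_KEY) or "")
        if not content.lower().endswith(IMAGE_SUFFIXES):
            missing.append(block)
    return missing


def check_file(json_path: str) -> list[dict]:
    """Load one saved result file and report image blocks lacking a path."""
    result = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return image_blocks_without_path(result)
```

Calling `check_file` on a saved result (the path is up to you) returns the offending blocks, which can then be compared against the image links in the Markdown output.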
