Replies: 1 comment
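I ran into the same situation. In Lib\site-packages\paddleocr\paddleocr.py you can find the line that reports "error in layout recovery". By adding raise ex there to re-raise the exception, I traced the problem to the parser.handle_table line in paddleocr\ppstructure\recovery\recovery_to_doc.py. My guess is that the failure happens during table recovery. Wrapping that line in try ... except works around the error for now.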
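The workaround described above could look roughly like the following. This is only a sketch under stated assumptions: `safe_handle_table` and `FakeParser` are hypothetical names made up for illustration, and the call shape `parser.handle_table(html, doc)` is assumed rather than taken from this thread. The stand-in parser lets the snippet run without PaddleOCR installed.

```python
import logging

logger = logging.getLogger("layout_recovery_sketch")


def safe_handle_table(parser, html, doc):
    """Call parser.handle_table and skip the table if it raises.

    Returns True when the table was converted, False when it was skipped.
    The (html, doc) argument order is an assumption for this sketch.
    """
    try:
        parser.handle_table(html, doc)
        return True
    except Exception:
        # logger.exception records the full traceback, so the underlying
        # cause stays visible in the log instead of being swallowed.
        logger.exception("table recovery failed; skipping this table")
        return False


class FakeParser:
    """Hypothetical stand-in for the real parser, used only for the demo."""

    def handle_table(self, html, doc):
        rows = []  # pretend no rows were parsed from the HTML
        doc.append(rows[0])  # raises IndexError: list index out of range


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    doc = []
    converted = safe_handle_table(FakeParser(), "<table></table>", doc)
    print("converted:", converted)
```

Note the trade-off: a table that fails this way is simply missing from the generated document, so this hides the symptom rather than fixing the cause.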
🔎 Search before asking
🐛 Bug (Problem description)
I am on Baidu's PaddlePaddle AI Studio platform and ran the layout recovery command. The full command is:
%cd /home/aistudio/PaddleOCR
! paddleocr --image_dir=/home/aistudio/PaddleOCR/1.pdf --type=structure --recovery=true --lang='ch' --output=/home/aistudio/PaddleOCR/hjb004/
The log reports the following error:

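The corresponding docx file was not generated. I also looked through the related issues. Is this a version problem? My relevant versions are listed below; could a mismatch between the paddleocr and paddlepaddle versions be causing this?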
paddle2onnx==1.2.4
paddleclas==2.5.2
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp @ https://files.pythonhosted.org/packages/44/62/98dd0ca2f6600ca1dfc9c59ba1b40628df5f7948abc85ba16c3367c49cf4/paddlenlp-2.8.1-py3-none-any.whl#sha256=8cb5324ee5c39d29264ec5049ea5a6beeebb04a625b5a5c519b92475ea7d067f
paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-3.0.0b1-cp310-cp310-linux_x86_64.whl#sha256=24dc65caf3a70796d287544dcba162e4c940d2536cd76a26cd94276cbd215727
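For anyone comparing environments, a list like the one above can be collected from inside the notebook with the standard library. This is a minimal sketch; `installed_versions` is a hypothetical helper name, and the snippet only reports what is installed. It cannot tell whether a given combination of versions is compatible.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    packages = ["paddleocr", "paddlepaddle-gpu", "paddlepaddle", "python-docx"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver or 'not installed'}")
```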
🏃♂️ Environment (Runtime environment)
Baidu PaddlePaddle AI Studio platform
aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81
albucore==0.0.13
albumentations==1.4.10
altair==4.2.2
annotated-types==0.7.0
anyio==4.4.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
Babel==2.15.0
bce-python-sdk==0.9.17
beautifulsoup4==4.12.3
blinker==1.8.2
cachetools==5.3.3
certifi==2024.7.4
charset-normalizer==3.3.2
ci-info==0.3.0
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.8.2
comm==0.2.2
configobj==5.0.8
configparser==7.1.0
contourpy==1.2.1
cssselect==1.2.0
cssutils==2.11.1
cycler==0.12.1
Cython==3.0.11
datasets==2.20.0
debugpy==1.8.2
decorator==5.1.1
dill==0.3.4
dnspython==2.6.1
easydict==1.13
email_validator==2.2.0
entrypoints==0.4
et-xmlfile==1.1.0
etelemetry==0.3.1
exceptiongroup==1.2.1
executing==2.0.1
faiss-cpu==1.8.0.post1
fastapi==0.111.0
fastapi-cli==0.0.4
ffmpy==0.3.2
filelock==3.15.4
fire==0.6.0
fitz==0.0.1.dev2
Flask==3.0.3
flask-babel==4.0.0
flatbuffers==24.3.25
fonttools==4.53.0
frozenlist==1.4.1
fsspec==2024.5.0
future==1.0.0
gast==0.3.3
gitdb==4.0.11
GitPython==3.1.43
gradio==3.40.0
gradio_client==1.0.2
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httplib2==0.22.0
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.23.4
humanfriendly==10.0
idna==3.7
imageio==2.35.1
imgaug==0.4.0
importlib_metadata==8.0.0
importlib_resources==6.4.0
ipykernel==6.29.5
ipython==8.26.0
isodate==0.6.1
itsdangerous==2.2.0
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.4
joblib==1.4.2
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
jupyter_client==8.6.2
jupyter_core==5.7.2
kiwisolver==1.4.5
lap==0.4.0
lazy_loader==0.4
linkify-it-py==2.0.3
lmdb==1.5.1
looseversion==1.3.0
lxml==5.3.0
markdown-it-py==2.2.0
MarkupSafe==2.1.5
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdit-py-plugins==0.3.3
mdurl==0.1.2
more-itertools==10.4.0
motmetrics==1.4.0
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.12.2
nest-asyncio==1.6.0
networkx==3.3
nibabel==5.2.1
nipype==1.8.6
numpy==1.26.4
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cudnn-cu11==8.7.0.84
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.19.3
nvidia-nvtx-cu11==11.8.86
onnx==1.16.1
onnxruntime==1.18.1
opencv-contrib-python==4.10.0.84
opencv-python==4.6.0.66
opencv-python-headless==4.10.0.84
openpyxl==3.1.5
opt-einsum==3.3.0
orjson==3.10.6
packaging==24.1
paddle2onnx==1.2.4
paddleclas==2.5.2
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp @ https://files.pythonhosted.org/packages/44/62/98dd0ca2f6600ca1dfc9c59ba1b40628df5f7948abc85ba16c3367c49cf4/paddlenlp-2.8.1-py3-none-any.whl#sha256=8cb5324ee5c39d29264ec5049ea5a6beeebb04a625b5a5c519b92475ea7d067f
paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-3.0.0b1-cp310-cp310-linux_x86_64.whl#sha256=24dc65caf3a70796d287544dcba162e4c940d2536cd76a26cd94276cbd215727
pandas==2.2.2
parso==0.8.4
pathlib==1.0.1
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.4.0
platformdirs==4.2.2
premailer==3.10.0
prettytable==3.10.0
prompt_toolkit==3.0.47
protobuf==3.20.3
prov==2.0.1
psutil==6.0.0
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==16.1.0
pyarrow-hotfix==0.6
pybind11==2.13.1
pyclipper==1.3.0.post5
pycocotools==2.0.8
pycryptodome==3.20.0
pydantic==2.8.2
pydantic_core==2.20.1
pydeck==0.9.1
pydot==3.0.1
pydub==0.25.1
Pygments==2.18.0
Pympler==1.1
PyMuPDF==1.24.9
PyMuPDFb==1.24.9
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-multipart==0.0.9
pytz==2024.1
pyxnat==1.6.2
PyYAML==6.0.1
pyzmq==26.0.3
rapidfuzz==3.9.6
rarfile==4.2
rdflib==6.3.2
referencing==0.35.1
regex==2024.5.15
requests==2.32.3
rich==13.7.1
rpds-py==0.18.1
ruff==0.5.0
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn==1.5.1
scipy==1.14.0
semantic-version==2.10.0
semver==3.0.2
sentencepiece==0.2.0
seqeval==1.2.2
shapely==2.0.6
shellingham==1.5.4
simplejson==3.19.3
six==1.16.0
sklearn==0.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.6
stack-data==0.6.3
starlette==0.37.2
streamlit==1.13.0
streamlit-image-comparison==0.0.4
sympy==1.12.1
termcolor==2.4.0
terminaltables==3.1.10
threadpoolctl==3.5.0
tifffile==2024.8.10
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
tornado==6.4.1
tqdm==4.66.4
traitlets==5.14.3
traits==6.3.2
typeguard==4.3.0
typer==0.12.3
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
ujson==5.10.0
urllib3==2.2.2
uvicorn==0.30.1
uvloop==0.19.0
validators==0.30.0
visualdl==2.5.3
watchdog==4.0.1
watchfiles==0.22.0
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.3
xmltodict==0.13.0
xxhash==3.4.1
yarl==1.9.4
zipp==3.19.2
🌰 Minimal Reproducible Example
Command executed:
%cd /home/aistudio/PaddleOCR
! paddleocr --image_dir=/home/aistudio/PaddleOCR/1.pdf --type=structure --recovery=true --lang='ch' --output=/home/aistudio/PaddleOCR/hjb004/
Error log:
[2024/08/20 22:12:26] ppocr INFO: /home/aistudio/PaddleOCR/1.pdf
[2024/08/20 22:12:27] ppocr INFO: processing 1/4 page:
[2024/08/20 22:12:28] ppocr DEBUG: dt_boxes num : 83, elapsed : 0.09140348434448242
[2024/08/20 22:12:29] ppocr DEBUG: rec_res num : 83, elapsed : 0.3126399517059326
[2024/08/20 22:12:29] ppocr INFO: processing 2/4 page:
[2024/08/20 22:12:29] ppocr DEBUG: dt_boxes num : 72, elapsed : 0.04991507530212402
[2024/08/20 22:12:29] ppocr DEBUG: rec_res num : 72, elapsed : 0.20769500732421875
[2024/08/20 22:12:30] ppocr DEBUG: dt_boxes num : 20, elapse : 0.03148174285888672
[2024/08/20 22:12:30] ppocr DEBUG: rec_res num : 20, elapse : 0.05368828773498535
[2024/08/20 22:12:30] ppocr DEBUG: dt_boxes num : 14, elapse : 0.025783061981201172
[2024/08/20 22:12:30] ppocr DEBUG: rec_res num : 14, elapse : 0.0349116325378418
[2024/08/20 22:12:30] ppocr INFO: processing 3/4 page:
[2024/08/20 22:12:30] ppocr DEBUG: dt_boxes num : 79, elapsed : 0.05122065544128418
[2024/08/20 22:12:31] ppocr DEBUG: rec_res num : 79, elapsed : 0.14891624450683594
[2024/08/20 22:12:31] ppocr DEBUG: dt_boxes num : 18, elapse : 0.03324317932128906
[2024/08/20 22:12:31] ppocr DEBUG: rec_res num : 18, elapse : 0.038069725036621094
[2024/08/20 22:12:31] ppocr DEBUG: dt_boxes num : 18, elapse : 0.021338939666748047
[2024/08/20 22:12:32] ppocr DEBUG: rec_res num : 18, elapse : 0.03579115867614746
[2024/08/20 22:12:32] ppocr DEBUG: dt_boxes num : 18, elapse : 0.021355390548706055
[2024/08/20 22:12:32] ppocr DEBUG: rec_res num : 18, elapse : 0.03148841857910156
[2024/08/20 22:12:32] ppocr INFO: processing 4/4 page:
[2024/08/20 22:12:32] ppocr DEBUG: dt_boxes num : 58, elapsed : 0.045165300369262695
[2024/08/20 22:12:32] ppocr DEBUG: rec_res num : 58, elapsed : 0.10314536094665527
[2024/08/20 22:12:33] ppocr DEBUG: dt_boxes num : 18, elapse : 0.021555423736572266
[2024/08/20 22:12:33] ppocr DEBUG: rec_res num : 18, elapse : 0.03125762939453125
[2024/08/20 22:12:33] ppocr DEBUG: dt_boxes num : 18, elapse : 0.021595001220703125
[2024/08/20 22:12:33] ppocr DEBUG: rec_res num : 18, elapse : 0.03625941276550293
[2024/08/20 22:12:33] ppocr ERROR: error in layout recovery image:1, err msg: list index out of range