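Running the tutorial code (shown below) on the latest version raises:

    Traceback (most recent call last):
      File "/root/autodl-tmp/testgpu.py", line 76, in <module>
        f.write(markdown_texts)
    TypeError: write() argument must be str, not MarkdownResult

I found that markdown_texts is now a MarkdownResult, whereas in earlier versions it was a str. How should I change the code? If I simply switch to str(markdown_texts), the output file contains JSON data rather than Markdown.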
from pathlib import Path
from paddleocr import PPStructureV3

input_file = "./your_pdf_file.pdf"
output_path = Path("./output")

pipeline = PPStructureV3()
output = pipeline.predict(input=input_file)

markdown_list = []
markdown_images = []

for res in output:
    md_info = res.markdown
    markdown_list.append(md_info)
    markdown_images.append(md_info.get("markdown_images", {}))

markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

mkd_file_path = output_path / f"{Path(input_file).stem}.md"
mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

with open(mkd_file_path, "w", encoding="utf-8") as f:
    f.write(markdown_texts)

for item in markdown_images:
    if item:
        for path, image in item.items():
            file_path = output_path / path
            file_path.parent.mkdir(parents=True, exist_ok=True)
            image.save(file_path)
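One possible workaround, offered as a sketch rather than a confirmed fix: since str(markdown_texts) prints JSON, the MarkdownResult is probably a dict-like object, and the Markdown text may be stored under a key inside it. The helper below is hypothetical (extract_markdown_text is not part of PaddleOCR), and the key name "markdown_texts" is an assumption that has not been checked against the library source; printing list(markdown_texts.keys()) would show what the object actually holds.

```python
from collections.abc import Mapping


def extract_markdown_text(result, key="markdown_texts"):
    """Return Markdown text from either a plain str or a dict-like result.

    ASSUMPTION: a dict-like result stores its text under `key`. The default
    key name is a guess and is not confirmed by the PaddleOCR documentation.
    """
    # Older versions: the concatenated result is already a plain string.
    if isinstance(result, str):
        return result
    # Newer versions (assumed): the result behaves like a mapping.
    if isinstance(result, Mapping) and key in result:
        text = result[key]
        if isinstance(text, str):
            return text
    raise TypeError(
        f"Cannot find Markdown text in {type(result).__name__}; "
        "inspect its keys or attributes to locate the text"
    )
```

With this helper, the failing line would become f.write(extract_markdown_text(markdown_texts)). If the helper raises TypeError, the key assumption is wrong for the installed version and the object needs to be inspected directly.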