How do I convert the bbox returned by minerU.do_parse into a fitz bbox? #4624
Unanswered
1571859588 asked this question in Q&A
Replies: 1 comment
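Hi @1571859588! I'm Dosu and I'm helping the MinerU team. MinerU's bbox is in image pixel coordinates (origin at the top-left corner), while fitz works in PDF points. fitz (PyMuPDF) page coordinates also have their origin at the top-left with y increasing downward, so no Y-axis flip is needed; a flip would only apply to raw PDF coordinates, whose origin is at the bottom-left. The conversion is therefore a DPI scaling only.

Conversion formula:

    def mineru_bbox_to_fitz(mineru_bbox, dpi=200):
        """
        Convert a MinerU bbox to a fitz bbox.

        Args:
            mineru_bbox: (x0, y0, x1, y1) pixel coordinates output by MinerU
            dpi: DPI MinerU rendered the page at (default 200)
        """
        scale = 72 / dpi  # pixels -> PDF points (72 points per inch)
        # DPI scaling; both systems put the origin at the top-left,
        # so the y values are scaled but not flipped.
        x0 = mineru_bbox[0] * scale
        y0 = mineru_bbox[1] * scale
        x1 = mineru_bbox[2] * scale
        y1 = mineru_bbox[3] * scale
        return (x0, y0, x1, y1)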
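As a quick arithmetic check of the pixel-to-point scaling step, here is a minimal sketch with made-up numbers. It assumes the bbox really is in pixels of a page rendered at 200 dpi; `px_to_pt` and the sample bbox are hypothetical and not part of MinerU or fitz.

```python
def px_to_pt(bbox_px, dpi=200):
    """Scale an (x0, y0, x1, y1) pixel bbox to PDF points (1 pt = 1/72 inch)."""
    return tuple(v * 72 / dpi for v in bbox_px)


# Hypothetical block detected on a page image rendered at 200 dpi.
bbox_px = (200, 400, 1000, 600)
bbox_pt = px_to_pt(bbox_px)
print(bbox_pt)  # (72.0, 144.0, 360.0, 216.0)
```

At 200 dpi each pixel is 72/200 = 0.36 pt, so a block 800 px wide comes out 288 pt wide. If the scaled values fall outside the page size reported by fitz, the DPI assumption is probably wrong for your output.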
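I found that MinerU's workflow is: PDF → pypdfium2 rendering → full-page image → YOLO detection → image coordinates.
As a result, the final bbox is completely different from the bbox obtained by running the fitz library directly on the PDF.
Is there a way to work back from the final bbox to the bbox that fitz would report for the original PDF, so that MinerU's output bboxes can be mapped back to fitz coordinates?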
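One way to sketch the mapping asked about here, without hard-coding a DPI, is to derive the scale from the rendered image size and the page size in points. The function names below are hypothetical. The sketch assumes the bbox is in pixels of the full-page image, that the page is not rotated, and that the page rectangle starts at (0, 0). PyMuPDF documents its page coordinates with a top-left origin, so no vertical flip is applied. In practice the page size would come from `page.rect.width` and `page.rect.height`, and the result could be passed to `fitz.Rect(...)`.

```python
def image_bbox_to_page_bbox(bbox_px, image_size, page_size):
    """Map a pixel bbox on the rendered image to page coordinates in points.

    bbox_px:    (x0, y0, x1, y1) in image pixels, origin top-left
    image_size: (width_px, height_px) of the rendered page image
    page_size:  (width_pt, height_pt) of the PDF page
    """
    img_w, img_h = image_size
    page_w, page_h = page_size
    sx = page_w / img_w  # points per pixel, horizontal
    sy = page_h / img_h  # points per pixel, vertical
    x0, y0, x1, y1 = bbox_px
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)


def page_bbox_to_image_bbox(bbox_pt, image_size, page_size):
    """Inverse mapping: page points back to image pixels."""
    img_w, img_h = image_size
    page_w, page_h = page_size
    sx = img_w / page_w  # pixels per point, horizontal
    sy = img_h / page_h  # pixels per point, vertical
    x0, y0, x1, y1 = bbox_pt
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)


# Hypothetical sizes: a 500 x 1000 pt page rendered to a 1000 x 2000 px image.
bbox_pt = image_bbox_to_page_bbox((100, 200, 300, 400), (1000, 2000), (500, 1000))
print(bbox_pt)  # (50.0, 100.0, 150.0, 200.0)
```

Deriving the scale from the two sizes avoids guessing the render DPI, and using separate horizontal and vertical factors absorbs rounding in the image dimensions. Rotated pages or a CropBox with a non-zero origin would need an extra transform that this sketch does not cover.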