Code block recognition and formatting #4691

@WZZMMM

Description

🔎 Search before asking

  • I have searched the MinerU Readme and found no similar bug report.
  • I have searched the MinerU Issues and found no similar bug report.
  • I have searched the MinerU Discussions and found no similar bug report.

🤖 Consult the online AI assistant for assistance

Description of the bug

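Code blocks inside a PDF document are recognized as "code description",
but they are not placed inside a code block in the Markdown output.
Some line breaks are still missing, and indentation is mostly lost.
The code line numbers, however, are retained.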
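
To make the symptom easier to spot in converted output, here is a minimal sketch in Python. It is not part of MinerU and does not call any MinerU API; the function name `find_unfenced_numbered_runs` and the `min_len` threshold are hypothetical. It assumes a retained line number appears as a leading integer followed by whitespace, and it flags runs of consecutively numbered lines that sit outside any fenced code block.

```python
import re

# Assumption: a retained line number looks like "12 some code" at the start of a line.
NUMBERED = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def find_unfenced_numbered_runs(markdown: str, min_len: int = 3):
    """Return (first, last) 0-based line indices of runs of at least
    min_len consecutive lines, outside fenced code blocks, whose leading
    numbers increase by exactly one."""
    runs = []
    in_fence = False
    run_start = None
    prev_num = None
    lines = markdown.splitlines()
    # The trailing empty string is a sentinel that flushes a run ending at EOF.
    for i, line in enumerate(lines + [""]):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            match = None
        else:
            match = None if in_fence else NUMBERED.match(line)
        num = int(match.group(1)) if match else None
        if num is not None and prev_num is not None and num == prev_num + 1:
            pass  # the current run continues
        else:
            # The run (if any) ended on the previous line; keep it if long enough.
            if run_start is not None and i - run_start >= min_len:
                runs.append((run_start, i - 1))
            run_start = i if num is not None else None
        prev_num = num
    return runs


sample = "Intro\n1 def f(x):\n2 return x + 1\n3 print(f(2))\nOutro"
print(find_unfenced_numbered_runs(sample))
```

A numbered list in ordinary prose would also match this pattern, so any hit is only a hint that a code block may have been emitted as plain text.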

How to reproduce the bug

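  1. Take any LaTeX-typeset PDF document that contains code blocks.
  2. Upload it through the API or the desktop client, and select MinerU VLM as the model.
  3. Inspect the converted Markdown document.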

Operating System Mode

No response

Operating System Version

Windows 11

Python version

No response

Software version (mineru --version)

No response

Backend name

No response

Device mode

No response

Metadata


Labels: bug