Releases: opendatalab/MinerU
mineru-2.7.6-released
What's Changed
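- 2026/02/06 2.7.6 Release
  - Added support for the domestic computing platforms Kunlunxin and Tecorigin, extending the set of domestic computing platforms already adapted and supported by the MinerU team and hardware vendors.
  - MinerU continues to maintain compatibility with domestic hardware platforms and supports mainstream chip architectures, helping research, government, and enterprise users digitize documents with secure and reliable technology.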
Full Changelog: mineru-2.7.5-released...mineru-2.7.6-released
mineru-2.7.5-released
What's Changed
- Fixed an issue where PDF rendering timeout detection failed under certain conditions.
Full Changelog: mineru-2.7.4-released...mineru-2.7.5-released
mineru-2.7.4-released
What's Changed
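- 2026/01/30 2.7.4 Release
  - Added support for the domestic computing platforms IluvatarCorex and Cambricon, extending the set of domestic computing platforms already adapted and supported by the MinerU team.
  - MinerU continues to maintain compatibility with domestic hardware platforms and supports mainstream chip architectures, helping research, government, and enterprise users digitize documents with secure and reliable technology.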
New Contributors
- @pgoslatara made their first contribution in #4421
- @Copilot made their first contribution in #4434
- @guguducken made their first contribution in #4435
Full Changelog: mineru-2.7.3-released...mineru-2.7.4-released
mineru-2.7.3-released
mineru-2.7.2-released
What's Changed
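- 2026/01/23 2.7.2 Release
  - Added support for the domestic computing platforms Hygon, Enflame, and Moore Threads, extending the set of domestic computing platforms already adapted and supported by the MinerU team.
  - MinerU continues to maintain compatibility with domestic hardware platforms and supports mainstream chip architectures, helping research, government, and enterprise users digitize documents with secure and reliable technology.
  - Optimized cross-page table merging, improving both the merge success rate and the merge quality.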
New Contributors
- @tommygood made their first contribution in #4365
Full Changelog: mineru-2.7.1-released...mineru-2.7.2-released
mineru-2.7.1-released
What's Changed
- 2026/01/06 2.7.1 Release
  - Fixed bug #4300.
  - Updated the pdfminer.six dependency version to resolve CVE-2025-64512.
  - Added automatic correction of input image EXIF orientation to improve OCR recognition accuracy (#4283).
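To illustrate what EXIF orientation correction involves, here is a minimal sketch that maps each EXIF orientation value (1 to 8) to the transform that makes the image upright. It is not MinerU's implementation: the function names and the nested-list "image" are hypothetical stand-ins chosen so the example runs with the standard library only. In practice an imaging library would apply the same mapping to real pixel data (Pillow's `ImageOps.exif_transpose` is one such helper).

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # toy stand-in for an image: rows of pixel values


def flip_left_right(g: Grid) -> Grid:
    return [row[::-1] for row in g]


def flip_top_bottom(g: Grid) -> Grid:
    return [row[:] for row in g[::-1]]


def rotate_180(g: Grid) -> Grid:
    return [row[::-1] for row in g[::-1]]


def transpose(g: Grid) -> Grid:
    # Mirror across the main diagonal (rows become columns).
    return [list(col) for col in zip(*g)]


def rotate_90_cw(g: Grid) -> Grid:
    return [list(col) for col in zip(*g[::-1])]


def rotate_90_ccw(g: Grid) -> Grid:
    return [list(col) for col in zip(*g)][::-1]


def transverse(g: Grid) -> Grid:
    # Mirror across the anti-diagonal.
    return transpose(rotate_180(g))


# EXIF orientation value -> transform that makes the stored image upright.
# Value 1 (or a missing tag) means the image is already upright.
CORRECTIONS: Dict[int, Callable[[Grid], Grid]] = {
    2: flip_left_right,
    3: rotate_180,
    4: flip_top_bottom,
    5: transpose,
    6: rotate_90_cw,
    7: transverse,
    8: rotate_90_ccw,
}


def correct_orientation(g: Grid, orientation: int) -> Grid:
    """Return an upright copy of g for the given EXIF orientation value."""
    fix = CORRECTIONS.get(orientation)
    return fix(g) if fix else [row[:] for row in g]
```

Text photographed with a rotated camera is stored sideways with orientation 6 or 8, so applying the correction before OCR gives the recognizer upright glyphs.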
New Contributors
- @kingdomad made their first contribution in #4283
Full Changelog: mineru-2.7.0-released...mineru-2.7.1-released
mineru-2.7.0-released
What's Changed
- 2025/12/30 2.7.0 Release
  - Simplified the installation process. The `vlm` acceleration engine dependencies no longer need to be installed separately: running `uv pip install mineru[all]` installs the dependencies for all optional backends.
  - Added a new `hybrid` backend, which combines the advantages of the `pipeline` and `vlm` backends. Built on `vlm`, it integrates some capabilities of `pipeline`, adding extra extensibility on top of high accuracy:
    - Extracts text directly from text PDFs, natively supports multi-language recognition in text PDF scenarios, and greatly reduces parsing hallucinations.
    - Supports text recognition in 109 languages for scanned PDFs when the OCR language is specified.
    - Provides an independent switch for inline formula recognition, which can be turned off when inline formula recognition is not needed, improving the visual quality of the parsing results.
  - Simplified the engine selection logic for the `vlm`/`hybrid` backends. Users only need to specify the backend as `*-auto-engine`, and the system automatically selects an appropriate inference acceleration engine for the current environment.
  - Switched the default parsing backend from `pipeline` to `hybrid-auto-engine`, giving new users more consistent out-of-the-box results and avoiding confusion over differing parsing results.
  - Added i18n support to the gradio application, which can now switch between Chinese and English.
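As a rough illustration of backend selection, the sketch below assembles a `mineru` command line that names the backend explicitly. The `-p`, `-o`, and `-b` flags and the file names are assumptions that are not stated in these notes, so check them against `mineru --help` before relying on them. The helper `build_mineru_command` is hypothetical and only builds the argument list; it does not run MinerU.

```python
import shlex
from typing import List


def build_mineru_command(
    input_path: str,
    output_dir: str,
    backend: str = "hybrid-auto-engine",  # default backend named in the 2.7.0 notes
) -> List[str]:
    """Assemble a mineru invocation as an argument list.

    Assumption: -p (input path), -o (output directory) and -b (backend)
    are the CLI flags; verify with `mineru --help`.
    """
    return ["mineru", "-p", input_path, "-o", output_dir, "-b", backend]


# Rely on the default backend, or name one explicitly.
default_cmd = build_mineru_command("paper.pdf", "out")
pipeline_cmd = build_mineru_command("paper.pdf", "out", backend="pipeline")

# shlex.join quotes arguments safely for display or logging.
print(shlex.join(default_cmd))
print(shlex.join(pipeline_cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once MinerU is installed. When installing from a shell such as zsh, quoting the extras (`uv pip install "mineru[all]"`) keeps the square brackets from being treated as a glob pattern.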
Full Changelog: mineru-2.6.8-released...mineru-2.7.0-released
mineru-2.6.8-released
mineru-2.6.7-released
mineru-2.6.6-released
What's Changed
- 2025/12/02 2.6.6 Release
  - Ascend adaptation optimizations
    - Optimized the command-line tool initialization process so that the `vlm-vllm-engine` backend can be used from the command-line tool in the Ascend adaptation scheme.
    - Updated the adaptation documentation for Atlas 300I Duo (310p) devices.
  - `mineru-api` tool optimizations
    - Added descriptive text to the `mineru-api` interface parameters to improve API documentation readability.
    - The environment variable `MINERU_API_ENABLE_FASTAPI_DOCS` controls whether the auto-generated interface documentation page is enabled (enabled by default).
    - Added concurrency configuration options for the `vlm-vllm-async-engine`, `vlm-lmdeploy-engine`, and `vlm-http-client` backends. The environment variable `MINERU_API_MAX_CONCURRENT_REQUESTS` sets the maximum number of concurrent API requests (unlimited by default).
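The two environment variables above can be pictured with a small sketch of how a service might read such settings and cap concurrency. This is not MinerU's code. The accepted values ("0", "false", "no", "off" to disable the docs page; an empty or non-positive number meaning unlimited) are assumptions made for illustration, and `docs_enabled`, `max_concurrent_requests`, and `run_limited` are hypothetical helper names.

```python
import asyncio
import os
from typing import Awaitable, Callable, List, Mapping, Optional, Sequence


def docs_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Docs page is on unless the variable holds a 'false'-like value (assumed set)."""
    raw = env.get("MINERU_API_ENABLE_FASTAPI_DOCS", "1")
    return raw.strip().lower() not in {"0", "false", "no", "off"}


def max_concurrent_requests(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the request cap, or None for unlimited (the documented default)."""
    raw = env.get("MINERU_API_MAX_CONCURRENT_REQUESTS", "").strip()
    if not raw:
        return None
    value = int(raw)
    return value if value > 0 else None  # assumption: <= 0 also means unlimited


async def run_limited(
    jobs: Sequence[Callable[[], Awaitable[int]]],
    limit: Optional[int],
) -> List[int]:
    """Run all jobs, letting at most `limit` of them execute at the same time."""
    semaphore = asyncio.Semaphore(limit) if limit else None

    async def guarded(job: Callable[[], Awaitable[int]]) -> int:
        if semaphore is None:
            return await job()
        async with semaphore:  # waits here while `limit` jobs are already running
            return await job()

    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

A semaphore is a common way to enforce such a cap in an async server: requests beyond the limit wait for a free slot instead of being rejected.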
Full Changelog: mineru-2.6.5-released...mineru-2.6.6-released