Improve AMD ROCm Pipeline Performance & Update Dify Plugin for Hybrid Mode Support #4414

ChenxiWu-Lab · 2026-01-25T12:49:31Z


📌 Background / Context

Hi MinerU team,

First of all, thank you for the great work on MinerU.
I am currently deploying MinerU in a production-like research environment using AMD ROCm GPUs, and I would like to request two improvements that would significantly enhance usability and performance on non-CUDA platforms.

My current setup (for reference):

CPU: AMD Threadripper 3970X (32C / 64T)

GPU: AMD Radeon R9700 (ROCm, 32GB VRAM)

RAM: 64GB

Deployment: Docker (ROCm image)

Use case: Large PDF document parsing (Layout + OCR + Tables)

MinerU mode: Pipeline / VLM / Hybrid (attempted)

🚀 Request 1: Improve AMD ROCm Pipeline Compatibility & Performance
Problem

When running MinerU pipeline mode on AMD GPUs, I consistently observe the following MIOpen warnings during Layout Predict and OCR stages:

MIOpen(HIP): Warning [IsEnoughWorkspace] Solver ,
workspace required: XXX, provided ptr: 0 size: 0

This happens even with:

privileged: true

ipc: host

memlock: -1

A large shm_size

MIOPEN_WORKSPACE_LIMIT and MIOPEN_MAX_WORKSPACE_SIZE set explicitly

A persistent MIOpen cache & DB

Clearly sufficient GPU memory (30+ GB available)
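One quick sanity check for the settings listed above is to confirm that the variables actually reach the process inside the container. The following is a minimal sketch: the variable names are copied from this post, the helper `report_env` is hypothetical, and the snippet does not verify that MIOpen honors either variable.

```python
import os
from typing import Dict, Mapping, Optional, Sequence

# Names copied from the post above. Whether MIOpen reads them is NOT
# checked here; this only shows what the current process can see.
MIOPEN_VARS = ("MIOPEN_WORKSPACE_LIMIT", "MIOPEN_MAX_WORKSPACE_SIZE")


def report_env(
    env: Mapping[str, str], names: Sequence[str] = MIOPEN_VARS
) -> Dict[str, Optional[str]]:
    """Return {name: value or None} for each variable of interest."""
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    # Run this inside the container, in the same environment as the service.
    for name, value in report_env(os.environ).items():
        print(f"{name} = {value if value is not None else '<unset>'}")
```

If a variable shows up as unset here, the problem is in how the container is launched rather than in kernel selection.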

From testing and profiling, it appears that:

Certain Layout / OCR kernels fall back to no-workspace solvers

This results in significantly slower inference on ROCm, even though hardware resources are available

The issue seems to originate at the framework / kernel selection level, not Docker or cgroup limits

Expected / Requested Improvements

Better ROCm-specific kernel selection for:

Layout detection

OCR detection / recognition

Improved MIOpen workspace usage where possible

Optional ROCm tuning presets for pipeline mode (batch size, solver hints, etc.)

Even a 10–20% improvement here would make a big difference for AMD users.

🔀 Request 2: Update Dify Plugin to Support Hybrid (VLM + Pipeline) Mode
Problem

MinerU already supports a hybrid architecture (VLM for layout + pipeline for OCR), which is extremely valuable for balancing accuracy vs performance.

However, in the current official Dify plugin:

Hybrid-related parameters are not exposed

Passing hybrid flags via variables:

Works only occasionally

Often falls back silently to full VLM mode

This makes it very difficult to reliably use hybrid mode in automated workflows

Requested Improvements

Update the official Dify plugin to:

Expose hybrid / pipeline / VLM selection explicitly

Allow stable configuration of:

Layout via VLM

OCR via pipeline

Ensure the plugin behavior matches the latest MinerU backend capabilities

This would greatly improve MinerU’s usability in workflow-based deployments and production systems.
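To illustrate what "explicit selection without silent fallback" could look like, here is a minimal sketch. It is not the Dify plugin's code: `ParseMode` and `resolve_mode` are hypothetical names, and the three mode strings are taken from this post, so the real plugin or backend may use different identifiers.

```python
from enum import Enum


class ParseMode(str, Enum):
    # Mode names as used in this post; the actual backend identifiers
    # may differ.
    PIPELINE = "pipeline"
    VLM = "vlm"
    HYBRID = "hybrid"


def resolve_mode(requested: str) -> ParseMode:
    """Map a user-supplied string to a mode, raising instead of defaulting."""
    key = requested.strip().lower()
    try:
        return ParseMode(key)
    except ValueError:
        allowed = ", ".join(m.value for m in ParseMode)
        raise ValueError(
            f"unknown mode {requested!r}; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    print(resolve_mode("Hybrid").value)
```

The design point is that an unrecognized or misspelled value raises an error the workflow can see, rather than quietly running in full VLM mode.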

💡 Why This Matters

AMD ROCm users are increasingly common in research and on-prem deployments

Pipeline + Hybrid mode is the most cost-effective way to scale MinerU

These improvements would:

Increase performance

Reduce GPU cost

Broaden MinerU’s hardware ecosystem

I’m happy to provide logs, benchmarks, or help test changes if needed.

Thanks again for your work on MinerU 🙏

myhloli (Maintainer) · 2026-01-26T02:31:19Z

#3662

3 replies

ChenxiWu-Lab Jan 26, 2026
Author

Thank you for your reply. I will try the method you sent. However, I still hope that official support for AMD GPUs and ROCm can be improved (after all, the project has started to support so much Chinese domestic compute hardware). Also, in case it was missed, there is an update request for the Dify plugin as well. Currently, the MinerU plugin for Dify is only available for versions prior to 2.6, and the current hybrid mode cannot be selected (although it seems this can be worked around by manually passing in variables).

healy-hub Jan 26, 2026

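Thank you for your reply. I will try the method you sent. However, I still hope that official support for AMD GPUs and ROCm can be improved (after all, the project has started to support so much Chinese domestic compute hardware). Also, in case it was missed, there is an update request for the Dify plugin as well. Currently, the MinerU plugin for Dify is only available for versions prior to 2.6, and the current hybrid mode cannot be selected (although it seems this can be worked around by manually passing in variables).

I just solved this problem: a full-speed implementation of the ROCm RDNA pipeline backend, with accelerated layout and OCR. I need to tidy things up before publishing a tutorial. The earlier vLLM work was also mine. For now, here is the speed of the optimized pipeline backend on a 7900 XTX, from a timing run I just did on a 200-page PDF: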
Layout Predict: 100%|█████████████████████████████████| 200/200 [00:10<00:00, 18.81it/s]
MFD Predict: 100%|██████████████████████████████████| 200/200 [00:09<00:00, 21.82it/s]
MFR Predict: 100%|██████████████████████████████████| 430/430 [00:04<00:00, 106.36it/s]
Table-ocr det: 100%|█████████████████████████████████| 142/142 [00:01<00:00, 127.44it/s]
Table-ocr rec ch: 100%|███████████████████████████████| 881/881 [00:02<00:00, 409.11it/s]
Table-wireless Predict: 100%|████████████████████████████| 141/141 [00:01<00:00, 71.38it/s]
Table-wired Predict: 100%|████████████████████████████| 117/117 [00:03<00:00, 30.32it/s]
OCR-det Predict: 100%|██████████████████████████████| 200/200 [00:14<00:00, 14.22it/s]
Processing pages: 100%|█████████████████████████████| 200/200 [00:08<00:00, 24.86it/s]
OCR-rec Predict: 100%|██████████████████████████████| 20/20 [00:00<00:00, 422.94it/s]
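The progress lines above can be turned into comparable throughput numbers. The sketch below parses tqdm-style lines and normalizes both `it/s` and `s/it` to items per second. The helper `items_per_second` is hypothetical, and the two sample lines are shortened copies of the "Layout Predict" lines from this thread (the optimized 7900 XTX run and the original poster's R9700 run).

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Layout Predict: 100%|███| 200/200 [00:10<00:00, 18.81it/s]"
TQDM_LINE = re.compile(
    r"(?P<stage>[^:|]+):\s*\d+%\|.*?\|\s*\d+/\d+\s+"
    r"\[[\d:]+<[^,]*,\s*(?P<rate>\d+(?:\.\d+)?)(?P<unit>it/s|s/it)\]"
)


def items_per_second(line: str) -> Optional[Tuple[str, float]]:
    """Return (stage, items per second), or None if the line does not match."""
    m = TQDM_LINE.search(line)
    if m is None:
        return None
    rate = float(m.group("rate"))
    if m.group("unit") == "s/it":
        # tqdm switches to seconds-per-item when a stage is slow.
        rate = 1.0 / rate if rate > 0 else 0.0
    return m.group("stage").strip(), rate


if __name__ == "__main__":
    optimized = "Layout Predict: 100%|████| 200/200 [00:10<00:00, 18.81it/s]"
    baseline = "Layout Predict: 100%|████| 42/42 [00:46<00:00, 1.12s/it]"
    _, fast = items_per_second(optimized)
    _, slow = items_per_second(baseline)
    print(f"optimized: {fast:.2f} it/s, baseline: {slow:.2f} it/s")
    print(f"ratio: {fast / slow:.1f}x")
```

For the layout stage this works out to a ratio of about 21x, but the two runs used different GPUs and different documents, so treat it as indicative only.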

ChenxiWu-Lab Jan 26, 2026
Author

Wow, thank you so much! Looking forward to your tutorial.
I've also tried optimizing it myself, but the results weren't ideal...

INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2026-01-26 04:32:43.260 | WARNING | mineru.utils.pdf_page_id:get_end_page_id:8 - end_page_id is out of range, use images length
Start MinerU FastAPI Service: http://0.0.0.0:8000
API documentation: http://0.0.0.0:8000/docs
Creating new Ultralytics Settings v0.0.6 file ✅
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
2026-01-26 04:32:50.426 | INFO | mineru.backend.pipeline.pipeline_analyze:doc_analyze:129 - Batch 1/1: 42 pages/42 pages
2026-01-26 04:32:50.518 | INFO | mineru.backend.pipeline.pipeline_analyze:batch_image_analyze:189 - GPU Memory: 30 GB, Batch Ratio: 16.
2026-01-26 04:32:50.519 | INFO | mineru.backend.pipeline.model_init:init:209 - DocAnalysis init, this may take some times......
2026-01-26 04:32:57.241 | INFO | mineru.backend.pipeline.model_init:init:271 - DocAnalysis init done!
2026-01-26 04:32:57.241 | INFO | mineru.backend.pipeline.pipeline_analyze:custom_model_init:65 - model init cost: 6.722492218017578
Layout Predict: 0%| | 0/42 [00:00<?, ?it/s]
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 245760000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 245760000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 245760000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 245760000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 44236800, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 44236800, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 44236800, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 44236800, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 178176000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 178176000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 178176000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 178176000, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 32071680, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 32071680, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 32071680, provided ptr: 0 size: 0
MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 32071680, provided ptr: 0 size: 0
Layout Predict: 100%|██████████| 42/42 [00:46<00:00, 1.12s/it]
MFD Predict: 100%|██████████| 42/42 [00:58<00:00, 1.40s/it]
MFR Predict: 100%|██████████| 70/70 [00:09<00:00, 7.76it/s]
Table-ocr det: 100%|██████████| 18/18 [00:06<00:00, 2.72it/s]
Table-ocr rec ch: 98%|█████████▊| 2718/2762 [00:36<00:08, 5.28it/s]
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
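To make a log like the one above easier to share, the MIOpen warnings can be summarized by requested workspace size. This is a minimal sketch that assumes only the line format shown in this thread; the function name `summarize_workspace_warnings` is hypothetical.

```python
import re
from collections import Counter

# Matches the single-line warning format shown in the log above.
WARNING = re.compile(
    r"MIOpen\(HIP\): Warning \[IsEnoughWorkspace\].*?"
    r"workspace required: (?P<required>\d+), "
    r"provided ptr: \S+ size: (?P<provided>\d+)"
)


def summarize_workspace_warnings(log_text: str) -> Counter:
    """Count warnings per requested workspace size in bytes."""
    counts: Counter = Counter()
    for m in WARNING.finditer(log_text):
        counts[int(m.group("required"))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] "
        "Solver , workspace required: 245760000, provided ptr: 0 size: 0\n"
        "MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] "
        "Solver , workspace required: 245760000, provided ptr: 0 size: 0\n"
    )
    for size, n in sorted(summarize_workspace_warnings(sample).items(), reverse=True):
        print(f"{size / 2**20:.1f} MiB requested, {n} warning(s)")
```

In the log above there are four distinct requested sizes, each appearing four times, and the largest (245760000 bytes) is roughly 234 MiB. That is far below the 30 GB reported as available, which fits the observation that memory capacity is not the limiting factor.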
