# Single GPU memory optimization (reduce if VRAM insufficient)
-      # --cache-max-entry-count 0.5 # Try 0.4 or lower if issues persist
+      # --data-parallel-size 2 # If using multiple GPUs, increase throughput using vllm's multi-GPU parallel mode
+      # --gpu-memory-utilization 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
     ulimits:
       memlock: -1
       stack: 67108864
@@ -58,21 +40,11 @@ services:
       MINERU_MODEL_SOURCE: local
     entrypoint: mineru-api
     command:
-      # ==================== Server Configuration ====================
-      # --cache-max-entry-count 0.5 # Try 0.4 or lower if VRAM insufficient
+      # parameters for vllm-engine
+      # --data-parallel-size 2 # If using multiple GPUs, increase throughput using vllm's multi-GPU parallel mode
+      # --gpu-memory-utilization 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
     ulimits:
       memlock: -1
       stack: 67108864
@@ -96,30 +68,14 @@ services:
       MINERU_MODEL_SOURCE: local
     entrypoint: mineru-gradio
     command:
-      # ==================== Gradio Server Configuration ====================
-      # WARNING: Only ONE engine can be enabled at a time!
-
-      # Option 1: vLLM Engine (recommended for most users)
-      --enable-vllm-engine true
-      # Multi-GPU configuration
-      # --data-parallel-size 2
-      # Single GPU memory optimization
-      # --gpu-memory-utilization 0.5 # Try 0.4 or lower if VRAM insufficient
-
-      # Option 2: LMDeploy Engine
-      # --enable-lmdeploy-engine true
-      # Multi-GPU configuration
-      # --dp 2
-      # Single GPU memory optimization
-      # --cache-max-entry-count 0.5 # Try 0.4 or lower if VRAM insufficient
+      --enable-vllm-engine true # Enable the vllm engine for Gradio
+      # --enable-api false # If you want to disable the API, set this to false
+      # --max-convert-pages 20 # If you want to limit the number of pages for conversion, set this to a specific number
+      # parameters for vllm-engine
+      # --data-parallel-size 2 # If using multiple GPUs, increase throughput using vllm's multi-GPU parallel mode
+      # --gpu-memory-utilization 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
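
Note on `--gpu-memory-utilization`: the comments in this change say that lowering the value reduces the KV cache size. A minimal sketch of that arithmetic follows, assuming the setting acts as a fraction of total VRAM and that the KV cache receives whatever remains after model weights and activations. The function name, the 24 GiB card, and the 3 GiB weight footprint are hypothetical values picked for illustration; none of them come from this change or from measurements of MinerU.

```python
def kv_cache_budget_gib(total_vram_gib: float,
                        gpu_memory_utilization: float,
                        weights_gib: float,
                        activation_gib: float = 0.0) -> float:
    """Rough estimate of the memory left for the KV cache, in GiB.

    Simplified model (an assumption, not the engine's exact accounting):
    budget = total VRAM * utilization fraction, minus weights and activations.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    budget = total_vram_gib * gpu_memory_utilization
    # Clamp at zero: a negative remainder means the model would not fit at all.
    return max(budget - weights_gib - activation_gib, 0.0)


if __name__ == "__main__":
    # Hypothetical single-GPU setup: 24 GiB card, 3 GiB of model weights.
    for fraction in (0.5, 0.4):
        remaining = kv_cache_budget_gib(24.0, fraction, 3.0)
        print(f"--gpu-memory-utilization {fraction}: ~{remaining:.1f} GiB for KV cache")
```

Under these assumptions, each step down in the fraction lowers the KV cache budget by the same share of total VRAM, which is why the comment suggests trying `0.4` or below only if `0.5` still runs out of memory.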