Description
I have searched the existing issues, both open and closed, to make sure this is not a duplicate report.
- Yes
The bug
I have over 30k photos and videos on my Immich server, which has no GPU. So after I upload a batch of fresh photos, I run immich-ml on my PC with an RTX 4070 Super via Docker Desktop, and it crunches through face detection and smart search in mere minutes. OCR, however, runs like a turtle, so I left it going overnight.
When I came back 5 hours later, the container had eaten all my resources:
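For context, the server is pointed at the remote GPU box through the machine-learning URL setting; if I remember the variable name right, it looks something like this (the IP and port are illustrative, not my actual values):

```shell
# .env on the Immich server, pointing it at the remote immich-ml container
# (host IP is a placeholder)
IMMICH_MACHINE_LEARNING_URL=http://192.168.1.50:3003
```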
- CPU: 99.97% (AMD Ryzen 5600X 6c/12t)
- RAM (container): 12.32GB / 15.58GB
- GPU VRAM: 11.7/12GB Dedicated, 15.5/16GB Shared, 27.2/28GB Total memory (ReBAR?)
You can see from the logs that it all happened in just one hour.
This behavior is not isolated to Docker Desktop, WSL, or CUDA. When Immich switched to its internal ML server, it maxed out a 48c/96t Xeon CPU, and the load did not drop even after I cancelled the OCR job and cleared the queue. The only fix was to restart the ML container.
Used model: PP-OCRv5_server. (I will test the mobile version and report its performance later. UPD: PP-OCRv5_mobile does not have this issue.)
Possibly unrelated, but I couldn't run OCR in parallel (concurrency > 1); the ML worker errors with:
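For scale, the failed allocation in the log below is 3,706,060,800 bytes. A rough back-of-the-envelope, assuming the failing Concat output is fp32 with 256 feature channels at quarter resolution (typical of DBNet-style text detectors; these model internals are my assumption, not something confirmed in the log), suggests what input size could request a buffer that big:

```python
# Rough estimate: what input image implies a 3.7 GB fp32 tensor of shape
# (1, 256, H/4, W/4)? Channel count and stride here are assumptions.
alloc_bytes = 3_706_060_800   # from the ORT error message
elements = alloc_bytes // 4   # fp32 -> 4 bytes per element
spatial = elements // 256     # assumed 256 concatenated channels
pixels = spatial * 16         # quarter resolution: (H/4)*(W/4) -> H*W
print(pixels / 1e6)           # ~57.9 megapixels
```

In other words, a single very large photo or panorama fed to the detector at (near) full resolution could plausibly account for an allocation of this size; capping the detector's maximum side length would avoid it.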
[E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running FusedConv node. Name:'Conv.0' Status Message: CUDA error cudaErrorStreamCaptureUnsupported:operation not permitted when stream is capturing
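cudaErrorStreamCaptureUnsupported usually means something was attempted while a CUDA graph was being captured, and graph capture does not tolerate concurrent runs on the same session. This is only my guess at the cause, but for reference, disabling capture at the ONNX Runtime level would look roughly like this ("enable_cuda_graph" is a documented ORT CUDA EP option; whether immich-ml actually enables it is an assumption on my part):

```python
# Hypothetical sketch: CUDA EP provider options with graph capture disabled.
# Whether immich-ml sets enable_cuda_graph at all is an assumption.
provider_options = {"device_id": "0", "enable_cuda_graph": "0"}
providers = [
    ("CUDAExecutionProvider", provider_options),
    "CPUExecutionProvider",
]
# With onnxruntime-gpu installed, the session would then be created as:
# session = onnxruntime.InferenceSession(model_path, providers=providers)
```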
The OS that Immich Server is running on
Debian 13 (via docker compose)
Version of Immich Server
v2.2.0
Version of Immich Mobile App
irrelevant
Platform with the issue
- Server
- Web
- Mobile
Device make and model
No response
Your docker-compose.yml content
# ML worker compose content
name: immich
services:
  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, rocm, openvino, rknn] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://docs.immich.app/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, rocm, openvino, openvino-wsl, rknn] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    environment:
      - IMMICH_VERSION=v2
    restart: always
    healthcheck:
      disable: false
    ports:
      - 3003:3003
volumes:
  model-cache:
Your .env content
IMMICH_VERSION=v2
Reproduction steps
described above
Relevant log output
immich_machine_learning | [11/01/25 00:49:08] INFO Setting execution providers to
immich_machine_learning | ['CUDAExecutionProvider', 'CPUExecutionProvider'],
immich_machine_learning | in descending order of preference
immich_machine_learning | [11/01/25 00:49:08] INFO Using engine_name: onnxruntime
immich_machine_learning | 2025-11-01 01:39:54.138027723 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Concat node. Name:'Concat.16' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 3706060800
immich_machine_learning |
immich_machine_learning | [11/01/25 01:39:54] ERROR Exception in ASGI application
immich_machine_learning |
immich_machine_learning | ╭─────── Traceback (most recent call last) ───────╮
immich_machine_learning | │ /opt/venv/lib/python3.11/site-packages/rapidocr │
immich_machine_learning | │ /inference_engine/onnxruntime/main.py:90 in │
immich_machine_learning | │ __call__ │
immich_machine_learning | │ │
immich_machine_learning | │ 87 │ def __call__(self, input_content: np. │
immich_machine_learning | │ 88 │ │ input_dict = dict(zip(self.get_in │
immich_machine_learning | │ 89 │ │ try: │
immich_machine_learning | │ ❱ 90 │ │ │ return self.session.run(self. │
immich_machine_learning | │ 91 │ │ except Exception as e: │
immich_machine_learning | │ 92 │ │ │ error_info = traceback.format │
immich_machine_learning | │ 93 │ │ │ raise ONNXRuntimeError(error_ │
immich_machine_learning | │ │
immich_machine_learning | │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
immich_machine_learning | │ ime/capi/onnxruntime_inference_collection.py:22 │
immich_machine_learning | │ 0 in run │
immich_machine_learning | │ │
immich_machine_learning | │ 217 │ │ if not output_names: │
immich_machine_learning | │ 218 │ │ │ output_names = [output.name │
immich_machine_learning | │ 219 │ │ try: │
immich_machine_learning | │ ❱ 220 │ │ │ return self._sess.run(output │
immich_machine_learning | │ 221 │ │ except C.EPFail as err: │
immich_machine_learning | │ 222 │ │ │ if self._enable_fallback: │
immich_machine_learning | │ 223 │ │ │ │ print(f"EP Error: {err!s │
immich_machine_learning | ╰─────────────────────────────────────────────────╯
immich_machine_learning | RuntimeException: [ONNXRuntimeError] : 6 :
immich_machine_learning | RUNTIME_EXCEPTION : Non-zero status code returned
immich_machine_learning | while running Concat node. Name:'Concat.16' Status
immich_machine_learning | Message:
immich_machine_learning | /onnxruntime_src/onnxruntime/core/framework/bfc_are
immich_machine_learning | na.cc:376 void*
immich_machine_learning | onnxruntime::BFCArena::AllocateRawInternal(size_t,
immich_machine_learning | bool, onnxruntime::Stream*, bool,
immich_machine_learning | onnxruntime::WaitNotificationFn) Failed to allocate
immich_machine_learning | memory for requested buffer of size 3706060800
immich_machine_learning |
immich_machine_learning |
immich_machine_learning | The above exception was the direct cause of the
immich_machine_learning | following exception:
immich_machine_learning |
immich_machine_learning | ╭─────── Traceback (most recent call last) ───────╮
immich_machine_learning | │ /usr/src/immich_ml/main.py:177 in predict │
immich_machine_learning | │ │
immich_machine_learning | │ 174 │ │ inputs = text │
immich_machine_learning | │ 175 │ else: │
immich_machine_learning | │ 176 │ │ raise HTTPException(400, "Either │
immich_machine_learning | │ ❱ 177 │ response = await run_inference(inputs │
immich_machine_learning | │ 178 │ return ORJSONResponse(response) │
immich_machine_learning | │ 179 │
immich_machine_learning | │ 180 │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/src/immich_ml/main.py:202 in run_inference │
immich_machine_learning | │ │
immich_machine_learning | │ 199 │ │ response[entry["task"]] = output │
immich_machine_learning | │ 200 │ │
immich_machine_learning | │ 201 │ without_deps, with_deps = entries │
immich_machine_learning | │ ❱ 202 │ await asyncio.gather(*[_run_inference │
immich_machine_learning | │ 203 │ if with_deps: │
immich_machine_learning | │ 204 │ │ await asyncio.gather(*[_run_infer │
immich_machine_learning | │ 205 │ if isinstance(payload, Image): │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/src/immich_ml/main.py:197 in │
immich_machine_learning | │ _run_inference │
immich_machine_learning | │ │
immich_machine_learning | │ 194 │ │ │ │ message = f"Task {entry[' │
immich_machine_learning | │ output of {dep}" │
immich_machine_learning | │ 195 │ │ │ │ raise HTTPException(400, │
immich_machine_learning | │ 196 │ │ model = await load(model) │
immich_machine_learning | │ ❱ 197 │ │ output = await run(model.predict, │
immich_machine_learning | │ 198 │ │ outputs[model.identity] = output │
immich_machine_learning | │ 199 │ │ response[entry["task"]] = output │
immich_machine_learning | │ 200 │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/src/immich_ml/main.py:215 in run │
immich_machine_learning | │ │
immich_machine_learning | │ 212 │ if thread_pool is None: │
immich_machine_learning | │ 213 │ │ return func(*args, **kwargs) │
immich_machine_learning | │ 214 │ partial_func = partial(func, *args, * │
immich_machine_learning | │ ❱ 215 │ return await asyncio.get_running_loop │
immich_machine_learning | │ 216 │
immich_machine_learning | │ 217 │
immich_machine_learning | │ 218 async def load(model: InferenceModel) -> │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/local/lib/python3.11/concurrent/futures/th │
immich_machine_learning | │ read.py:58 in run │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/src/immich_ml/models/base.py:60 in predict │
immich_machine_learning | │ │
immich_machine_learning | │ 57 │ │ self.load() │
immich_machine_learning | │ 58 │ │ if model_kwargs: │
immich_machine_learning | │ 59 │ │ │ self.configure(**model_kwargs │
immich_machine_learning | │ ❱ 60 │ │ return self._predict(*inputs) │
immich_machine_learning | │ 61 │ │
immich_machine_learning | │ 62 │ @abstractmethod │
immich_machine_learning | │ 63 │ def _predict(self, *inputs: Any, **mo │
immich_machine_learning | │ │
immich_machine_learning | │ /usr/src/immich_ml/models/ocr/detection.py:68 │
immich_machine_learning | │ in _predict │
immich_machine_learning | │ │
immich_machine_learning | │ 65 │ │ return session │
immich_machine_learning | │ 66 │ │
immich_machine_learning | │ 67 │ def _predict(self, inputs: bytes | Ima │
immich_machine_learning | │ ❱ 68 │ │ results = self.model(decode_cv2(in │
immich_machine_learning | │ 69 │ │ if results.boxes is None or result │
immich_machine_learning | │ 70 │ │ │ return self._empty │
immich_machine_learning | │ 71 │ │ return { │
immich_machine_learning | │ │
immich_machine_learning | │ /opt/venv/lib/python3.11/site-packages/rapidocr │
immich_machine_learning | │ /ch_ppocr_det/main.py:59 in __call__ │
immich_machine_learning | │ │
immich_machine_learning | │ 56 │ │ if prepro_img is None: │
immich_machine_learning | │ 57 │ │ │ return TextDetOutput() │
immich_machine_learning | │ 58 │ │ │
immich_machine_learning | │ ❱ 59 │ │ preds = self.session(prepro_img) │
immich_machine_learning | │ 60 │ │ boxes, scores = self.postprocess_ │
immich_machine_learning | │ 61 │ │ if len(boxes) < 1: │
immich_machine_learning | │ 62 │ │ │ return TextDetOutput() │
immich_machine_learning | │ │
immich_machine_learning | │ /opt/venv/lib/python3.11/site-packages/rapidocr │
immich_machine_learning | │ /inference_engine/onnxruntime/main.py:93 in │
immich_machine_learning | │ __call__ │
immich_machine_learning | │ │
immich_machine_learning | │ 90 │ │ │ return self.session.run(self. │
immich_machine_learning | │ 91 │ │ except Exception as e: │
immich_machine_learning | │ 92 │ │ │ error_info = traceback.format │
immich_machine_learning | │ ❱ 93 │ │ │ raise ONNXRuntimeError(error_ │
immich_machine_learning | │ 94 │ │
immich_machine_learning | │ 95 │ def get_input_names(self) -> List[str │
immich_machine_learning | │ 96 │ │ return [v.name for v in self.sess │
immich_machine_learning | ╰─────────────────────────────────────────────────╯
immich_machine_learning | ONNXRuntimeError: Traceback (most recent call
immich_machine_learning | last):
immich_machine_learning | File
immich_machine_learning | "/opt/venv/lib/python3.11/site-packages/rapidocr/in
immich_machine_learning | ference_engine/onnxruntime/main.py", line 90, in
immich_machine_learning | __call__
immich_machine_learning | return
immich_machine_learning | self.session.run(self.get_output_names(),
immich_machine_learning | input_dict)[0]
immich_machine_learning | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
immich_machine_learning | ^^^^^^^^^^^^^
immich_machine_learning | File
immich_machine_learning | "/opt/venv/lib/python3.11/site-packages/onnxruntime
immich_machine_learning | /capi/onnxruntime_inference_collection.py", line
immich_machine_learning | 220, in run
immich_machine_learning | return self._sess.run(output_names, input_feed,
immich_machine_learning | run_options)
immich_machine_learning | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
immich_machine_learning | ^^^^^^^^^^^^^
immich_machine_learning | onnxruntime.capi.onnxruntime_pybind11_state.Runtime
immich_machine_learning | Exception: [ONNXRuntimeError] : 6 :
immich_machine_learning | RUNTIME_EXCEPTION : Non-zero status code returned
immich_machine_learning | while running Concat node. Name:'Concat.16' Status
immich_machine_learning | Message:
immich_machine_learning | /onnxruntime_src/onnxruntime/core/framework/bfc_are
immich_machine_learning | na.cc:376 void*
immich_machine_learning | onnxruntime::BFCArena::AllocateRawInternal(size_t,
immich_machine_learning | bool, onnxruntime::Stream*, bool,
immich_machine_learning | onnxruntime::WaitNotificationFn) Failed to allocate
immich_machine_learning | memory for requested buffer of size 3706060800
Additional information
No response