Immich + NVIDIA GPU for Video Transcoding and ML #8193
Replies: 5 comments 13 replies
-
Hey, this is great - thanks for making this guide! I happen to have an NVIDIA GPU (but I find using my Intel iGPU is more efficient), so I'm going to test this and report back any issues. As for adding the option to have this configured during installation, I'm all for having it added as long as it is shown to work without too much fuss.
-
Awesome guide! I did all the steps you described and got to the point of actually running the ML, unfortunately
-
I've been testing this a bit over the past few days to see if it might be possible to add this to the script, but haven't been able to get it to work either. After I fixed some issues I had with the LD_LIBRARY_PATH etc., I ended up with a 'CUDNN_STATUS_EXECUTION_FAILED' error when trying to run. Then I noticed this issue that was opened yesterday in the Immich repo, so CUDA ML may be broken on v2.1 for some NVIDIA cards. Mine is a GTX 1060 6GB.
-
Update: seems like the Immich team has discovered that Pascal and Maxwell NVIDIA cards are not supported by cuDNN 9.11+, so they're going to pin theirs to 9.10. With that knowledge I can get back to testing; as long as I install cuDNN 9.10 I should be good.
-
I am also trying to follow this guide right now to set up transcoding. Thanks! @vhsdream
-
NVIDIA GPU setup for Immich transcoding and ML
This is a quick-and-dirty guide to get your NVIDIA GPU working with the Immich LXC, which can be used for video transcoding and ML features like facial recognition.
Warning
I'm just a hobbyist, and not in any way a developer on Immich. Please make backups of your containers and follow this guide at your own risk.
Prerequisites

A Proxmox host with a working NVIDIA driver installation (verify with `nvidia-smi`).

NVIDIA GPU LXC passthrough
Important
If you have existing data in your Immich LXC, back up your container!
In `/etc/pve/lxc/<CTID>.conf`, add these lines:

Reboot the container using `pct reboot <CTID>`.

Locate the NVIDIA Linux driver that corresponds with the NVIDIA driver version on the host (as seen in `nvidia-smi`; for me, it was `550.163.01`). Copy the URL of the `.run` file of the corresponding driver.

In the LXC, enter these commands:
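As a configuration sketch of the two steps above — CTID `100`, the PVE 8 `dev` passthrough syntax, and driver `550.163.01` are all examples; adjust them to your host and `nvidia-smi` output:

```shell
# On the Proxmox host: pass the NVIDIA device nodes into the container.
# Check `ls /dev/nvidia*` on the host and add one devN entry per node.
cat >> /etc/pve/lxc/100.conf <<'EOF'
dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
dev3: /dev/nvidia-uvm-tools
EOF
pct reboot 100

# Inside the LXC: install only the matching userspace driver;
# the kernel module is provided by the host, hence --no-kernel-module.
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.163.01/NVIDIA-Linux-x86_64-550.163.01.run
chmod +x NVIDIA-Linux-x86_64-550.163.01.run
./NVIDIA-Linux-x86_64-550.163.01.run --no-kernel-module
```
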
Reboot the container. After rebooting, you should now be able to run `nvidia-smi` from within the container:

Transcoding configuration
After following the previous steps for GPU passthrough, go to the Immich web app and navigate to Administration > Video Transcoding Settings. Set "Acceleration API" to NVENC and save the settings.

CUDA configuration for ML features
Note
For these steps I will be using CUDA 12.4. For your installation, refer to your `nvidia-smi` output to determine which version of CUDA and related packages to install.

In the LXC terminal, open the Immich ML logs:
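Using the log path and command that appear later in this guide:

```shell
tail -f --lines 100 /var/log/immich/ml.log
```
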
In the Immich webapp, upload a new image. You should start to see some logs in the LXC terminal. (I enabled OpenVINO by mistake when setting up the demo LXC, so your logs might be different.)
You might see some logs like this:
The key line is here:
We need the execution provider to be `CUDAExecutionProvider`, and to do that, we need `onnxruntime` to be able to detect our NVIDIA GPU.

Stop the Immich services:
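Assuming the systemd unit names used elsewhere in this guide (`immich-web`, `immich-ml`):

```shell
systemctl stop immich-web immich-ml
```
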
Activate the `ml-venv` uv virtual environment and use `uv pip list` to find the currently installed `onnx` and `cuda`/`cudnn` runtime DLLs:

This is my output:
If you don't see `onnxruntime-gpu` or any `cuda` packages, install the `onnxruntime-gpu` package with the `cuda` and `cudnn` extras (while the `ml-venv` virtualenv is still active):
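A sketch of that install step; the `cuda`/`cudnn` extras names follow the wording above, so double-check them against the `onnxruntime` docs for your version:

```shell
# Run inside the activated ml-venv
uv pip install 'onnxruntime-gpu[cuda,cudnn]'
```
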
You should now be able to see the installed `onnxruntime-gpu` package along with the relevant `cuda` and `cudnn` runtime DLLs:

Tip
Refer to the `onnxruntime` compatibility matrix for the compatible versions of `cuda` and `cudnn`.

Install the CUDA Toolkit for your expected CUDA version, as seen in `nvidia-smi`. For me, this was 12.4. These instructions can also be found on the NVIDIA website. Installing the keyring:

If the above doesn't work (`add-apt-repository not found`, `Depends: libtinfo5 but it is not installable`, etc.), try installing the libraries individually.

After installing, verify your CUDA Toolkit installation by updating the `PATH` and `LD_LIBRARY_PATH` variables and checking the output of `nvcc --version`. More post-installation instructions can be found here.

To update `PATH` and `LD_LIBRARY_PATH` to point to `cuda` and `cudnn` in the ML systemd service, add these lines to `/etc/systemd/system/immich-ml.service`, under the `[Service]` block:
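For example, with CUDA 12.4 — the paths here are illustrative; build the real `LD_LIBRARY_PATH` value with the command shown in the note that follows:

```ini
[Service]
Environment="PATH=/usr/local/cuda-12.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
Environment="LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:/opt/immich/app/machine-learning/ml-venv/lib/python3.11/site-packages/nvidia/cudnn/lib"
```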
Note
Your paths may look a little different depending on what CUDA version you're running. To figure out your `LD_LIBRARY_PATH` value, you can run:

`{ echo "/usr/local/cuda-12.4/lib64"; find /opt/immich/app/machine-learning/ml-venv/lib/python3.11/site-packages/nvidia -type d -name "lib"; } | paste -sd:`

Run `systemctl daemon-reload` and `systemctl start immich-ml immich-web`, or reboot the container. Open the logs again (`tail -f --lines 100 /var/log/immich/ml.log`) and upload an image again from the webapp. You should now see logs like this:
If you see `CUDAExecutionProvider` and no errors, congratulations! You just set up your NVIDIA GPU with Immich for ML.

To verify that the GPU is being used, you can run `nvidia-smi` on the host and check the active processes while uploading images and videos to Immich:

Troubleshooting
Here are a few quick checks and common fixes for ML/GPU setup issues.
No `CUDAExecutionProvider` in the execution provider list (`/var/log/immich/ml.log`)

Tip
Quick checks:
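For example (paths and unit names as used earlier in this guide):

```shell
nvidia-smi                                  # is the GPU visible inside the LXC?
nvcc --version                              # is the CUDA Toolkit on PATH?
systemctl cat immich-ml | grep Environment  # are PATH/LD_LIBRARY_PATH set in the unit?
```
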
If CUDA isn't listed, refer to the official `onnxruntime-gpu` documentation for required CUDA/cuDNN versions and setup: https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#cuda-execution-provider
Failed to load library (`/var/log/immich/ml.log`)

Important
This usually indicates missing or incorrect library paths. Verify your environment variables in `/etc/systemd/system/immich-ml.service` under the `[Service]` section:
- `PATH` includes your CUDA bin directory (for example, `/usr/local/cuda-12.x/bin`).
- `LD_LIBRARY_PATH` includes the CUDA/cuDNN library directories from the ML venv.

After making changes, reload and restart the service:
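For example:

```shell
systemctl daemon-reload
systemctl restart immich-ml
```
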
If the problem persists, check recent service logs for detailed errors:
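For example, with `journalctl` (unit name as used in this guide):

```shell
journalctl -u immich-ml -n 100 --no-pager
```
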
Specify the number of threads explicitly so the affinity is not set (`/var/log/immich/ml.log`)

Check this: #8193 (reply in thread). Also run `nvidia-smi` on the host and ensure that the GPU is not currently being used by any other LXC.

No space left on device
Increase the storage of the LXC by running this command on the PVE host. For example, to increase the storage of LXC 100 by 8 GB:
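A sketch using Proxmox's `pct resize`, assuming the container's root disk is named `rootfs`:

```shell
pct resize 100 rootfs +8G
```
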
Notes
It would be really neat to have Community Scripts ask about and configure the `onnxruntime-gpu` package with user-selected extras for GPU-supported ML features when initializing the LXC.