Description
I want to use CUDA instead of the CPU to speed up tag inference.
My machine: Ubuntu 22.04.3 LTS (GNU/Linux 6.5.0-35-generic x86_64), CUDA 12.2.
I learned from https://onnxruntime.ai/docs/install/ that, as of this writing, if you have CUDA 12 you must install with `pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/`, rather than a plain `pip install onnxruntime-gpu`, which targets CUDA 11. This took me a while to figure out; I kept getting errors that didn't make sense:
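For reference, the install sequence that eventually worked for me looked roughly like this (the index URL is the one from the ONNX Runtime install page; uninstalling first is my own precaution to make sure no CUDA 11 wheel is left behind):

```shell
# Remove any previously installed CPU or CUDA 11 wheels so they don't shadow the new one
pip uninstall -y onnxruntime onnxruntime-gpu

# Install the CUDA 12 build from the dedicated package index
pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```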
[E:onnxruntime, provider_bridge_ort.cc:1744 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
[W:onnxruntime, onnxruntime_pybind_state.cc:870 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
I had those shared objects, but after reading carefully and reinstalling per the CUDA 12 instructions above, it worked. Using CUDAExecutionProvider instead of CPUExecutionProvider did, however, produce a new warning:
[W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 12 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
Basically it is bottlenecked by CPU/GPU data transfer. I have been trying to figure out how to eliminate it, but have not yet succeeded.