How does provider inclusion and loading actually work? #18242
Replies: 1 comment
It seems that the TensorRT provider implies both building and using the CUDA provider. There is no mention of this in the build instructions, which further complicates knowing what to deliver to end users. I conclude this from the fact that build.py produces a CUDA provider DLL without being asked to, and from the error message below when I don't copy that DLL to the directory where our application runs. So the first conclusion may be wrong; perhaps it is just a dependency between the TensorRT and CUDA providers as a pair. Since much of the motivation for moving to TensorRT was that CUDA pulls in the enormously large cuDNN libraries, which we don't want to redistribute, it was a bit of a turnoff to discover that TensorRT implies the CUDA provider. It would have been nice if this were mentioned in the documentation.
We are delivering image processing solutions based on OpenCL, which now include some AI features based on onnxruntime. We want our solutions to continue working seamlessly on any GPU the end user may have. This has proven very challenging, even though onnxruntime's main offering is to run on any hardware for which there is a provider.
The first issue is that there are no binaries to download that contain providers for different hardware. At first I could not wrap my head around why this was: it seems like such a no-brainer to compile onnxruntime.dll with all providers enabled so that you get all the choices.
Now, after a lot of struggle building things on our own, I finally managed to get a Windows build with both TensorRT and CUDA. This led to the really astonishing discovery that onnxruntime.dll tries to load the provider DLLs for providers you never asked for. If you enable, for instance, TensorRT, MIGraphX and OpenVINO when building onnxruntime, ALL of the dependent DLLs for ALL of those providers will be loaded into memory even if you only call, for instance,
AppendExecutionProvider_TensorRT
after detecting that there is an NVIDIA GPU in the computer. This loading happens inside the Session constructor; in our case it failed because a DLL that onnxruntime_providers_cuda.dll depends on was missing. The library also elects to write to cout or cerr on its own, which is likewise unacceptable:
2023-11-02 10:40:09.1099128 [E:onnxruntime:ORTEngine0, provider_bridge_ort.cc:1492 onnxruntime::TryGetProviderInfo_CUDA] C:\gl\thirdpartylibs\onnxruntime\onnxruntime\core\session\provider_bridge_ort.cc:1195 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\build\honey\bin\Debug\onnxruntime_providers_cuda.dll"
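As a stopgap on the deployment side, one can probe whether a provider DLL and all of its dependencies actually load before constructing the session, instead of finding out via the error above. A minimal sketch in Python (the `can_load` helper name is my own; on Windows `ctypes.CDLL` goes through the same `LoadLibrary` call that produces error 126):

```python
import ctypes

def can_load(dll_path):
    """Return True if the library and all of its dependent libraries load.

    A missing dependent DLL surfaces here as an OSError, instead of deep
    inside the onnxruntime Session constructor.
    """
    try:
        ctypes.CDLL(dll_path)
        return True
    except OSError:
        return False
```

Checking, say, `can_load("onnxruntime_providers_cuda.dll")` up front lets the application decide whether it is even worth attempting to append the TensorRT provider.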
To me it is obvious that this call should never happen if you haven't called
AppendExecutionProvider_CUDA
so why does it happen? Is it a design decision to load all the DLLs even when explicitly told NOT to use them, or is it a bug? If it is a (bad) design decision, it would explain why there are no binaries supporting more than one GPU provider. It seems a very odd decision, though: the libraries are already loaded explicitly anyway, so it should be very easy to avoid touching the libraries of providers that were not asked for.
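For reference, the behaviour being argued for could be sketched like this (illustrative Python, not the real onnxruntime internals; the library names are modelled on the DLLs mentioned above, and the TensorRT-to-CUDA dependency observed earlier is encoded explicitly rather than as a side effect of the build flags):

```python
# Sketch of lazy provider loading: libraries are resolved only for the
# providers that were explicitly appended, plus their declared dependencies.
PROVIDER_LIBS = {
    "TensorRT": "onnxruntime_providers_tensorrt.dll",
    "CUDA": "onnxruntime_providers_cuda.dll",
    "OpenVINO": "onnxruntime_providers_openvino.dll",
}
# The TensorRT provider builds on the CUDA provider.
PROVIDER_DEPS = {"TensorRT": ["CUDA"]}

class SessionOptions:
    def __init__(self):
        self.requested = []

    def append_execution_provider(self, name):
        self.requested.append(name)

class Session:
    def __init__(self, options, load_library):
        self.loaded = []
        for name in options.requested:
            # Dependencies first, then the provider itself.
            for dep in PROVIDER_DEPS.get(name, []) + [name]:
                lib = PROVIDER_LIBS[dep]
                if lib not in self.loaded:
                    load_library(lib)  # only touched when actually requested
                    self.loaded.append(lib)
```

With only AppendExecutionProvider_TensorRT called, such a scheme would load the TensorRT and CUDA libraries but never go near the OpenVINO one, so a missing OpenVINO dependency could not break the session.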