Ideas on how to use less VRAM for ONNX models? #14472
elephantpanda asked this question in General (Unanswered)
I have some ONNX models loaded onto the GPU with InferenceSession("model.onnx") in DirectML mode.
My three ONNX files come to a total of 1.77 GB (float16).
Loading the models uses about 2.5 GB of GPU memory, and a further 5.2 GB is allocated once the models have run their first inference.
Overall this takes 8.2 GB of GPU memory.
Since the most common amount of VRAM people have is 8 GB, I need roughly a 10-20% saving in GPU memory.
Does anyone have any tips on how to decrease my VRAM usage a bit without sacrificing too much speed?
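For reference, here is a minimal sketch of the loading setup described above, assuming the onnxruntime-directml package is installed and using the hypothetical file name `model.onnx`. The session options shown are one possible starting point for trimming memory use (ONNX Runtime's DirectML execution provider does not support memory pattern optimization or parallel execution, so these are disabled explicitly), not a confirmed fix for the numbers quoted above.

```python
import onnxruntime as ort

# Session options tuned for the DirectML execution provider.
sess_options = ort.SessionOptions()

# The DirectML EP does not support memory pattern optimization or parallel
# execution; disabling them explicitly avoids errors and skips pre-planned
# activation buffer allocation based on the first inference.
sess_options.enable_mem_pattern = False
sess_options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL

# Load one of the float16 models on the GPU via DirectML,
# falling back to CPU if DirectML is unavailable.
session = ort.InferenceSession(
    "model.onnx",
    sess_options,
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
```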