-
We don't track memory usage at the session level for inferencing builds. Is this for Triton? I'll add this as a feature request.
-
Our application serves multiple models. When they cannot all fit in memory at once, we want to load as many as possible and then unload/load models on demand. For this, it would be very helpful to collect the GPU memory usage of each ORT session we create, so we can make a better decision about which sessions to destroy. Does ORT keep track of memory usage at the session level, and if so, which APIs expose it?
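Since (per the reply above) ORT does not report per-session memory, one workaround is to estimate each model's GPU footprint externally (for example, by sampling device free memory via NVML before and after session creation) and then manage sessions with a budget-based LRU cache. The sketch below is purely illustrative: `SessionCache`, `load_fn`, and `size_fn` are hypothetical names, not ORT APIs; `load_fn` would wrap something like `onnxruntime.InferenceSession(path, providers=["CUDAExecutionProvider"])`, and dropping the last reference to a session is what lets ORT release its GPU memory.

```python
from collections import OrderedDict

class SessionCache:
    """LRU cache of model sessions under a GPU-memory budget (bytes).

    Hypothetical sketch: `load_fn(name)` creates the session (e.g. an
    onnxruntime.InferenceSession) and `size_fn(name)` returns the model's
    estimated GPU footprint, measured externally (e.g. an NVML
    free-memory delta recorded around a previous load of the model).
    """

    def __init__(self, budget_bytes, load_fn, size_fn):
        self.budget = budget_bytes
        self.load_fn = load_fn
        self.size_fn = size_fn
        self.used = 0
        self.sessions = OrderedDict()  # name -> (session, size_bytes)

    def get(self, name):
        if name in self.sessions:
            self.sessions.move_to_end(name)  # mark as most recently used
            return self.sessions[name][0]
        size = self.size_fn(name)
        # Evict least-recently-used sessions until the new model fits.
        while self.sessions and self.used + size > self.budget:
            _, (old_session, old_size) = self.sessions.popitem(last=False)
            del old_session  # drop the reference so its memory can be freed
            self.used -= old_size
        session = self.load_fn(name)
        self.sessions[name] = (session, size)
        self.used += size
        return session
```

The footprint estimates only need to be roughly right for eviction ordering to work; a safety margin on the budget guards against fragmentation and estimation error.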