Replies: 1 comment
-
Same question: multi-session inference is still as slow as a single session, and it remains unsolved.
0 replies
-
Hi, I'm using the C++ onnxruntime-gpu API for model inference. I defined a class (e.g. class Model) that wraps the image input, the ONNX inference call, and the output results, and the ONNX session is created inside each Model instance. Now I want to create multiple handles, like model1 = Model(), model2 = Model(), ..., on the same GPU, so that I can use all of the GPU's memory and compute cores and run inference on several images at the same time to increase throughput. The problem is that this ends up as slow as using one handle and running the images one by one. Are there any suggestions for making full use of a single GPU's resources? Thanks.