Similar to #10608, which might be related to your GPU card. A possible reason is described here.
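If the one-time cost turns out to be CUDA's just-in-time (PTX) compilation of kernels that were not pre-built for the instance's GPU (the kind of cause discussed in #10608), one hedged mitigation sketch is to pin and enlarge the CUDA JIT cache so the compiled kernels get baked into the AMI and reused by freshly launched instances. `CUDA_CACHE_PATH` and `CUDA_CACHE_MAXSIZE` are documented CUDA environment variables; the specific path and size below are illustrative assumptions, not values from this thread.

```shell
# Sketch: persist the CUDA JIT cache at a fixed path so it survives
# into the AMI snapshot and new instances reuse the compiled kernels.
# The path and size are assumptions; adjust for your setup.
export CUDA_CACHE_PATH=/tmp/cuda-cache      # default: ~/.nv/ComputeCache
export CUDA_CACHE_MAXSIZE=2147483648        # bytes; raise it so entries aren't evicted
mkdir -p "$CUDA_CACHE_PATH"
echo "JIT cache at $CUDA_CACHE_PATH"
```

Set these before the first inference run on the "template" instance, then create the AMI; whether this removes the delay depends on the slowdown actually being PTX JIT compilation, which is worth verifying first (e.g. by checking whether the cache directory fills up during the slow first run).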
Hi, we have a GPU (g4dn.xl) EC2 instance set up to run our face-recognition AI model. An inference session usually takes 1–2 seconds per image, but whenever we launch a new instance from an AMI of an existing instance, the first inference run takes around 150 seconds; from the second run onwards it is back to 1–2 seconds. We want to understand why this happens only the first time. We are working on autoscaling for our GPU instances, and this delay affects the start-up time of new instances. Please help us understand this.
PS: I tried replacing the model with an optimized model by running a script using modelOptimizationSaveToPath. That reduced the session.run() time by 10 seconds; it now takes 140 seconds, which is still much higher than the subsequent session.run() times. Please help.
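Whatever the root cause, one common way to keep the delay out of the request path is a warm-up run at instance boot, before the instance joins the load balancer: the one-time cost (kernel JIT, cuDNN autotuning, lazy weight loading) is paid on a dummy input instead of a real request. A minimal sketch of the pattern follows; `FakeSession` is a stand-in that only simulates a session whose first `run()` is expensive, and with ONNX Runtime you would call your real `InferenceSession.run()` with a dummy image instead.

```python
# Warm-up pattern: pay one-time initialization costs at boot,
# before real traffic arrives.

class FakeSession:
    """Stand-in for an inference session whose first run() is slow.
    Replace with your real ONNX Runtime InferenceSession in practice."""
    def __init__(self):
        self.runs = 0

    def run(self, inputs):
        if self.runs == 0:
            pass  # one-time setup would happen here (e.g. the ~150 s cost)
        self.runs += 1
        return inputs

def warm_up(session, dummy_input, n_runs=2):
    """Run throwaway inferences at startup so the first real request
    sees steady-state latency. A couple of runs also exercises any
    autotuning that happens after the very first call."""
    for _ in range(n_runs):
        session.run(dummy_input)

sess = FakeSession()
warm_up(sess, dummy_input=[0.0])
# After warm-up, real requests hit the already-initialized session.
assert sess.runs == 2
```

In an autoscaling group this would run from the instance's start-up script, with the health check only reporting "ready" once warm-up completes, so scaled-out instances never serve a cold first request.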