I have a YOLOv4 ONNX model, and because it has two dynamic axes (batch and number of boxes) I am unable to run batch inference, although single-example inference works.

Model Input/Output: (screenshot not reproduced here)

So, is there a way to copy the whole input buffer to the device, run inference on slices of it in a loop, and then copy the outputs back, so as to maximize throughput? E.g.:

Array [40, 512, 512, 3] -> CopyToGPU -> Loop(inference on [1, 512, 512, 3]) -> CopyOutputToCPU
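For concreteness, a minimal sketch of this pattern, assuming the onnxruntime Python API with the CUDA execution provider, might look like the following. The model path `"yolov4.onnx"` is a placeholder, the input/output names are read from the session rather than assumed, and the shapes follow the example above:

```python
# Sketch only: assumes onnxruntime-gpu with the CUDA execution provider.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("yolov4.onnx", providers=["CUDAExecutionProvider"])
input_name = session.get_inputs()[0].name
output_names = [o.name for o in session.get_outputs()]

batch = np.zeros((40, 512, 512, 3), dtype=np.float32)  # the full input batch

# One host-to-device copy of the whole batch.
batch_gpu = ort.OrtValue.ortvalue_from_numpy(batch, "cuda", 0)
base_ptr = batch_gpu.data_ptr()   # device address of the first element
sample_bytes = batch[0].nbytes    # size of one [1, 512, 512, 3] slice

results = []
for i in range(batch.shape[0]):
    binding = session.io_binding()
    # Bind a [1, 512, 512, 3] view into the already-uploaded batch (no copy).
    binding.bind_input(
        name=input_name,
        device_type="cuda",
        device_id=0,
        element_type=np.float32,
        shape=(1, 512, 512, 3),
        buffer_ptr=base_ptr + i * sample_bytes,
    )
    # Let ORT allocate the outputs on the GPU (their boxes axis is dynamic),
    # then copy them back to host memory.
    for name in output_names:
        binding.bind_output(name, "cuda")
    session.run_with_iobinding(binding)
    results.append(binding.copy_outputs_to_cpu())
```

The per-sample outputs still come back one at a time because their box dimension varies, but the large image batch crosses the host-to-device bus only once.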
Replies: 1 comment

Hey, maybe you can use this binding: https://onnxruntime.ai/docs/api/c/struct_ort_api.html#a9a53edebf4ef062a41b0e74f9c6763ec
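That URL's fragment points at a member of the C API's OrtApi struct, though the exact function can't be recovered from the hash alone. One OrtApi entry point that fits this use case is CreateTensorWithDataAsOrtValue, which wraps caller-owned memory (including device memory) in an OrtValue without copying; roughly the Python counterpart of that idea is the explicit buffer_ptr form of IOBinding.bind_input used in the sketch above:

```python
# Hypothetical names: base_ptr and sample_bytes come from a batch uploaded earlier.
binding.bind_input(
    name=input_name, device_type="cuda", device_id=0,
    element_type=np.float32, shape=(1, 512, 512, 3),
    buffer_ptr=base_ptr + i * sample_bytes,  # zero-copy view into the batch
)
```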