No improvement in CPU inference speed of static quantized model #17814
Unanswered
albertofernandezvillan asked this question in General
Replies: 0 comments
I tested static quantization since my model is a CNN. I tried different parameters when configuring the static quantization, but none of them seem to improve CPU inference speed. Inspecting the quantized model, everything looks fine. Just to note: for inference I am not using any execution providers; I run the model with OpenVINO directly, because OpenVINO is already set up on our production machines for serving other models.
The code is more or less as follows:
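The original snippet was not included in the post, so here is a minimal sketch of what static quantization with ONNX Runtime typically looks like. The model paths, the `CalibDataReader` class, and the random calibration data are assumptions standing in for the real ones:

```python
# Minimal sketch of ONNX Runtime static quantization for a CNN model.
# Paths and the calibration reader are hypothetical placeholders.
import numpy as np
import onnxruntime
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)


class CalibDataReader(CalibrationDataReader):
    """Feeds representative inputs for calibration (random data here)."""

    def __init__(self, model_path, num_samples=100):
        session = onnxruntime.InferenceSession(
            model_path, providers=["CPUExecutionProvider"]
        )
        inp = session.get_inputs()[0]
        self.input_name = inp.name
        # Replace symbolic/dynamic dims with 1 for the calibration batch.
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        self.data = iter(
            [{self.input_name: np.random.rand(*shape).astype(np.float32)}
             for _ in range(num_samples)]
        )

    def get_next(self):
        return next(self.data, None)


quantize_static(
    "model_fp32.onnx",                 # hypothetical input path
    "model_int8.onnx",                 # hypothetical output path
    CalibDataReader("model_fp32.onnx"),
    quant_format=QuantFormat.QDQ,      # QDQ is the common format for CNNs
    per_channel=True,
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QInt8,
)
```

And a rough way to compare FP32 vs. INT8 latency, using the ONNX Runtime CPU execution provider as a baseline (run counts and paths are illustrative):

```python
# Crude latency comparison; a QDQ-quantized model still takes float32 input.
import time


def avg_latency(path, runs=50):
    sess = onnxruntime.InferenceSession(
        path, providers=["CPUExecutionProvider"]
    )
    inp = sess.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    feed = {inp.name: np.random.rand(*shape).astype(np.float32)}
    for _ in range(5):                  # warm-up runs
        sess.run(None, feed)
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, feed)
    return (time.perf_counter() - start) / runs


print("fp32:", avg_latency("model_fp32.onnx"))
print("int8:", avg_latency("model_int8.onnx"))
```

One thing worth checking in this situation: if the QDQ model is run through OpenVINO directly rather than an ONNX Runtime execution provider, any speedup depends on that runtime recognizing the Q/DQ pattern and dispatching to its integer kernels.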
My questions are: