CPU-only inference is becoming viable for small quantized models thanks to instruction-set extensions such as Intel AMX and VNNI.
It would be interesting to see how far an inference engine can be tuned to extract maximal CPU performance on small models, without requiring a GPU.
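As a starting point for such an experiment, one first needs to know whether the host CPU actually advertises these extensions. A minimal sketch, assuming Linux (where the kernel exposes feature flags in `/proc/cpuinfo`; the flag names `amx_tile`, `amx_int8`, `avx512_vnni`, and `avx_vnni` are the kernel's own):

```python
# Sketch: report whether the CPU advertises the ISA extensions that
# accelerate CPU-only quantized inference (Intel AMX tiles / int8,
# AVX-512 VNNI, AVX-VNNI). Linux-only: reads /proc/cpuinfo.

def isa_support(names=("amx_tile", "amx_int8", "avx512_vnni", "avx_vnni")):
    cpu_flags = set()
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    # "flags : fpu vme ... avx512_vnni ..." -> set of tokens
                    cpu_flags = set(line.split(":", 1)[1].split())
                    break
    except OSError:
        pass  # non-Linux host: report nothing rather than guess
    return {name: name in cpu_flags for name in names}

if __name__ == "__main__":
    for name, present in isa_support().items():
        print(f"{name}: {'yes' if present else 'no'}")
```

Inference engines such as llama.cpp and oneDNN-backed runtimes dispatch to different kernels based on exactly these flags, so this check tells you which code paths a tuning experiment could exercise on a given machine.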