I am a developer on the numba-dpex and numba-mlir projects, which extend the Numba compiler to support auto-parallelization on different types of devices, starting with current Intel CPU and GPU devices and, in the future, including non-Intel devices supported by SYCL.
I wanted to reach out to see whether you are using Numba's CPU auto-parallelization options. A cursory search for prange or the parallel=True jit option did not turn up much.
My goal is to understand potential use cases that will drive the design of our proposed user API and of the numba-mlir compiler back end, which we are developing as part of a POC to port Numba's code lowering to MLIR.