NewFeatureIdeas
New features:
- Extend the set of inference frameworks and models in regular benchmarks: TensorRT (Python API), ExecuTorch (Python API), JAX/Flax (for Google TPUs).
- Study of methods for predicting inference performance (using machine learning).
- Extend the set of hardware platforms for regular performance measurements (RISC-V, Raspberry Pi 4 8GB), if we have access to the hardware.
- Update the GUI application for creating benchmark configuration files.
- PyTorch quantization support (see the sketch after this list).
- ONNX Runtime quantization support (see the sketch after this list).
- Intel Extension for PyTorch (for iGPU) support (see the sketch after this list).
- Enable collection of layer-by-layer performance statistics for models (see the profiling sketch after this list).
  - Study the built-in profiling capabilities of the frameworks (OpenVINO – Benchmark Tool, TVM – [Performance Application Programming Interface (PAPI)](https://tvm.apache.org/docs/how_to/profile/papi.html), PyTorch, ONNX Runtime, TensorFlow, TensorFlow Lite).
  - Provide support for the main frameworks whose output is supported by the benchmarking system.
- Implement Python wrappers for the C++ version of the benchmark (for example, using pybind11).
- An application (initially a script) for visualizing inference performance results (see the plotting sketch after this list).
  - Input data: models, a set of frameworks, a set of batch sizes, and other parameters.
  - Output data: inference performance graphs built with the matplotlib Python package.
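
A minimal sketch of how PyTorch post-training dynamic quantization could be wired into the benchmark. `resnet18` from torchvision is only a placeholder for whatever model is being measured and is not part of the current system.

```python
# Sketch: post-training dynamic quantization of a PyTorch model.
# resnet18 is a placeholder; any benchmarked torch.nn.Module could be used instead.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()

# Quantize Linear layers to int8 weights; activations are quantized on the fly at inference time.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    output = quantized_model(torch.randn(1, 3, 224, 224))
print(output.shape)
```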
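
Similarly, a sketch of post-training dynamic quantization with ONNX Runtime; the model paths are placeholders.

```python
# Sketch: dynamic (weight-only) quantization of an ONNX model with ONNX Runtime.
# "model.onnx" and "model.int8.onnx" are placeholder paths.
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="model.onnx",
    model_output="model.int8.onnx",
    weight_type=QuantType.QInt8,  # store weights as signed int8
)

# The quantized model can then be benchmarked like any other ONNX model.
session = ort.InferenceSession("model.int8.onnx", providers=["CPUExecutionProvider"])
print([inp.name for inp in session.get_inputs()])
```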
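
A sketch of what iGPU inference via Intel Extension for PyTorch (IPEX) could look like. It assumes the `intel_extension_for_pytorch` package and an XPU-capable driver stack are installed, and again uses `resnet18` as a stand-in model.

```python
# Sketch: inference on an Intel GPU ("xpu" device) with Intel Extension for PyTorch.
import torch
import intel_extension_for_pytorch as ipex
import torchvision.models as models

model = models.resnet18(weights=None).eval().to("xpu")
model = ipex.optimize(model)  # apply IPEX operator/graph optimizations for inference

data = torch.randn(1, 3, 224, 224).to("xpu")
with torch.no_grad():
    output = model(data)
print(output.shape)
```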
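
For layer-by-layer statistics, the frameworks' own profilers look like the natural starting point. Below is a sketch with the built-in PyTorch profiler, whose aggregated per-operator table could later be converted into the benchmarking system's report format (the conversion itself is not shown); `resnet18` is again a placeholder model.

```python
# Sketch: per-operator (layer-by-layer) CPU timings via the built-in PyTorch profiler.
import torch
import torchvision.models as models
from torch.profiler import profile, ProfilerActivity

model = models.resnet18(weights=None).eval()
data = torch.randn(1, 3, 224, 224)

with torch.no_grad(), profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(data)

# Aggregated per-operator statistics sorted by total CPU time.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=15))
```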
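
Finally, a sketch of the visualization script: throughput versus batch size for several frameworks, plotted with matplotlib. The numbers below are made-up placeholders, not measured results; in the real script they would be read from the benchmark output files instead.

```python
# Sketch: plot throughput (FPS) versus batch size for several frameworks.
# The values below are placeholders; real data would come from the benchmark output files.
import matplotlib.pyplot as plt

batch_sizes = [1, 2, 4, 8, 16]
results = {  # framework -> FPS for each batch size (placeholder values)
    "OpenVINO": [120, 210, 340, 410, 450],
    "ONNX Runtime": [100, 180, 290, 350, 380],
    "PyTorch": [80, 150, 240, 300, 330],
}

fig, ax = plt.subplots()
for framework, fps in results.items():
    ax.plot(batch_sizes, fps, marker="o", label=framework)

ax.set_xlabel("Batch size")
ax.set_ylabel("Throughput, FPS")
ax.set_title("Inference performance (placeholder data)")
ax.legend()
ax.grid(True)
fig.savefig("inference_performance.png", dpi=150)
```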