- optimize the int8 implementation on armv7/armv8.1
- optimize the AutoKernel implementation on x86
- fix the Float32 bugs in Vulkan
- support the PaddlePaddle model type
- support the OneFlow model type
- open-source the plugin implementation for NPU (VeriSilicon NPU IP)
- open-source the plugin implementation for CUDA
- open-source the plugin implementation for TensorRT
- open-source the plugin implementation for NNIE
- add more test cases