v25.01 Public Major Release
Feat
- Add KleidiAI as third_party module
- Add NHWC FP16 kernels in CpuDirectConv
- Add support for all non-quantized data types in NEScatter
- Implement NEScatter for FP32 for all size configurations for Add/Sub/Min/Max/Update
- Add option to print time used by each iteration in the validation suite
- Support multi ISA build for macOS
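For readers unfamiliar with the scatter modes named above (Add/Sub/Min/Max/Update), the following is a minimal scalar sketch of what each mode does to a destination element. It is not the Compute Library API and not how NEScatter is implemented: `ScatterOp` and `scatter_1d` are hypothetical names introduced only for illustration, the tensors are reduced to 1D vectors, and out-of-range indices are simply skipped, which is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical enum for illustration only; not the library's own type.
enum class ScatterOp { Update, Add, Sub, Max, Min };

// Applies updates[i] to dst[indices[i]] using the chosen mode.
// Assumption: indices outside dst are ignored rather than reported.
void scatter_1d(std::vector<float>& dst,
                const std::vector<std::size_t>& indices,
                const std::vector<float>& updates,
                ScatterOp op)
{
    const std::size_t n = std::min(indices.size(), updates.size());
    for (std::size_t i = 0; i < n; ++i)
    {
        const std::size_t idx = indices[i];
        if (idx >= dst.size())
        {
            continue; // skip out-of-range index (assumed behaviour)
        }
        float&      d = dst[idx];
        const float u = updates[i];
        switch (op)
        {
            case ScatterOp::Update: d = u;              break; // overwrite
            case ScatterOp::Add:    d += u;             break; // accumulate
            case ScatterOp::Sub:    d -= u;             break; // subtract
            case ScatterOp::Max:    d = std::max(d, u); break; // keep larger
            case ScatterOp::Min:    d = std::min(d, u); break; // keep smaller
        }
    }
}
```

When an index repeats, this sketch applies the updates in order, so Add and Sub accumulate while Update keeps the last value written.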
Fix
- Performance regression in NEDeconvolutionLayer
- Performance regression in NEConvolutionLayer
- Usages of dynamic shapes in the library
- Use separate build flags for C and C++ for CMake
- Compiler error with gcc14 in 3rd party header stb_image
- Werror=noexcept compilation issue in NEScatter
- Unused tolerance_f16 in non-F16 builds
- SegFault in SME Softmax Int8 tests
- Disable pre-commit copyright validation for outside contributions
- SME2 interleaved s8 x s8 = f32 kernel mismatches
- Invalidate Bf16 Softmax when FEAT_SVE is not present and fix the tests
- Illegal instruction caused by SVE instruction outside streaming mode
- SME Winograd output transform 4x4_3x3 kernel
- Misspelling in SConstruct:301: 'estate' corrected to 'arch'
Refactor
- Remove deprecated NCHW kernels from CpuDirectConv2d
- Check pre-commit copyright, Android.bp and formatting separately
Perf
- Choose the latest GPU if the GPU name is not recognized and alter GEMM heuristics
Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v25.01/index.xhtml