v25.02 Public Major Release
Feat
- Detect number of CPU cores in OpenBSD
- Support tensors with dynamic shapes in NEGEMM
- Support FP16 dequantization in NEGEMMLowpMatrixMultiplyCore
- Add a public API for CpuMeanStdDevNormalization
- Enable BF16 inputs in CpuFullyConnected
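As an illustration of the OpenBSD core-detection item above, here is a minimal sketch of how a core count might be queried. It is not the library's implementation: the helper names `detect_cpu_cores` and `check_detect_cpu_cores` are hypothetical, and the use of the `hw.ncpu` sysctl on OpenBSD is an assumption about one reasonable approach. On other platforms the sketch falls back to `std::thread::hardware_concurrency()`.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

#if defined(__OpenBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper (not a Compute Library API): returns a best-effort
// count of CPU cores, never less than 1.
unsigned int detect_cpu_cores()
{
#if defined(__OpenBSD__)
    // Assumption: query the hw.ncpu sysctl node for the core count.
    int         mib[2] = {CTL_HW, HW_NCPU};
    int         ncpu   = 0;
    std::size_t len    = sizeof(ncpu);
    if (sysctl(mib, 2, &ncpu, &len, nullptr, 0) == 0 && ncpu > 0)
    {
        return static_cast<unsigned int>(ncpu);
    }
#endif
    // Portable fallback; the standard allows this to return 0 when unknown.
    const unsigned int hint = std::thread::hardware_concurrency();
    return hint > 0 ? hint : 1U;
}

// Usage check: any valid result is at least one core.
inline void check_detect_cpu_cores()
{
    assert(detect_cpu_cores() >= 1U);
}
```

The fallback to 1 keeps a thread-pool size derived from this value valid even when the platform reports nothing.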
Fix
- Linking errors in C++17 while compiling with clang
- False positive compiler warning stringop-overflow
- Redundant declaration warning of constexpr static data member (in C++17)
- Make GemmLowp return an error in validate when F16 is not supported
- Reorder interleave_by in CpuGemmAssemblyDispatch test code
- Gemm_hybrid_quantized.hpp was passing incorrect K size to the kernel
- Wrong kernel choice in CpuMul when build does not have SME2
- Incorrect scheduling hint heuristic for GEMMs
- Incorrect trademark usage in README for Arm(R)-Neoverse(TM) core
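For context on the constexpr static data member item above: since C++17, a `static constexpr` data member is implicitly `inline`, so the out-of-class definition that older standards required becomes a deprecated redundant redeclaration, which some compilers warn about. The sketch below shows one common way to keep code valid under both standards. The type `KernelTraits` and its member are hypothetical and are not taken from the library's sources.

```cpp
#include <cassert>

// Hypothetical traits type used only for illustration.
struct KernelTraits
{
    static constexpr int block_size = 8;
};

#if __cplusplus < 201703L
// Before C++17 an odr-used static constexpr member needs this definition.
// From C++17 on it is redundant (the member is implicitly inline), and
// keeping it can trigger a redundant-declaration warning.
constexpr int KernelTraits::block_size;
#endif

// Binding a reference odr-uses the member, which is what made the
// out-of-class definition necessary in C++11/14.
inline const int &block_size_ref()
{
    return KernelTraits::block_size;
}

// Usage check: the reference refers to the single member object.
inline void check_block_size()
{
    assert(block_size_ref() == 8);
    assert(&block_size_ref() == &KernelTraits::block_size);
}
```

Guarding the definition on `__cplusplus` is only one option; a project that builds exclusively as C++17 or later could drop the definition entirely.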
Refactor
- Use operator API inside NEMeanStdDevNormalizationLayer
Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v25.02/index.xhtml