1 | 1 | OpenBLAS ChangeLog
| 2 | +==================================================================== |
| 3 | +Version 0.3.16 |
| 4 | + 11-Jul-2021 |
| 5 | + |
| 6 | +common: |
| 7 | + - drastically reduced the stack size requirements for running the LAPACK |
| 8 | + testsuite (Reference-LAPACK PR 553) |
| 9 | + - fixed spurious test failures in the LAPACK testsuite (Reference-LAPACK |
| 10 | + PR 564) |
| 11 | + - expressly setting DYNAMIC_ARCH=0 no longer enables dynamic_arch mode |
| 12 | + - improved performance of xGER, xSPR, xSPR2, xSYR, xSYR2, xTRSV, SGEMV_N |
| 13 | + and DGEMV_N, for small input sizes and consecutive arguments |
| 14 | + - improved performance of xGETRF, xPOTRF and xPOTRI for small input sizes |
| 15 | + by disabling multithreading |
| 16 | + - fixed installing with BSD versions of the "install" utility |
| 17 | + |
| 18 | +RISCV: |
| 19 | + - fixed the implementation of xIMIN |
| 20 | + - improved the performance of DSDOT |
| 21 | + - fixed linking of the tests on C910V with current vendor gcc |
| 22 | + |
| 23 | +POWER: |
| 24 | +- fixed SBGEMM computation for some odd value inputs |
| 25 | +- fixed compilation for PPCG4, PPC970, POWER3, POWER4 and POWER5 |
| 26 | + |
| 27 | +x86_64: |
| 28 | + - improved performance of SGEMV_N and SGEMV_T for small N on AVX512-capable cpus |
| 29 | + - worked around a miscompilation of ZGEMM/ZTRMM on Sandybridge with old gcc |
| 30 | + versions |
| 31 | + - fixed compilation with MS Visual Studio versions older than 2017 |
| 32 | + - fixed macro name collision with winnt.h from the latest Win10 SDK |
| 33 | + - added cpu type autodetection for Intel Ice Lake SP |
| 34 | + - fixed cpu type autodetection for Intel Tiger Lake |
| 35 | + - added cpu type autodetection for recent Centaur/Zhaoxin models |
| 36 | + - fixed compilation with musl libc |
| 37 | + |
| 38 | +ARM64: |
| 39 | +- fixed compilation with gcc/gfortran on the Apple M1 |
| 40 | +- fixed linking of the tests on FreeBSD |
| 41 | +- fixed missing restore of a register in the recently rewritten DNRM2 kernel |
| 42 | + for ThunderX2 and Neoverse N1 that could cause spurious failures in e.g. |
| 43 | + DGEEV |
| 44 | +- added compiler optimization flags for the EMAG8180 |
| 45 | +- added initial support for Cortex A55 |
| 46 | + |
| 47 | +ARM: |
| 48 | +- fixed linking of the tests on FreeBSD |
| 49 | + |
2 | 50 | ====================================================================
3 | 51 | Version 0.3.15
4 | 52 | 2-May-2021