OpenBLAS ChangeLog
+ ====================================================================
+ Version 0.3.12
+ 24-Oct-2020
+
+ common:
+ * Fixed missing LAPACK functions (inadvertently dropped during
+ the build system restructuring)
+ * Fixed argument conversion macro in LAPACKE_zgesvdq (LAPACK #458)
+
+ POWER:
+ * Added optimized SCOPY/CCOPY kernels for POWER10
+ * Increased and unified the default size of the GEMM BUFFER
+ * Fixed building for POWER10 in DYNAMIC_ARCH mode
+ * POWER10 compatibility test now checks binutils version as well
+ * Cleaned up compiler warnings
+
+ x86_64:
+ * corrected compiler version checks for AVX2 compatibility
+ * added compiler option -mavx2 for building with flang
+ * fixed direct SGEMM pathway for small matrix sizes (broken by
+ the code refactoring in 0.3.11)
+ * fixed unhandled partial register clobbers in several kernels
+ for AXPY, DOT, GEMV_N and GEMV_T flagged by gcc10 tree-vectorizer
+
+ ARMV8:
+ * improved Apple Vortex support to include cross-compiling
+
====================================================================
Version 0.3.11
17-Oct-2020

- common:
+ common:
* API change:
the newly added BFLOAT16 functions were renamed to use the
letter "B" instead of "H" to avoid potential confusion with
@@ -28,7 +55,7 @@ Version 0.3.11
* Makefile builds no longer misread NO_CBLAS=0 or NO_LAPACK=0 as
enabling these options
* Fixed detection of gfortran when invoked through an mpi wrapper
- * Improve thread reinitialization performance with OpenMP xafter a fork
+ * Improve thread reinitialization performance with OpenMP after a fork
* Added support for building only the subset of the library required
for a particular precision by specifying BUILD_SINGLE, BUILD_DOUBLE
* Optional function name prefixes and suffixes are now correctly
* Fixed cpu detection on BSD-like systems
* Fixed compilation in -std=C18 mode

-
IBM Z:
* Added support for compiling with the clang compiler
* Improved GEMM performance on Z14