Commit 2c737bc
[SYSTEMDS-3896] Improved SIMD Vectorized Counting NNZ
This patch adds a further performance improvement that reduces the
runtime on an 8GB matrix from 850ms to 770ms (non-vectorized: 1100ms)
by avoiding unnecessary scalar ops. Furthermore, we replace the
hard-coded AVX512 vector size with the general vector length (the
hard-coded size failed on non-Intel hardware in GitHub Actions).

1 parent 7b34a67
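
The vector-length fix described in the commit message can be illustrated with a minimal sketch. It is not the SystemDS code, whose diff content is not preserved here. It assumes the JDK Vector API (`jdk.incubator.vector`, enabled with `--add-modules jdk.incubator.vector`); the class name `NnzSketch` and the method `countNnz` are hypothetical. The step size comes from the species rather than a constant of 8 doubles (512 bits), so the loop covers every element on hardware with narrower vectors. The sketch does not try to reproduce the commit's reduction of scalar ops.

```java
import jdk.incubator.vector.DoubleVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical sketch class; not the SystemDS implementation.
final class NnzSketch {
    // Widest double species the running CPU supports (e.g. 2, 4, or 8 lanes).
    private static final VectorSpecies<Double> SPECIES = DoubleVector.SPECIES_PREFERRED;

    // Counts the non-zero values in a[start, start+len).
    static long countNnz(double[] a, int start, int len) {
        final int vlen = SPECIES.length();                // queried, not hard-coded to 8
        final int end = start + len;
        final int bound = start + SPECIES.loopBound(len); // last full-vector boundary
        long nnz = 0;
        int i = start;
        // Vectorized main loop: compare a full vector against 0 and count set lanes.
        for (; i < bound; i += vlen) {
            DoubleVector v = DoubleVector.fromArray(SPECIES, a, i);
            nnz += v.compare(VectorOperators.NE, 0.0).trueCount();
        }
        // Scalar tail for the remaining (len % vlen) elements.
        for (; i < end; i++)
            nnz += (a[i] != 0) ? 1 : 0;
        return nnz;
    }
}
```

A call such as `NnzSketch.countNnz(values, 0, values.length)` counts over the whole array. Because the stride and the loop bound are both derived from `SPECIES`, the same code runs unchanged on 128-bit, 256-bit, and 512-bit hardware.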
1 file changed (+6, -4) in src/main/java/org/apache/sysds/runtime/util
Diff hunk (line contents not preserved): original lines 880-894, new lines 880-896; 4 lines removed, 6 lines added.