Commit 60ef23d
ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317)
* ggml-cpu: add nnpa compile flag
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 4a9f60c)
* ggml-cpu: add fp16->fp32 nnpa first
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 8d4a798)
* ggml-cpu: add fp32->fp16
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 0ff0d65)
* ggml-cpu: better variable names
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 2f58bbc)
* docs: update s390x docs
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 01b9294)
* ggml-cpu: add debugging prints to see if dlf16 is correct
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix print vs printf
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix float placeholder
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: ensure fp16 and fp32 load and stores are called
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fp16 load ensured to hit
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove sigint from fp16 store
for some reason, the function is not getting a hit when debugged with
gdb. we will need to investigate further
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: activate nnpa for ggml_cpu_fp16_to_fp32
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: nnpa activate ggml_cpu_fp16_to_fp32 for 8 elements
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: nnpa switch to vec_xst test
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: switch to vec_xst for 4 element loops also
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: rework noop
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove noop, general code cleanup
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: clarify variable naming
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: activate nnpa for ggml_cpu_fp32_to_fp16
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add breakpoint for debugging
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: test fix for conversion failure
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: disable fp32->fp16 nnpa conversions for now
there are some conversion failures in nnpa that require the eyes of an
ibm stsm. will create a separate pr to introduce the fp32->fp16 change.
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: switch to elif macro
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix typo
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix compiler types
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: change to typedef vector types
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add 4 element loops for fp32->fp16
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: clarified vector naming
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: bring back fp32->fp16 store nnpa
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: activate nnpa fp32->fp16 or fp16->fp32 compute
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add nnpa macro check in ggml-impl
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add missing __func__
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: diagnose why __NNPA__ macro is not being defined
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: import vecintrin.h to fix compiler errors
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: update macro tests
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 157f856.
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: switch to importing ggml-cpu-impl instead
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix macro declaration
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: test more macros
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add debug prints
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: bruteforce macro definitions
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move macro definitions
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add ggml-impl.h to cmakelists
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: switch to private macros
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 157f856)
* ggml-cpu: move things around
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: bring back compile macros
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: switch to quotes for import
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add compiler error macro
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add s390x detection in ggml-src
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: bring back compile definitions
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: undo cmakelists work
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 18d79e1.
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove typedefs.h
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove typedef from cmakelists
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add ggml-impl.h future notes
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: add todo comment for future reference
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: clarify naming of dlf16
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove unnecessary target compile definitions
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move nnpa fp16->fp32 and fp32->fp16 to simd-mappings
Signed-off-by: Aaron Teo <[email protected]>
* ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <[email protected]>
* docs: update broken huggingface link for s390x
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix duplicate func names during compile
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: fix duplicate func names during compile"
This reverts commit fbb7334.
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu"
This reverts commit bd288e8.
Signed-off-by: Aaron Teo <[email protected]>
* ggml: refactor fp16<->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix missing simd-mappings.h import in quants.c
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix missing simd-mappings.h within repack
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix amx mmq missing simd-mappings.h
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: attempt at fixing loongarch failing build
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move nnpa together with other fp16<->fp32 simd
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: fix wrong refactor of ggml-base
ref: #14317 (comment)
Signed-off-by: Aaron Teo <[email protected]>
* ggml: remove dependency on ggml-cpu from ggml-base
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: rename all fp16<->fp32 macros to prefix with ggml_cpu
ref: #14317 (comment)
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: remove mistaken fallback macro
fallback logic was already implemented but i was too sleepy to realise
Signed-off-by: Aaron Teo <[email protected]>
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: #14317 (comment)
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures"
This reverts commit 32a3533.
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml: move ggml_table_f32_f16 to ggml-cpu"
This reverts commit 9e40d98.
Signed-off-by: Aaron Teo <[email protected]>
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: #14317 (comment)
Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 9e40d98)
* ggml: move ggml_table_f32_f16 to ggml-cpu.c
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: extern c ggml_table_f32_f16 + chore docs
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h
we rely on the variable declaration in ggml-cpu.c instead
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h"
This reverts commit f71b21d.
Signed-off-by: Aaron Teo <[email protected]>
* ggml-cpu: bring back ggml_table_f32_f16
Signed-off-by: Aaron Teo <[email protected]>
* Revert "ggml-cpu: bring back ggml_table_f32_f16"
This reverts commit 2dce119.
Signed-off-by: Aaron Teo <[email protected]>
* fix ggml time initialization
* fix f32_f16 table init
* remove extra line
---------
Signed-off-by: Aaron Teo <[email protected]>
Co-authored-by: slaren <[email protected]>

1 parent b193d53 commit 60ef23d
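
Several of the commits above move `ggml_table_f32_f16` into ggml-cpu and fix its initialization. The diff itself is not reproduced here, so the following is only a minimal sketch of how a table-driven fp16->fp32 path can work in general, assuming the table holds one `float` per 16-bit pattern and is filled once at startup. The names `fp16_table`, `fp16_bits_to_fp32`, `fp16_table_init`, and `fp16_lookup` are hypothetical and are not the identifiers used in ggml; the scalar conversion stands in for whatever SIMD or NNPA path a real build would select.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical lookup table: one float per possible fp16 bit pattern. */
static float fp16_table[1 << 16];

/* Portable scalar conversion of an IEEE 754 binary16 bit pattern to float.
 * Illustrative fallback only; not taken from the commit. */
static float fp16_bits_to_fp32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                              /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit bit appears. */
            int shift = 0;
            while ((mant & 0x400u) == 0) {
                mant <<= 1;
                shift++;
            }
            mant &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - shift + 1) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);     /* infinity or NaN */
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);                      /* avoid aliasing UB */
    return f;
}

/* Fill the table once; it must run before the first lookup. */
static void fp16_table_init(void) {
    for (uint32_t i = 0; i < (1u << 16); i++) {
        fp16_table[i] = fp16_bits_to_fp32((uint16_t)i);
    }
}

/* After initialization, a conversion is a single indexed load. */
static inline float fp16_lookup(uint16_t h) {
    return fp16_table[h];
}
```

In this sketch, reading `fp16_lookup` before `fp16_table_init` has run returns zeros, so a table of this kind has to be initialized before any conversion uses it.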
File tree
29 files changed, +996 -853 lines changed
- docs
- ggml
  - include
  - src
    - ggml-cpu
      - amx
      - arch
        - arm
        - loongarch
        - powerpc
        - riscv
        - s390
        - wasm
        - x86
      - llamafile