You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This commit add support for testing the ggml-cpu repack feature which
enables the repackaging of quantized data into more optimal layout for
matrix multiplication for specific hardware architectures.
The motivation is to enable the testing of a cpu backend that uses
repacked data against a reference cpu backend that does not use repacked
data.
Building:
```console
$ cmake -B build \
-DGGML_CPU_REF_BACKEND=ON
-DGGML_BACKEND_DL=ON \
-DGGML_CPU_ALL_VARIANTS=ON
```
List availble cpu architectures/variants:
```console
$ ./build/bin/test-backend-ops cpu-variants --list
CPU variants:
CPU-alderlake - 12th Gen Intel(R) Core(TM) i7-1260P
```
Run tests:
```console
./build-ref/bin/test-backend-ops cpu-variants \
--variant CPU-alderlake \
-o "MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0,o=1)"
Testing CPU variant 'CPU-alderlake' against cpu-ref backend...
repack: repack tensor a with q4_0_8x8
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0,o=1): OK
repack: repack tensor a with q4_0_8x8
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0,o=1): OK
14491/14491 tests passed
```
All matrix multiplication tests can be run by use specifying
`-o "MUL_MAT"` but it may be harder to spot the ones that use repacking.
0 commit comments