Commit e55e95e
Smarterquant tests fixes (#7)
* Checkpoint: Refactor SmarterQuantTensorInfo and add headers
- Created C-compatible SmarterQuantTensorInfo in ggml-smarterquant-types.h
- Updated ggml.h, ggml-cpu.c, llama-quant.h, llama-quant.cpp,
llama-model-loader.cpp, and llama-model.cpp to use the new struct.
- Added missing C++ headers and forward declarations to llama-quant.cpp
in an attempt to resolve compilation errors.
Note: Codebase is not currently compiling due to issues in
llama-quant.cpp and an incorrect CMake build path used in the last
attempt. User will address compilation issues next.
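A minimal sketch of what a C-compatible tensor-info struct can look like, so the same header can be included from both `ggml-cpu.c` and the C++ sources. The struct name, field names, and the count of four block types are assumptions for illustration only; they are not taken from `ggml-smarterquant-types.h`.

```cpp
// Hypothetical sketch: names and layout are assumptions, not the real header.
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

#define SQ_SKETCH_N_TYPES 4 // assumed number of per-block type slots

// Plain-old-data only: no constructors, references, or STL members,
// so a C translation unit sees exactly the same layout as a C++ one.
typedef struct SmarterQuantTensorInfoSketch {
    int8_t    block_types[SQ_SKETCH_N_TYPES]; // quantization type id per leading block
    int32_t * column_permutation;             // heap array of n_cols entries, or NULL
    int64_t   n_cols;                         // length of column_permutation
    bool      enabled;                        // false: tensor uses the default path
} SmarterQuantTensorInfoSketch;

#ifdef __cplusplus
}

// C++-only compile-time check that the struct stays C-compatible.
#include <type_traits>
static_assert(std::is_standard_layout<SmarterQuantTensorInfoSketch>::value &&
              std::is_trivially_copyable<SmarterQuantTensorInfoSketch>::value,
              "struct must remain plain C data");
#endif
```

Keeping the struct free of C++-only members is what allows a raw pointer to it to be passed across the C/C++ boundary; the cost is that ownership of `column_permutation` has to be managed by hand.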
* Fix compilation issues and implement SmarterQuant stubs
- Resolved various compilation errors in llama-quant.cpp related to includes, function definitions, and SmarterQuant logic.
- Implemented parsing for SmarterQuant JSON configuration in `load_smarter_quant_config`.
- Added a basic serial implementation for `llama_tensor_quantize_smarter_blocks`.
- Provided functional stubs for quantization helper functions within `llama-quant.cpp`.
- Ensured the public `llama_model_quantize` API correctly calls the implementation in `llama-quant.cpp`.
- Fixed a memory leak by adding a destructor to `llama_model` to free SmarterQuant permutation data.
- Verified that `ggml-cpu.c` and `llama-model.cpp` changes for SmarterQuant dequantization compile.
- The main library and all example tools now compile and link successfully.
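The memory-leak fix above follows a common ownership pattern, sketched here under stated assumptions: `model_sketch`, `tensor_info_sketch`, and `add_identity_permutation` are hypothetical stand-ins, not the actual `llama_model` members. The idea is only that whoever owns the per-tensor info releases the raw permutation arrays in its destructor.

```cpp
// Hypothetical sketch of destructor-based cleanup; not the llama_model code.
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <map>
#include <string>

// Counts live permutation arrays so a leak is easy to observe in a test.
static int g_live_permutations = 0;

struct tensor_info_sketch {
    int32_t * column_permutation = nullptr; // raw C array, owned by the model
    int64_t   n_cols             = 0;
};

struct model_sketch {
    std::map<std::string, tensor_info_sketch> sq_info;

    model_sketch() = default;
    // Copying would duplicate raw pointers and cause a double free.
    model_sketch(const model_sketch &) = delete;
    model_sketch & operator=(const model_sketch &) = delete;

    // Allocates an identity permutation for the named tensor,
    // releasing any array previously stored under that name.
    int32_t * add_identity_permutation(const std::string & name, int64_t n_cols) {
        assert(n_cols > 0);
        tensor_info_sketch & ti = sq_info[name];
        release(ti);
        ti.column_permutation = static_cast<int32_t *>(
            std::malloc(static_cast<std::size_t>(n_cols) * sizeof(int32_t)));
        assert(ti.column_permutation != nullptr);
        ++g_live_permutations;
        ti.n_cols = n_cols;
        for (int64_t i = 0; i < n_cols; ++i) {
            ti.column_permutation[i] = static_cast<int32_t>(i);
        }
        return ti.column_permutation;
    }

    // The destructor is the single place where every array is freed.
    ~model_sketch() {
        for (auto & kv : sq_info) {
            release(kv.second);
        }
    }

private:
    static void release(tensor_info_sketch & ti) {
        if (ti.column_permutation != nullptr) {
            std::free(ti.column_permutation);
            ti.column_permutation = nullptr;
            --g_live_permutations;
        }
        ti.n_cols = 0;
    }
};
```

Deleting the copy operations is a design choice of this sketch: with raw pointers in the map, an accidental copy of the owner would free each array twice.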
* feat: Implement SmarterQuant numerical correctness tests
This commit introduces a new test suite for the SmarterQuant functionality
to verify the numerical correctness of the custom block quantization and
dequantization logic.
Key changes:
- Added `tests/test-smarterquant.cpp` with a test case that:
  - Uses a sample F32 tensor with mixed quantization types (Q4_0, Q5_1, Q8_0, Q2_K).
  - Applies column permutation.
  - Quantizes using `llama_tensor_quantize_smarter_blocks`.
  - Dequantizes using `ggml_get_rows_smarterquant`.
  - Verifies the output against the original data.
- Updated `tests/CMakeLists.txt` to build the new test.
- Made `llama_tensor_quantize_smarter_blocks` in `src/llama-quant.cpp` non-static and added its declaration to `src/llama-quant.h`.
- Made `ggml_get_rows_smarterquant` in `ggml/src/ggml-cpu/ggml-cpu.c` non-static to allow direct testing.
- The implemented test passes, confirming the core CPU implementation of SmarterQuant (Tasks 1 and 2 from todo.txt) is working as expected for the tested scenario.
* feat: Implement SmarterQuant numerical correctness tests and update todo
This commit introduces a new test suite for the SmarterQuant functionality
to verify the numerical correctness of the custom block quantization and
dequantization logic. It also updates todo.txt to reflect this progress.
Key changes:
- Added `tests/test-smarterquant.cpp` with a test case that:
  - Uses a sample F32 tensor with mixed quantization types (Q4_0, Q5_1, Q8_0, Q2_K).
  - Applies column permutation.
  - Quantizes using `llama_tensor_quantize_smarter_blocks`.
  - Dequantizes using `ggml_get_rows_smarterquant`.
  - Verifies the numerical output against the original F32 data.
- Updated `tests/CMakeLists.txt` to build the new test.
- Made `llama_tensor_quantize_smarter_blocks` in `src/llama-quant.cpp` non-static and added its declaration to `src/llama-quant.h`.
- Made `ggml_get_rows_smarterquant` in `ggml/src/ggml-cpu/ggml-cpu.c` non-static to allow direct testing by the new test suite.
- The implemented test passes, confirming the core CPU implementation of SmarterQuant (Tasks 1 and 2 from todo.txt) is working as expected for the tested scenario.
- Updated `todo.txt` to mark the CPU numerical correctness testing as DONE and outline further potential test enhancements.
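The shape of the round-trip check described in the test commits (permute columns, quantize, dequantize, undo the permutation, compare against the original F32 data) can be sketched in a self-contained way. This sketch deliberately does not call `llama_tensor_quantize_smarter_blocks` or `ggml_get_rows_smarterquant`; it swaps in a toy symmetric 8-bit block quantizer whose layout and block size are assumptions, and all function names are hypothetical.

```cpp
// Hypothetical round-trip sketch with a toy 8-bit quantizer (not a ggml format).
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kBlock = 32; // assumed toy block size

struct toy_block {
    float  scale;      // dequantized value = scale * q[i]
    int8_t q[kBlock];
};

// Forward permutation on a row-major matrix: out[r][j] = in[r][perm[j]].
std::vector<float> permute_columns(const std::vector<float> & in, int rows, int cols,
                                   const std::vector<int32_t> & perm) {
    assert((int) perm.size() == cols && (int) in.size() == rows * cols);
    std::vector<float> out(in.size());
    for (int r = 0; r < rows; ++r)
        for (int j = 0; j < cols; ++j)
            out[r * cols + j] = in[r * cols + perm[j]];
    return out;
}

// Inverse permutation: out[r][perm[j]] = in[r][j].
std::vector<float> unpermute_columns(const std::vector<float> & in, int rows, int cols,
                                     const std::vector<int32_t> & perm) {
    assert((int) perm.size() == cols && (int) in.size() == rows * cols);
    std::vector<float> out(in.size());
    for (int r = 0; r < rows; ++r)
        for (int j = 0; j < cols; ++j)
            out[r * cols + perm[j]] = in[r * cols + j];
    return out;
}

// Per-block symmetric quantization: scale = max|x| / 127.
std::vector<toy_block> quantize(const std::vector<float> & x) {
    assert(x.size() % kBlock == 0);
    std::vector<toy_block> blocks(x.size() / kBlock);
    for (std::size_t b = 0; b < blocks.size(); ++b) {
        float amax = 0.0f;
        for (int i = 0; i < kBlock; ++i)
            amax = std::max(amax, std::fabs(x[b * kBlock + i]));
        const float scale = amax / 127.0f;
        const float inv   = scale > 0.0f ? 1.0f / scale : 0.0f;
        blocks[b].scale = scale;
        for (int i = 0; i < kBlock; ++i)
            blocks[b].q[i] = static_cast<int8_t>(std::lround(x[b * kBlock + i] * inv));
    }
    return blocks;
}

std::vector<float> dequantize(const std::vector<toy_block> & blocks) {
    std::vector<float> out(blocks.size() * kBlock);
    for (std::size_t b = 0; b < blocks.size(); ++b)
        for (int i = 0; i < kBlock; ++i)
            out[b * kBlock + i] = blocks[b].scale * blocks[b].q[i];
    return out;
}

// Largest absolute difference between the original data and its round trip.
float round_trip_max_error(const std::vector<float> & data, int rows, int cols,
                           const std::vector<int32_t> & perm) {
    const std::vector<float> permuted = permute_columns(data, rows, cols, perm);
    const std::vector<float> restored =
        unpermute_columns(dequantize(quantize(permuted)), rows, cols, perm);
    float max_err = 0.0f;
    for (std::size_t i = 0; i < data.size(); ++i)
        max_err = std::max(max_err, std::fabs(data[i] - restored[i]));
    return max_err;
}
```

Because quantization is lossy, a check of this kind compares against a tolerance rather than for exact equality. For this toy quantizer the per-element error is bounded by half a quantization step, that is `max|x| / 254` within each block; the tolerance used by the actual test is not stated in the commit message.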
---------
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
1 parent 6de9577 commit e55e95e
File tree
14 files changed, +1115 -1006 lines changed
- ggml
  - include
  - src
    - ggml-cpu
- src
- tests