@@ -2032,10 +2032,9 @@ def get_type(val: Any) -> GGUFValueType:
     GGMLQuantizationType.TQ1_0: (256, 2 + 4 * 13),
     GGMLQuantizationType.TQ2_0: (256, 2 + 64),
     # Currently, we use tricks here
-    # - The block size doesn't include scales or zero_points as group_size is changeable
-    # - So the size is slightly smaller than the real size
-    # - The n_bytes in gguf_reader.py is thus inaccurate
-    # - During inference, the accurate nbytes info will be known through ggml_tmac_get_nbytes
+    # - Bitnet-style models have only one scale value for the whole tensor,
+    # - which is not compatible with the "blocking" philosophy of here.
+    # - During inference, the accurate nbytes info will be known through ggml_tmac_get_nbytes.
     GGMLQuantizationType.TMAC_BN_0: (64, 64 * 2 // 8),
     GGMLQuantizationType.TMAC_W2G64_0: (64, 4 + 64 * 2 // 8),
     GGMLQuantizationType.TMAC_W2G64_1: (64, 4 + 4 + 64 * 2 // 8),
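
To illustrate what the comment means in practice, here is a minimal sketch of how a `(block_size, type_size)` pair is typically turned into a byte-size estimate. The `QUANT_SIZES` table and the `estimated_nbytes` helper are hypothetical names made up for this example and are not part of the gguf package; only the three tuples are copied from the diff. The sketch assumes the usual block-wise rule (number of blocks times bytes per block), so for `TMAC_BN_0` the result leaves out the single tensor-wide scale, and the exact figure is expected to come from `ggml_tmac_get_nbytes` at inference time.

```python
from math import prod

# (block_size, type_size) pairs copied from the diff; the table itself is
# a stand-in for illustration, not the real gguf constants module.
QUANT_SIZES = {
    "TMAC_BN_0": (64, 64 * 2 // 8),             # 2-bit weights only, no scale in the block
    "TMAC_W2G64_0": (64, 4 + 64 * 2 // 8),      # 4 extra bytes per group of 64
    "TMAC_W2G64_1": (64, 4 + 4 + 64 * 2 // 8),  # 8 extra bytes per group of 64
}


def estimated_nbytes(shape, quant_name):
    """Block-wise size estimate: number of blocks times bytes per block.

    Hypothetical helper. Assumes the element count is a multiple of the
    block size.
    """
    block_size, type_size = QUANT_SIZES[quant_name]
    n_elements = prod(shape)
    if n_elements % block_size != 0:
        raise ValueError(
            f"{n_elements} elements is not a multiple of block size {block_size}"
        )
    return n_elements // block_size * type_size


if __name__ == "__main__":
    for name in QUANT_SIZES:
        print(name, estimated_nbytes((4096, 4096), name))
```

For the `TMAC_BN_0` entry this estimate counts only the packed 2-bit weights, which is why the comment in the diff defers to `ggml_tmac_get_nbytes` for the accurate size.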