printf(" --allow-requantize: Allows requantizing tensors that have already been quantized. Warning: This can severely reduce quality compared to quantizing from 16bit or 32bit\n");
printf(" --leave-output-tensor: Will leave output.weight un(re)quantized. Increases model size but may also increase quality, especially when requantizing\n");
printf(" --pure: Disable k-quant mixtures and quantize all tensors to the same type\n");
printf(" --imatrix file_name: use data in file_name as importance matrix for quant optimizations\n");
+ printf(" --hide-imatrix: do not store imatrix details in the quantized model\n");
printf(" --include-weights tensor_name: use importance matrix for this/these tensor(s)\n");
printf(" --exclude-weights tensor_name: do not use importance matrix for this/these tensor(s)\n");
printf(" --output-tensor-type ggml_type: use this ggml_type for the output.weight tensor.\n");
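A sketch of how these options might combine on the command line, assuming the tool is built as `llama-quantize` and using placeholder file names (`imatrix.dat`, `model-f16.gguf`, `model-q4_k_m.gguf`) that are not part of the diff:

```shell
# Quantize with an importance matrix, keep output.weight unquantized,
# and (per the new flag) omit imatrix details from the resulting file.
./llama-quantize \
    --imatrix imatrix.dat \
    --hide-imatrix \
    --leave-output-tensor \
    model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

The `--hide-imatrix` flag added here only affects what metadata is stored in the output; the importance matrix is still used during quantization.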