@@ -20,6 +20,7 @@ The parameters in square brackets are optional and have the following meaning:
* `-lv | --verbosity` specifies the verbosity level. If set to `0`, no output other than the perplexity of the processed chunks will be generated. If set to `1`, a message is written to `stderr` each time the results are saved. If `>=2`, a message is output each time data is collected for any tensor. Default verbosity level is `1`.
* `-o | --output-file` specifies the name of the file where the computed data will be stored. If omitted, `imatrix.gguf` is used.
* `-ofreq | --output-frequency` specifies how often the results computed so far are saved to disk. Default is 10 (i.e., every 10 chunks).
+* `--output-format` specifies the output format of the generated imatrix file. Either "gguf" or "dat" (the legacy format). Defaults to "gguf".
* `--save-frequency` specifies how often to save a copy of the imatrix in a separate file. Default is 0 (i.e., never).
* `--process-output` specifies whether data will be collected for the `output.weight` tensor. Typically, it is better not to use the importance matrix when quantizing `output.weight`, so this is set to `false` by default.
* `--in-file` specifies one or more existing imatrix files to load and combine. Useful for merging files from multiple runs/datasets.
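
As a rough illustration of how the output-related options above fit together, the sketch below only assembles and prints a `llama-imatrix` command line; it does not run the tool. The `-m` (model) and `-f` (dataset) options are not described in this excerpt, and `model.gguf`, `calibration.txt`, and the helper `build_imatrix_cmd` are hypothetical placeholders introduced for the example.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print a llama-imatrix invocation instead of executing it.
# MODEL and DATASET are hypothetical placeholder paths.
MODEL="model.gguf"
DATASET="calibration.txt"

# build_imatrix_cmd [format]
#   format: "gguf" (default) or "dat" (the legacy format)
build_imatrix_cmd() {
    format="${1:-gguf}"
    # -ofreq 10 and -lv 1 restate the documented defaults explicitly.
    echo "./llama-imatrix -m $MODEL -f $DATASET -o imatrix.$format --output-format $format -ofreq 10 -lv 1"
}

build_imatrix_cmd        # default GGUF output
build_imatrix_cmd dat    # legacy output
```

Printing the command first makes it easy to check which output file and format a run would use before committing to a long computation.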
@@ -45,14 +46,19 @@ Recent versions of `llama-imatrix` store data in GGUF format by default. For the
```bash
# generate and save the imatrix using legacy format