Here, `-m | --model` with a model name and `-f | --file` with a file containing calibration data (e.g. `wiki.train.raw`) are mandatory.
@@ -29,7 +29,6 @@ The parameters in square brackets are optional and have the following meaning:
* `--chunks` maximum number of chunks to process. Default is `-1` for all available chunks.
* `--no-ppl` disables the calculation of perplexity for the processed chunks. Useful if you want to speed up the processing and do not care about perplexity.
* `--activation-statistics` enables the collection of activation statistics for each tensor. If set, the imatrix file size will double, but reported statistics will be more accurate.
For faster computation, make sure to use GPU offloading via the `-ngl | --n-gpu-layers` argument.
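
As a minimal sketch of how the mandatory and optional arguments above combine, the following Python snippet assembles a command line. The helper name `build_imatrix_command`, the model path `model.gguf`, and the values `100` and `99` are hypothetical and chosen for illustration only. The flags themselves are the ones described in this section.

```python
import shlex


def build_imatrix_command(model, calibration_file, chunks=-1,
                          no_ppl=False, n_gpu_layers=None):
    """Assemble an argv list for llama-imatrix (hypothetical helper)."""
    # -m and -f are the two mandatory arguments.
    argv = ["llama-imatrix", "-m", model, "-f", calibration_file]
    # -1 is the documented default (all chunks), so only pass other values.
    if chunks != -1:
        argv += ["--chunks", str(chunks)]
    # Skip the perplexity calculation to speed up processing.
    if no_ppl:
        argv.append("--no-ppl")
    # GPU offloading; the layer count depends on the model and hardware.
    if n_gpu_layers is not None:
        argv += ["-ngl", str(n_gpu_layers)]
    return argv


# "model.gguf" is a placeholder path, not a file shipped with the project.
cmd = build_imatrix_command("model.gguf", "wiki.train.raw",
                            chunks=100, no_ppl=True, n_gpu_layers=99)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if the binary is on the `PATH`.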
@@ -70,20 +69,13 @@ Versions **b5942** and newer of `llama-imatrix` store data in GGUF format by def
For current versions of `llama-imatrix`, the `--show-statistics` option has two modes of operation: if `--activation-statistics` was used to generate the imatrix and `--output-format` was set to `gguf`, precise activation statistics will be calculated. Otherwise, it will report less accurate, albeit still useful, metrics based on average squared activations.
-
#### Per tensor
* **Σ(Act²)** *(legacy mode)* / **L₂ Norm** *(preferred)*: If in legacy mode, the raw sum of squares of activations (sum of `Act²`). In preferred mode, the Euclidean distance (L₂ Norm) between this tensor’s average activations and those of the previous layer.
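
To make the two metrics concrete, here is a small Python sketch of the formulas as described above. It illustrates the arithmetic only and does not reflect how `llama-imatrix` stores or aggregates activations internally. The function names and the sample vectors are hypothetical.

```python
import math


def sum_of_squares(activations):
    """Legacy-mode metric: the raw sum of squared activations."""
    return sum(a * a for a in activations)


def l2_distance(current_avg, previous_avg):
    """Preferred-mode metric: Euclidean distance between two vectors of
    average activations (this tensor vs. the previous layer)."""
    if len(current_avg) != len(previous_avg):
        raise ValueError("activation vectors must have the same length")
    return math.sqrt(sum((c - p) ** 2
                         for c, p in zip(current_avg, previous_avg)))


# Made-up values, chosen so the results are easy to check by hand.
legacy = sum_of_squares([1.0, 2.0, 2.0])          # 1 + 4 + 4
preferred = l2_distance([3.0, 4.0], [0.0, 0.0])   # sqrt(9 + 16)
print(legacy, preferred)
```

A larger L₂ distance indicates that the tensor's average activations differ more from those of the previous layer.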