Commit 8d0e276

Update README.md
1 parent 8f1aa78 commit 8d0e276

1 file changed: +1 -9 lines changed

tools/imatrix/README.md

Lines changed: 1 addition & 9 deletions
@@ -10,7 +10,7 @@ More information is available in <https://github.com/ggml-org/llama.cpp/pull/486
 -m model.gguf -f some-text.txt [-o imatrix.gguf] [--output-format {gguf,dat}] [--no-ppl] \
 [--process-output] [--chunk 123] [--save-frequency 0] [--output-frequency 10] \
 [--in-file imatrix-prev-0.gguf --in-file imatrix-prev-1.gguf ...] [--parse-special] \
-[--output-format gguf|dat] [--activation-statistics] [--show-statistics] [...]
+[--output-format gguf|dat] [--show-statistics] [...]
 ```

 Here `-m | --model` with a model name and `-f | --file` with a file containing calibration data (such as e.g. `wiki.train.raw`) are mandatory.
@@ -29,7 +29,6 @@ The parameters in square brackets are optional and have the following meaning:
 * `--chunks` maximum number of chunks to process. Default is `-1` for all available chunks.
 * `--no-ppl` disables the calculation of perplexity for the processed chunks. Useful if you want to speed up the processing and do not care about perplexity.
 * `--show-statistics` displays imatrix file's statistics.
-* `--activation-statistics` enables the collection of activation statistics for each tensor. If set, the imatrix file size will double, but reported statistics will be more accurate.

 For faster computation, make sure to use GPU offloading via the `-ngl | --n-gpu-layers` argument.

@@ -70,20 +69,13 @@ Versions **b5942** and newer of `llama-imatrix` store data in GGUF format by def
 ./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --chunk 5 --output-frequency 20 --save-frequency 50 --parse-special
 ```

-```bash
-# generate imatrix and enable activation-based statistics
-./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --activation-statistics -ngl 99
-```
-
 ```bash
 # analyse imatrix file and display summary statistics instead of running inference
 ./llama-imatrix --in-file imatrix.gguf --show-statistics
 ```

 ## Statistics

-For current versions of `llama-imatrix`, the `--show-statistics` option has two modes of operation: If `--activation-statistics` was used to generate the imatrix and `--output-format` was set to `gguf`, precise activations statistics will be calculated. Otherwise, it will report less accurate, albeit still useful, metrics based on average squared activations.
-
 #### Per tensor

 * **Σ(Act²)** *(legacy mode)* / **L₂ Norm** *(preferred)*: If in legacy mode, the raw sum of squares of activations (sum of `Act²`). In preferred mode, the Euclidean Distance (L₂ Norm) between this tensor’s average activations and those of the previous layer.
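
The two per-tensor quantities named in the last context line can be illustrated with a minimal Python sketch. This is not how `llama-imatrix` computes them internally; the function names `sum_of_squares` and `l2_distance` and the sample values are hypothetical and only mirror the definitions as worded in the README.

```python
import math


def sum_of_squares(activations):
    """Legacy-mode metric: raw sum of squared activations, i.e. sum of Act²."""
    return sum(a * a for a in activations)


def l2_distance(current_mean, previous_mean):
    """Preferred-mode metric: Euclidean distance between two mean-activation vectors."""
    if len(current_mean) != len(previous_mean):
        raise ValueError("activation vectors must have the same length")
    return math.sqrt(sum((c - p) ** 2 for c, p in zip(current_mean, previous_mean)))


# Hypothetical sample values, chosen only to keep the arithmetic easy to check.
print(sum_of_squares([1.0, -2.0, 2.0]))      # 9.0
print(l2_distance([3.0, 4.0], [0.0, 0.0]))   # 5.0
```

The sum of squares needs only one tensor's data, while the distance compares a tensor against the previous layer, so it requires average activations for both.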
