Commit 89051cd

Update README.md
1 parent dcac206 commit 89051cd

1 file changed: +8 -2 lines changed

tools/imatrix/README.md

Lines changed: 8 additions & 2 deletions
@@ -10,7 +10,7 @@ More information is available in <https://github.com/ggml-org/llama.cpp/pull/486
 -m model.gguf -f some-text.txt [-o imatrix.gguf] [--output-format {gguf,dat}] [--no-ppl] \
 [--process-output] [--chunk 123] [--save-frequency 0] [--output-frequency 10] \
 [--in-file imatrix-prev-0.gguf --in-file imatrix-prev-1.gguf ...] [--parse-special] \
-[--show-statistics] [...]
+[--activation-statistics] [--show-statistics] [...]
 ```
 
 Here `-m | --model` with a model name and `-f | --file` with a file containing calibration data (such as e.g. `wiki.train.raw`) are mandatory.
@@ -29,6 +29,7 @@ The parameters in square brackets are optional and have the following meaning:
 * `--chunks` maximum number of chunks to process. Default is `-1` for all available chunks.
 * `--no-ppl` disables the calculation of perplexity for the processed chunks. Useful if you want to speed up the processing and do not care about perplexity.
 * `--show-statistics` displays imatrix file's statistics.
+* `--activation-statistics` enables the collection of activation statistics for each tensor. If set, the imatrix file size will double, but reported statistics will be more accurate.
 
 For faster computation, make sure to use GPU offloading via the `-ngl | --n-gpu-layers` argument.
 
@@ -69,14 +70,19 @@ Versions **b5942** and newer of `llama-imatrix` store data in GGUF format by def
 ./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --chunk 5 --output-frequency 20 --save-frequency 50 --parse-special
 ```
 
+```bash
+# generate imatrix and enable activation-based statistics
+./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --activation-statistics -ngl 99
+```
+
 ```bash
 # analyse imatrix file and display summary statistics instead of running inference
 ./llama-imatrix --in-file imatrix.gguf --show-statistics
 ```
 
 ## Statistics
 
-From version <bwxyz>, `--show-statistics` operates in two modes: for GGUF (preferred) imatrices, it reports direct and accurate activation statistics, and for legacy (binary) files, it reports the less precise average squared activations.
+Beginning with version <bwxyz>, `--show-statistics` has two modes. If `--activation-statistics` was used at imatrix creation time and `--output-format` was set to `gguf`, it reports precise statistics. Otherwise, it reports less accurate, albeit still useful, metrics based on average squared activations.
 
 #### Per tensor

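The updated Statistics paragraph implies a two-step workflow: create the imatrix with `--activation-statistics` in GGUF format, then read it back with `--show-statistics`. A minimal sketch of that workflow follows. It assumes only the flags that appear in this diff; the file names are the placeholders the README itself uses, and the `RUN` switch and the two helper functions are hypothetical conveniences, not part of `llama-imatrix`. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: file names are placeholders; flags are those shown in the diff.
set -eu

MODEL="ggml-model-f16.gguf"
CALIBRATION="calibration-data.txt"
IMATRIX="imatrix.gguf"

# Step 1 (hypothetical helper): collect the imatrix together with
# per-tensor activation statistics, written explicitly as GGUF.
generate_cmd() {
    echo "./llama-imatrix -m $MODEL -f $CALIBRATION -o $IMATRIX --output-format gguf --activation-statistics"
}

# Step 2 (hypothetical helper): read the saved file back and display
# summary statistics instead of running inference.
stats_cmd() {
    echo "./llama-imatrix --in-file $IMATRIX --show-statistics"
}

# Print the commands by default; set RUN=1 to execute them.
for cmd in "$(generate_cmd)" "$(stats_cmd)"; do
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
done
```

If step 1 is run without `--activation-statistics`, or with `--output-format dat`, the README text above indicates that step 2 would fall back to the metrics based on average squared activations.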